API Documentation
Introduction
API reference for calling the AnyParser API directly
Welcome
The AnyParser API is a real-time parser designed to extract content from various file formats, including PDFs and DOCX files. The API accepts file content encoded in base64, processes it, and returns the parsed data in markdown format.
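Since the API accepts file content encoded in base64, a client has to encode the raw file bytes before sending them. The following is a minimal sketch of that encoding step using only the Python standard library. The helper names `encode_bytes` and `encode_file` are illustrative and not part of the AnyParser API. The request format that carries the encoded string is not described in this section, so it is left out.

```python
import base64
from pathlib import Path


def encode_bytes(raw: bytes) -> str:
    """Return the base64 encoding of raw bytes as an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def encode_file(path: str) -> str:
    """Read a file from disk and return its content as a base64 string.

    Hypothetical helper: the resulting string is what a client would place
    in the request, in whatever field the API expects.
    """
    return encode_bytes(Path(path).read_bytes())
```

For example, `encode_file("report.pdf")` would return the base64 text for a local file named `report.pdf`, a file name used here purely for illustration.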
File formats
The API processes the following file types:
- PDF files
- Office files: docx, pptx
- Image files: png, jpg, jpeg
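A client may want to reject unsupported files before encoding and uploading them. The sketch below checks a file name against the formats listed above. The extension set is taken from that list, while the function name `is_supported` and the case-insensitive matching are assumptions made for illustration. The API itself may validate files differently.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".png", ".jpg", ".jpeg"}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions.

    Assumption: matching is case-insensitive, so "SCAN.PNG" is accepted.
    Only the final extension is considered.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

This is only a client-side check on the file name. It does not inspect the file's actual content.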
Features
The AnyParser API has the following features:
- Full content parsing
- Key-value extraction
Sync vs Async API
In the AnyParser API, you can use either the sync or async endpoints.
- Sync API: A blocking API that returns the results of the parse or extraction directly. It times out after 30 seconds, so it works best for short, simple files.
- Async API: An asynchronous API that returns a file ID, which you can use to fetch the results of the extraction later. Consider using it for longer or more complex files.
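The async flow described above amounts to submitting a file, keeping the returned file ID, and fetching the result later. One common way to do the fetching is to poll until the result is ready, sketched below. The endpoint paths and response shapes are not given in this section, so the sketch takes a caller-supplied `fetch_result` function. That callable is hypothetical: it is assumed to return `None` while the result is not ready and the parsed markdown once it is. The interval and maximum wait are arbitrary illustrative defaults.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    fetch_result: Callable[[str], Optional[str]],
    file_id: str,
    interval_seconds: float = 2.0,
    max_wait_seconds: float = 300.0,
) -> str:
    """Call fetch_result(file_id) repeatedly until it returns a result.

    fetch_result is a hypothetical callable supplied by the caller; it should
    wrap the real async fetch request and return None while still processing.
    Raises TimeoutError if no result arrives within max_wait_seconds.
    """
    deadline = time.monotonic() + max_wait_seconds
    while True:
        result = fetch_result(file_id)
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result for file ID {file_id!r} in time")
        time.sleep(interval_seconds)
```

Passing the fetch step in as a function keeps the polling logic separate from the HTTP details, so the same loop can be exercised with a stub before it is wired to the real endpoint.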
API Base URL
SDK
If you’d prefer to use the AnyParser SDK, refer to the SDK Reference.