Parse With OCR - CambioML

Overview

Using AnyParser, you can parse the full content from your documents into markdown. The Parse with OCR model refines parsing results by applying OCR detection and correction techniques.

Setup

Refer to the Quickstart guide to install the AnyParser SDK and get your API key. Next, set up your AnyParser sync or async client.

anyparser_async.py

from any_parser import AnyParser

ap = AnyParser(api_key="...")
# start the parsing request
file_id = ap.async_parse_with_ocr(file_path="/path/to/your/file")
# fetch results, polling every 5 seconds for up to 60 seconds
markdown_string = ap.async_fetch(file_id, sync=True, sync_timeout=60, sync_interval=5)
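
The fetch call above polls at a fixed interval until a timeout. To illustrate the general pattern, here is a minimal sketch of such a polling loop. It is not the SDK's implementation: `poll_until_ready` and `fetch_once` are hypothetical names, and the sketch assumes the fetch function returns `None` while the result is not ready.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    fetch_once: Callable[[], Optional[str]],
    timeout: float = 60.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_once repeatedly until it returns a result or time runs out.

    fetch_once is a hypothetical callable (an assumption for this sketch)
    that returns None while the result is still being prepared.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_once()
        if result is not None:
            return result
        # Stop if waiting another interval would go past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"no result within {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in ordinary use the defaults are enough.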

Output

A string containing the markdown representation of the given file.
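
Because the output is a plain string, it can be written straight to a `.md` file. A minimal sketch, assuming a hypothetical helper named `save_markdown` (not part of the AnyParser SDK):

```python
from pathlib import Path


def save_markdown(markdown_string: str, output_path: str) -> Path:
    """Write the parsed markdown to output_path and return the path.

    save_markdown is a hypothetical helper for illustration only.
    """
    path = Path(output_path)
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown_string, encoding="utf-8")
    return path
```

For example, `save_markdown(markdown_string, "output/parsed.md")` would store the result of the fetch call from the Setup section.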

Full Notebook Examples

Check out these notebooks for more detailed examples of using AnyParser BASE and PRO models:

AnyParser Async API: Parse longer documents (which may take longer than 30 seconds).

AnyParser Async Parse Example

Extracting content from a table of contents.
