Overview

Using AnyParser, you can extract personally identifiable information (PII) from your documents, including:

  • Name
  • Phone Number
  • Address
  • Email Address
  • LinkedIn URL
  • GitHub URL
  • Summary

Setup

Refer to the Quickstart guide to install the AnyParser SDK and get your API key.

First, set up your AnyParser client.

anyparser_pii.py
from any_parser import AnyParser

ap = AnyParser(api_key="...")
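
To avoid hardcoding the key in a script, one option is to read it from an environment variable. The helper below is a minimal sketch: the function name load_api_key and the variable name ANYPARSER_API_KEY are hypothetical choices for illustration, not something the AnyParser SDK defines.

```python
import os


def load_api_key(env_var: str = "ANYPARSER_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset.

    The default variable name is an assumption made for this sketch;
    use whatever name fits your environment.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return api_key
```

The returned string can then be passed to the client in place of the literal, as in AnyParser(api_key=load_api_key()).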

Then, use the extract_pii method, passing in the following:

  • file_path (str): the path to the local file

anyparser_extract_pii.py
pii_result, total_time = ap.extract_pii(file_path="/path/to/your/file")

The call returns two values:

  • pii_result (dict): Dictionary with the keys corresponding to PII types, and the values extracted from the document
  • total_time (str): the time elapsed in seconds
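
Once you have pii_result, a quick check of which PII types were actually extracted can be useful. The sketch below works on any dictionary; it assumes that a field that was not found comes back as None or an empty value, which this page does not confirm. The helper name split_found_and_missing and the sample keys and values are placeholders for illustration, not real SDK output.

```python
from typing import Any, Dict, List, Tuple


def split_found_and_missing(pii_result: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Split the keys of a PII dictionary into populated and empty ones."""
    found: List[str] = []
    missing: List[str] = []
    for key, value in pii_result.items():
        # Assumption: None, "", [] and {} all mean "nothing was extracted".
        if value is None or value == "" or value == [] or value == {}:
            missing.append(key)
        else:
            found.append(key)
    return found, missing


# Placeholder dictionary standing in for a real pii_result.
sample = {"name": "Jane Doe", "email": "", "phone": None}
found, missing = split_found_and_missing(sample)
print("found:", found)
print("missing:", missing)
```

With a real result, pass pii_result in place of the sample dictionary.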

Full Code

anyparser_extract_pii.py
from any_parser import AnyParser

ap = AnyParser(api_key="...")

local_file_path = "/path/to/your/file"

pii_result, total_time = ap.extract_pii(file_path=local_file_path)

Output

pii_result is a dictionary of Personally Identifiable Information (PII): each key is a PII type, and each value is what was extracted from the document.
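
To inspect or store the dictionary, it can be serialized to JSON with the standard library. This is a minimal sketch: the helper name pii_to_json is hypothetical, and it assumes the values are JSON-friendly, falling back to str() for anything that is not.

```python
import json
from typing import Any, Dict


def pii_to_json(pii_result: Dict[str, Any]) -> str:
    """Serialize a PII dictionary to an indented JSON string."""
    return json.dumps(
        pii_result,
        indent=2,
        ensure_ascii=False,  # keep non-ASCII names readable
        sort_keys=True,      # stable key order across runs
        default=str,         # assumption: stringify non-JSON values
    )
```

Calling print(pii_to_json(pii_result)) shows the extracted fields one per line. Because the output contains personal data, take care where it is written or logged.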

Full Notebook Examples

Check out these notebooks for more detailed examples of using both sync and async AnyParser.