Document Extraction

Upload a document, define a schema, and extract structured data with Sparrow.

Input document

PDF or image · max 5 MB · removed after inference

Supported: PDF, PNG, JPG, TIFF · max 5 MB
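The limits above can be checked on the client before uploading. The following is a minimal sketch, not part of Sparrow: the function name check_upload is hypothetical, "5 MB" is assumed to mean 5 * 1024 * 1024 bytes, and only the four extensions named above are accepted, since the page does not say whether variants such as .jpeg or .tif are allowed.

```python
from pathlib import Path

# Limits taken from the page text. Treating "5 MB" as 5 * 1024 * 1024 bytes
# is an assumption; the page does not define the unit precisely.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_SUFFIXES = {".pdf", ".png", ".jpg", ".tiff"}


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

A caller would run this on the file name and size and only proceed with the upload when the returned list is empty.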

Extraction schema

JSON describing the fields to extract

Provide a JSON array of fields with their expected types. Sparrow returns a matching JSON document.
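As an illustration of what such a schema might look like, here is a small sketch that builds one in Python and serializes it to JSON. The field names, the "name"/"type" keys, and the type labels are assumptions made for this example; the exact notation Sparrow expects is not specified on this page.

```python
import json

# Hypothetical fields for an invoice. The key names and type labels are
# illustrative only and are not confirmed by the page.
schema = [
    {"name": "invoice_number", "type": "str"},
    {"name": "invoice_date", "type": "str"},
    {"name": "total_amount", "type": "float"},
]

# The JSON text that would be supplied as the extraction schema.
schema_json = json.dumps(schema, indent=2)

# A made-up response of the shape the page describes: one value per field.
example_response = {
    "invoice_number": "INV-001",
    "invoice_date": "2024-01-31",
    "total_amount": 1250.0,
}
```

The point of the sketch is the correspondence: every field requested in the schema appears as a key in the returned document.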

Processing options

Control how the model interprets your document

Table-only extraction

Focus on tabular content — best for dense tables, financial reports, lab results, portfolio statements.

Schema validation

Validate extracted values against the schema types before returning.
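The kind of check this option describes can be sketched as follows. This is not Sparrow's implementation: the validate function, the TYPE_MAP table, and the "name"/"type" schema keys are all hypothetical and only show one way to compare extracted values with declared types.

```python
from typing import Any

# Hypothetical mapping from schema type labels to Python types.
TYPE_MAP = {"str": str, "int": int, "float": (int, float), "bool": bool}


def validate(schema: list, extracted: dict) -> list:
    """Return one error message per field that is missing or has the wrong type."""
    errors = []
    for field in schema:
        name, type_name = field["name"], field["type"]
        if name not in extracted:
            errors.append(f"{name}: missing")
            continue
        value: Any = extracted[name]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        wrong_bool = isinstance(value, bool) and type_name != "bool"
        if wrong_bool or not isinstance(value, TYPE_MAP[type_name]):
            errors.append(
                f"{name}: expected {type_name}, got {type(value).__name__}"
            )
    return errors
```

An empty list means every field was present with a value of the declared type; anything else can be reported back instead of returning the extraction as-is.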

Vision LLM

Standard works well for most documents. Advanced is recommended for complex forms.

Sparrow key · optional

Without a key, usage is limited to 30 calls per 6 hours and to documents of up to 3 pages. Contact us for a key.
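A client that works without a key may want to track its own calls so it does not exceed the quota. The sketch below assumes a sliding 6-hour window; the page does not say whether the service counts calls in a sliding or a fixed window, and the CallBudget class is hypothetical.

```python
import time
from collections import deque


class CallBudget:
    """Client-side guard: allow at most max_calls within window_seconds."""

    def __init__(self, max_calls=30, window_seconds=6 * 60 * 60,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls = deque()  # timestamps of calls still inside the window

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if the budget is spent."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

Calling try_acquire() before each request and skipping the request when it returns False keeps the client within the stated limit under the sliding-window assumption.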

Try an example

Pre-loaded sample documents

Privacy by default

Documents are not retained: each upload is deleted as soon as inference completes.

Response

JSON output from Sparrow

Upload a document and define a schema, then run extraction to see structured output here.

Document Extraction

Extract structured data from invoices, receipts, statements, and tables, with support for multi-page PDFs, page classification, bounding-box annotation, and schema validation. No cloud dependencies.

Business Rules

Push formatting, derived fields, classification, and transformations to the LLM itself — no post-processing code, just typed schema fields with optional defaults.

Sparrow Agent

Orchestrate Vision LLM extraction with Text LLM reasoning — chain classification, extraction, and field validation with visual monitoring and error handling.
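The chaining described above can be pictured with a small generic sketch. It does not use or represent the Sparrow Agent API: run_pipeline, PipelineResult, and the three callables are hypothetical stand-ins for a classification step, an extraction step, and a field validation step, with basic error handling around them.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class PipelineResult:
    doc_type: Optional[str] = None
    data: Optional[dict] = None
    errors: list = field(default_factory=list)


def run_pipeline(
    document: bytes,
    classify: Callable[[bytes], str],
    extract: Callable[[bytes, str], dict],
    validate: Callable[[dict], list],
) -> PipelineResult:
    """Run classify -> extract -> validate, collecting errors instead of raising."""
    result = PipelineResult()
    try:
        result.doc_type = classify(document)
        result.data = extract(document, result.doc_type)
        result.errors = list(validate(result.data))
    except Exception as exc:
        # A failing step stops the chain; later steps are skipped.
        result.errors.append(f"{type(exc).__name__}: {exc}")
    return result
```

Passing the steps in as callables keeps the sketch independent of any particular model: a vision model could back extract while a text model backs classify or validate.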