
Jobs

A job represents a single document extraction task. Upload a file, specify a schema, and ExtractForm processes it asynchronously.

Create a job (file upload)

curl -X POST http://localhost:4000/api/jobs \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "file=@document.pdf" \
-F "type=EXTRACTION" \
-F "schemaId=YOUR_SCHEMA_ID"
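The same upload can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: `encode_multipart` and `build_create_job_request` are hypothetical helper names, the host is the placeholder from the curl example, and the `application/octet-stream` part type is an assumption (the source does not say whether the server inspects it).

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:4000/api"  # placeholder host from the curl example


def encode_multipart(fields, file_field, filename, content):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        chunks.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
    )
    chunks.append(file_header.encode("utf-8") + content + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_create_job_request(token, schema_id, filename, content):
    """Build (but do not send) the POST /jobs request shown in the curl example."""
    body, content_type = encode_multipart(
        {"type": "EXTRACTION", "schemaId": schema_id}, "file", filename, content
    )
    return urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; building and sending are kept separate so the request can be inspected first.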

Job types

Type        Description
EXTRACTION  Extract structured data using a schema

Job statuses

Status      Description
PENDING     Created, awaiting queue
QUEUED      In processing queue
PROCESSING  Currently being processed
COMPLETED   Successfully finished
FAILED      Processing failed
CANCELLED   Cancelled by user
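Client code often needs to know whether a job is still moving through the pipeline. The sketch below models the six statuses as an enum; `JobStatus` and `is_finished` are hypothetical names, and treating COMPLETED, FAILED, and CANCELLED as "finished" is an assumption drawn from their descriptions, not something the API states explicitly.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "PENDING"
    QUEUED = "QUEUED"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Assumption: these statuses mean processing has stopped for now.
FINISHED_STATUSES = frozenset(
    {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED}
)


def is_finished(status: str) -> bool:
    """Return True if the status indicates the job is no longer being processed.

    Raises ValueError for a status string that is not in the table above.
    """
    return JobStatus(status) in FINISHED_STATUSES
```

Validating through the enum means an unexpected status string fails loudly instead of being silently treated as "still running".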

Presigned upload (large files)

For large files, request a presigned URL, upload the file directly to storage, then confirm the upload to create the job:

Step 1 — Get upload URL:

curl -X POST http://localhost:4000/api/jobs/upload-url \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"filename": "large-document.pdf"}'

Step 2 — Upload to the returned uploadUrl

Step 3 — Confirm and create job:

curl -X POST http://localhost:4000/api/jobs/confirm-upload \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"key": "RETURNED_KEY",
"filename": "large-document.pdf",
"type": "EXTRACTION",
"schemaId": "YOUR_SCHEMA_ID"
}'
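The three steps above can be sketched as one function. This is a sketch under stated assumptions: `presigned_upload` is a hypothetical name, `post_json` and `put_bytes` are caller-supplied transport functions (so no particular HTTP library is implied), the upload-URL response is assumed to be JSON containing `uploadUrl` and `key`, and the use of HTTP PUT for step 2 is an assumption because the source does not name the method.

```python
def presigned_upload(post_json, put_bytes, filename, content, schema_id):
    """Run the presigned upload flow and return the confirm-upload response.

    post_json(path, payload) -> dict   sends an authenticated JSON POST to the API
    put_bytes(url, data)     -> None   uploads raw bytes to the storage URL
                                       (PUT is assumed, not documented)
    """
    # Step 1: ask the API for an upload URL and storage key.
    upload = post_json("/jobs/upload-url", {"filename": filename})

    # Step 2: send the file straight to storage, bypassing the API.
    put_bytes(upload["uploadUrl"], content)

    # Step 3: confirm the upload, which creates the job.
    return post_json(
        "/jobs/confirm-upload",
        {
            "key": upload["key"],  # assumed response field (RETURNED_KEY above)
            "filename": filename,
            "type": "EXTRACTION",
            "schemaId": schema_id,
        },
    )
```

Injecting the transport functions keeps the ordering logic testable without a running server or storage bucket.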

Re-extract from existing document

curl -X POST http://localhost:4000/api/jobs/from-document/DOCUMENT_ID \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "EXTRACTION", "schemaId": "YOUR_SCHEMA_ID"}'

Get job and extraction

# Job status and metadata
curl http://localhost:4000/api/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_TOKEN"

# Extraction result
curl http://localhost:4000/api/jobs/JOB_ID/extraction \
-H "Authorization: Bearer YOUR_TOKEN"
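Because jobs are processed asynchronously, a client typically polls the job endpoint before requesting the extraction. The following is a minimal polling sketch, assuming the job response is JSON with a `status` field holding one of the documented statuses; `wait_for_job`, `fetch_result`, and the `get_json` callable are hypothetical, and the interval and timeout defaults are arbitrary.

```python
import time

FINISHED = {"COMPLETED", "FAILED", "CANCELLED"}  # assumed to end processing


def wait_for_job(get_json, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll GET /jobs/JOB_ID until the job reaches a finished status.

    get_json(path) -> dict performs an authenticated GET against the API.
    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        job = get_json(f"/jobs/{job_id}")
        if job["status"] in FINISHED:  # `status` field name is an assumption
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


def fetch_result(get_json, job_id, **wait_options):
    """Wait for the job, then return its extraction if it completed."""
    job = wait_for_job(get_json, job_id, **wait_options)
    if job["status"] != "COMPLETED":
        raise RuntimeError(f"job {job_id} ended with status {job['status']}")
    return get_json(f"/jobs/{job_id}/extraction")
```

If webhooks are configured, reacting to events is likely preferable to polling; this loop is only a fallback pattern.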

Update extraction (manual edit)

Patch the extracted fields after review:

curl -X PATCH http://localhost:4000/api/jobs/JOB_ID/extraction \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"fields": {"invoice_number": "INV-001-CORRECTED"}}'

This emits a job.updated webhook event.
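For reference, the PATCH call above could be built in Python roughly as follows. `build_patch_extraction_request` is a hypothetical helper, and the sketch only mirrors the curl example; whether fields omitted from the body are left unchanged is not stated in the source.

```python
import json
import urllib.request


def build_patch_extraction_request(base_url, token, job_id, fields):
    """Build (but do not send) PATCH /jobs/JOB_ID/extraction with corrected fields."""
    body = json.dumps({"fields": fields}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/jobs/{job_id}/extraction",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

For example, `build_patch_extraction_request("http://localhost:4000/api", "YOUR_TOKEN", "JOB_ID", {"invoice_number": "INV-001-CORRECTED"})` produces the same request as the curl command.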

Cancel and restart

# Cancel a pending or queued job
curl -X POST http://localhost:4000/api/jobs/JOB_ID/cancel \
-H "Authorization: Bearer YOUR_TOKEN"

# Restart a failed or completed job
curl -X POST http://localhost:4000/api/jobs/JOB_ID/restart \
-H "Authorization: Bearer YOUR_TOKEN"
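The comments above imply which statuses each action applies to. A client can check this before calling the API, as in the sketch below; `allowed_actions` is a hypothetical helper, and returning no actions for PROCESSING and CANCELLED is an assumption, since the source does not say how the API treats those statuses.

```python
CANCELLABLE = {"PENDING", "QUEUED"}    # "Cancel a pending or queued job"
RESTARTABLE = {"FAILED", "COMPLETED"}  # "Restart a failed or completed job"


def allowed_actions(status):
    """Return the actions the documentation describes for a given job status."""
    actions = []
    if status in CANCELLABLE:
        actions.append("cancel")
    if status in RESTARTABLE:
        actions.append("restart")
    return actions
```

A local check like this only avoids obviously pointless requests; the server remains the authority, because a job's status can change between the check and the call.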

List jobs

curl "http://localhost:4000/api/jobs?page=1&limit=10&status=COMPLETED" \
-H "Authorization: Bearer YOUR_TOKEN"

Query parameters: page, limit, type, status, parentRunId, schemaId.
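A small helper can assemble the list URL and reject misspelled parameter names early. This is a sketch: `build_list_jobs_url` is a hypothetical name, and only the six parameters listed above are accepted.

```python
from urllib.parse import urlencode

ALLOWED_PARAMS = ("page", "limit", "type", "status", "parentRunId", "schemaId")


def build_list_jobs_url(base_url, **params):
    """Return the GET /jobs URL with the given filters, skipping None values."""
    unknown = set(params) - set(ALLOWED_PARAMS)
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/jobs?{query}" if query else f"{base_url}/jobs"
```

Calling `build_list_jobs_url("http://localhost:4000/api", page=1, limit=10, status="COMPLETED")` reproduces the URL from the curl example.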

Delete a job

curl -X DELETE http://localhost:4000/api/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_TOKEN"

Idempotency

Pass an Idempotency-Key header on the create endpoints to prevent duplicate jobs when a request is retried. See Authentication.
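The key only helps if every retry of the same logical request reuses it. The sketch below generates one key per call and sends it on each attempt; `create_with_retries` and the `send` callable are hypothetical, using a UUID as the key is an assumption (the source does not specify a format), and retrying only on `ConnectionError` is a simplification.

```python
import uuid


def create_with_retries(send, payload, attempts=3):
    """Call send(payload, headers) up to `attempts` times with one shared key.

    send performs the actual create request and raises ConnectionError
    when the request could not be completed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    key = str(uuid.uuid4())  # generated once, reused for every attempt
    last_error = None
    for _ in range(attempts):
        try:
            return send(payload, {"Idempotency-Key": key})
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Generating the key inside the retry loop would defeat the purpose, since each attempt would then look like a new request.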