Jobs
A job represents a single document extraction task. Upload a file, specify a schema, and ExtractForm processes it asynchronously.
Create a job (file upload)
curl -X POST http://localhost:4000/api/jobs \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "file=@document.pdf" \
-F "type=EXTRACTION" \
-F "schemaId=YOUR_SCHEMA_ID"
Job types
| Type | Description |
|---|---|
| EXTRACTION | Extract structured data using a schema |
Job statuses
| Status | Description |
|---|---|
| PENDING | Created, awaiting queue |
| QUEUED | In processing queue |
| PROCESSING | Currently being processed |
| COMPLETED | Successfully finished |
| FAILED | Processing failed |
| CANCELLED | Cancelled by user |
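Because processing is asynchronous, a client typically polls the job until its status stops changing. The sketch below assumes that COMPLETED, FAILED, and CANCELLED are the terminal statuses, that the job response is JSON with a top-level `status` field, and that `jq` is installed; none of this is confirmed on this page. `BASE_URL`, `TOKEN`, `is_terminal_status`, and `wait_for_job` are illustrative names.

```shell
# Assumed defaults; adjust for your deployment.
BASE_URL="${BASE_URL:-http://localhost:4000}"
TOKEN="${TOKEN:-YOUR_TOKEN}"

# Succeeds (exit 0) when the status is one of the assumed terminal statuses.
is_terminal_status() {
  case "$1" in
    COMPLETED|FAILED|CANCELLED) return 0 ;;
    *) return 1 ;;
  esac
}

# Polls GET /api/jobs/JOB_ID until a terminal status is seen, then prints it.
# Gives up with exit 1 after max_attempts polls.
# Assumes a top-level "status" field in the response (unverified).
wait_for_job() {
  local job_id="$1"
  local interval="${2:-2}"
  local max_attempts="${3:-60}"
  local attempt=0
  local status
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$(curl -sS "$BASE_URL/api/jobs/$job_id" \
      -H "Authorization: Bearer $TOKEN" | jq -r '.status')"
    if is_terminal_status "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}

# Usage: final_status="$(wait_for_job JOB_ID 5)"
```

If webhooks are configured, reacting to events is likely a better fit than polling; the loop above is only a fallback pattern.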
Presigned upload (large files)
To upload a large file directly to storage instead of sending it through the API, use the three steps below:
Step 1 — Get upload URL:
curl -X POST http://localhost:4000/api/jobs/upload-url \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"filename": "large-document.pdf"}'
Step 2 — Upload to the returned uploadUrl
Step 3 — Confirm and create job:
curl -X POST http://localhost:4000/api/jobs/confirm-upload \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"key": "RETURNED_KEY",
"filename": "large-document.pdf",
"type": "EXTRACTION",
"schemaId": "YOUR_SCHEMA_ID"
}'
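Step 2 has no example above, so here is a hedged sketch of the full flow. It assumes the presigned URL accepts an HTTP PUT of the raw file bytes (common for S3-style storage, but not confirmed here) and that the Step 1 response is JSON with `uploadUrl` and `key` fields; `key` is inferred from the RETURNED_KEY placeholder. It also requires `jq`. The function names are illustrative.

```shell
BASE_URL="${BASE_URL:-http://localhost:4000}"
TOKEN="${TOKEN:-YOUR_TOKEN}"

# Builds the confirm-upload JSON body. Sketch only: values are inserted
# verbatim, so they must not contain double quotes or backslashes.
confirm_payload() {
  printf '{"key": "%s", "filename": "%s", "type": "EXTRACTION", "schemaId": "%s"}' \
    "$1" "$2" "$3"
}

presigned_upload() {
  local file="$1"
  local schema_id="$2"
  local name resp upload_url key
  name="$(basename "$file")"

  # Step 1: request an upload URL for this filename.
  resp="$(curl -sS --fail -X POST "$BASE_URL/api/jobs/upload-url" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"filename\": \"$name\"}")" || return 1
  upload_url="$(printf '%s' "$resp" | jq -r '.uploadUrl')"
  key="$(printf '%s' "$resp" | jq -r '.key')"   # field name is an assumption

  # Step 2: upload the file. --upload-file sends an HTTP PUT (assumed method).
  curl -sS --fail --upload-file "$file" "$upload_url" || return 1

  # Step 3: confirm the upload and create the job.
  curl -sS --fail -X POST "$BASE_URL/api/jobs/confirm-upload" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(confirm_payload "$key" "$name" "$schema_id")"
}

# Usage: presigned_upload ./large-document.pdf YOUR_SCHEMA_ID
```

Some storage providers require extra headers on the PUT (for example a matching Content-Type); check the upload-url response for any such requirements.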
Re-extract from existing document
curl -X POST http://localhost:4000/api/jobs/from-document/DOCUMENT_ID \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "EXTRACTION", "schemaId": "YOUR_SCHEMA_ID"}'
Get job and extraction
# Job status and metadata
curl http://localhost:4000/api/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_TOKEN"
# Extraction result
curl http://localhost:4000/api/jobs/JOB_ID/extraction \
-H "Authorization: Bearer YOUR_TOKEN"
Update extraction (manual edit)
Patch the extracted fields after review:
curl -X PATCH http://localhost:4000/api/jobs/JOB_ID/extraction \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"fields": {"invoice_number": "INV-001-CORRECTED"}}'
This emits a job.updated webhook event.
Cancel and restart
# Cancel a pending or queued job
curl -X POST http://localhost:4000/api/jobs/JOB_ID/cancel \
-H "Authorization: Bearer YOUR_TOKEN"
# Restart a failed or completed job
curl -X POST http://localhost:4000/api/jobs/JOB_ID/restart \
-H "Authorization: Bearer YOUR_TOKEN"
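The comments above imply status preconditions: cancel applies to pending or queued jobs, and restart applies to failed or completed ones. A client can check the current status before calling either endpoint. The helper names below are illustrative, and how the API responds when a precondition is not met is not documented on this page.

```shell
# Succeeds when the status allows cancellation (pending or queued).
can_cancel() {
  case "$1" in
    PENDING|QUEUED) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when the status allows a restart (failed or completed).
can_restart() {
  case "$1" in
    FAILED|COMPLETED) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: choose an action for a status fetched earlier.
status="QUEUED"
if can_cancel "$status"; then
  echo "cancel allowed"
elif can_restart "$status"; then
  echo "restart allowed"
else
  echo "no action"
fi
```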
List jobs
curl "http://localhost:4000/api/jobs?page=1&limit=10&status=COMPLETED" \
-H "Authorization: Bearer YOUR_TOKEN"
Query parameters: page, limit, type, status, parentRunId, schemaId.
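When several filters are combined, a small helper keeps the query string readable. This is a sketch: `jobs_url` is a hypothetical helper, it does not URL-encode values, and it does not validate parameter names against the list above.

```shell
BASE_URL="${BASE_URL:-http://localhost:4000}"

# Builds a list URL from key=value arguments, e.g. status=COMPLETED.
# Sketch only: values are not URL-encoded.
jobs_url() {
  local query=""
  local pair
  for pair in "$@"; do
    if [ -z "$query" ]; then
      query="$pair"
    else
      query="$query&$pair"
    fi
  done
  if [ -n "$query" ]; then
    printf '%s/api/jobs?%s\n' "$BASE_URL" "$query"
  else
    printf '%s/api/jobs\n' "$BASE_URL"
  fi
}

# Usage:
# curl "$(jobs_url page=2 limit=10 type=EXTRACTION schemaId=YOUR_SCHEMA_ID)" \
#   -H "Authorization: Bearer YOUR_TOKEN"
```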
Delete a job
curl -X DELETE http://localhost:4000/api/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_TOKEN"
Idempotency
Pass an Idempotency-Key header on the create endpoints to prevent duplicate jobs when a request is retried. See Authentication.
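A sketch of a retried create call follows. The key is generated once and reused on every attempt, so a retry after a timeout should not create a second job. It assumes any unique string is accepted as a key (a UUID here); the accepted key format and how long keys are remembered are not stated on this page. `new_idempotency_key` and `create_job_with_retry` are illustrative names.

```shell
BASE_URL="${BASE_URL:-http://localhost:4000}"
TOKEN="${TOKEN:-YOUR_TOKEN}"

# Generates a unique key: a UUID when uuidgen is available,
# otherwise a timestamp plus the shell's process ID.
new_idempotency_key() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    printf '%s-%s\n' "$(date +%s)" "$$"
  fi
}

# Creates a job, retrying up to three times with the SAME key.
# Sketch only: retries on any curl failure, including HTTP error responses;
# a real client would retry only transient errors.
create_job_with_retry() {
  local file="$1"
  local schema_id="$2"
  local key attempt
  key="$(new_idempotency_key)"   # generated once, reused for every attempt
  attempt=1
  while [ "$attempt" -le 3 ]; do
    if curl -sS --fail -X POST "$BASE_URL/api/jobs" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Idempotency-Key: $key" \
      -F "file=@$file" \
      -F "type=EXTRACTION" \
      -F "schemaId=$schema_id"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 2
  done
  return 1
}

# Usage: create_job_with_retry ./document.pdf YOUR_SCHEMA_ID
```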