Batch Runs

A JobRun (batch run) processes multiple documents in a single operation. All child jobs share the same schema and are tracked under one parent run.

Use batch runs when you need to:

  • Upload multiple files at once
  • Import from connected cloud storage
  • Process a list of public URLs
  • Trigger mixed sources via the integrations API

Upload multiple files

curl -X POST http://localhost:4000/api/job-runs \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "files=@invoice1.pdf" \
-F "files=@invoice2.pdf" \
-F "schemaId=YOUR_SCHEMA_ID" \
-F "jobType=EXTRACTION"

Response:

{
  "id": "job-run-id",
  "status": "PROCESSING",
  "totalJobs": 2,
  "completedJobs": 0,
  "failedJobs": 0
}
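The counters in this response can be turned into a rough progress figure. The function below is a minimal sketch: `batch_progress` is a hypothetical helper, and it assumes that a child job counts as finished once it is either completed or failed, which the response fields suggest but the text above does not state outright.

```shell
# Print the percentage of child jobs that have finished.
# Arguments: totalJobs completedJobs failedJobs (taken from the job-run response).
# Assumption: "finished" means completed or failed; integer division rounds down.
batch_progress() {
  local total="$1" completed="$2" failed="$3"
  if [ "$total" -le 0 ]; then
    # Avoid dividing by zero for an empty run.
    echo 0
    return 0
  fi
  echo $(( (completed + failed) * 100 / total ))
}
```

For the response shown above, `batch_progress 2 0 0` prints 0; once one of the two jobs has finished it prints 50.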

Import from connected storage

curl -X POST http://localhost:4000/api/job-runs/from-sources \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "integrationId": "INTEGRATION_ID",
  "fileIds": ["file-id-1", "file-id-2"],
  "schemaId": "SCHEMA_ID",
  "jobType": "EXTRACTION"
}'

The response contains jobRunId, importBatchId, and totalFiles.

Import from public URLs

curl -X POST http://localhost:4000/api/job-runs/from-urls \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "schemaId": "SCHEMA_ID",
  "jobType": "EXTRACTION",
  "urls": [
    { "url": "https://example.com/invoice1.pdf", "externalRef": "row-1" },
    { "url": "https://example.com/invoice2.pdf", "externalRef": "row-2" }
  ]
}'

For Excel workflows, parse the spreadsheet in your app or in n8n, then POST the resulting URL array to this endpoint.
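One way to produce that URL array from a spreadsheet is to export it as CSV and convert each row into an entry. The sketch below is an illustration, not part of the API: `build_urls_payload` is a hypothetical helper, and it assumes a two-column `url,externalRef` export with no header row and no commas, double quotes, or backslashes in either field.

```shell
# Build a from-urls request body from "url,externalRef" lines on stdin.
# Arguments: schemaId jobType
# Assumption: fields need no JSON escaping (no quotes, backslashes, or commas).
build_urls_payload() {
  local schema_id="$1" job_type="$2" sep="" url ref
  printf '{"schemaId":"%s","jobType":"%s","urls":[' "$schema_id" "$job_type"
  # The "|| [ -n ... ]" clause keeps a final line that has no trailing newline.
  while IFS=, read -r url ref || [ -n "$url" ]; do
    [ -z "$url" ] && continue   # skip blank lines
    printf '%s{"url":"%s","externalRef":"%s"}' "$sep" "$url" "$ref"
    sep=","
  done
  printf ']}\n'
}

# Example use (not run here): pipe the body straight into curl.
#   build_urls_payload SCHEMA_ID EXTRACTION < rows.csv |
#     curl -X POST http://localhost:4000/api/job-runs/from-urls \
#       -H "Authorization: Bearer YOUR_TOKEN" \
#       -H "Content-Type: application/json" \
#       -d @-
```

If the spreadsheet can contain quoted or escaped values, a real CSV parser and JSON encoder is the safer choice than this string-based approach.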

Mixed-source trigger

A single endpoint for automation (n8n, custom scripts). Each entry in files is either a public URL or a file from a connected integration:

curl -X POST http://localhost:4000/api/integrations/trigger \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "schemaId": "SCHEMA_ID",
  "jobType": "EXTRACTION",
  "files": [
    { "url": "https://example.com/doc.pdf", "externalRef": "n8n-1" },
    {
      "integrationId": "INTEGRATION_ID",
      "fileId": "DRIVE_FILE_ID",
      "externalRef": "n8n-2"
    }
  ]
}'

Monitor progress

# Batch run status
curl http://localhost:4000/api/job-runs/JOB_RUN_ID \
-H "Authorization: Bearer YOUR_TOKEN"

# Child jobs
curl http://localhost:4000/api/job-runs/JOB_RUN_ID/jobs \
-H "Authorization: Bearer YOUR_TOKEN"

Poll until status is COMPLETED or FAILED, or subscribe to webhook events (job_run.completed, job_run.progress).
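The polling approach can be sketched as a small loop. Everything named here (`fetch_run`, `extract_status`, `is_terminal_status`, `wait_for_run`, and the `BASE_URL` and `TOKEN` variables) is hypothetical, and the interval and attempt limit are arbitrary defaults. The loop only treats COMPLETED and FAILED as terminal, so a run that ends in any other status would hit the timeout.

```shell
# Succeed when the status means polling can stop.
is_terminal_status() {
  case "$1" in
    COMPLETED|FAILED) return 0 ;;
    *) return 1 ;;
  esac
}

# Pull the first "status" string out of a JSON body on stdin.
# A sed-based sketch; a real JSON parser such as jq is more robust.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Fetch the run's JSON. BASE_URL and TOKEN are assumed environment variables.
fetch_run() {
  curl -s "${BASE_URL:-http://localhost:4000}/api/job-runs/$1" \
    -H "Authorization: Bearer ${TOKEN:-YOUR_TOKEN}"
}

# Poll until the run is terminal; print the final status.
# Arguments: runId [intervalSeconds] [maxAttempts]
wait_for_run() {
  local run_id="$1" interval="${2:-5}" max_attempts="${3:-120}" attempt=0 status
  while [ "$attempt" -lt "$max_attempts" ]; do
    status=$(fetch_run "$run_id" | extract_status)
    if is_terminal_status "$status"; then
      echo "$status"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $run_id" >&2
  return 1
}
```

Calling `wait_for_run JOB_RUN_ID 10 60` would check every 10 seconds for up to 60 attempts. Webhooks avoid this loop entirely and are usually the better fit for long-running batches.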

Cancel a batch run

curl -X POST http://localhost:4000/api/job-runs/JOB_RUN_ID/cancel \
-H "Authorization: Bearer YOUR_TOKEN"

List batch runs

curl "http://localhost:4000/api/job-runs?page=1&limit=10" \
-H "Authorization: Bearer YOUR_TOKEN"

Query parameters: page, limit, status, schemaId.
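To combine these filters in a script, the query string can be assembled from whichever values are set. This is a minimal sketch: `list_runs_url` and `BASE_URL` are hypothetical names, the defaults mirror the example above, and the values are assumed to be URL-safe so no percent-encoding is applied.

```shell
# Build the list URL; empty or missing filters are left out.
# Arguments: [page] [limit] [status] [schemaId]
list_runs_url() {
  local base="${BASE_URL:-http://localhost:4000}/api/job-runs"
  local query="page=${1:-1}&limit=${2:-10}"
  if [ -n "${3:-}" ]; then
    query="$query&status=$3"
  fi
  if [ -n "${4:-}" ]; then
    query="$query&schemaId=$4"
  fi
  echo "$base?$query"
}

# Example use (not run here):
#   curl "$(list_runs_url 1 10 FAILED)" -H "Authorization: Bearer YOUR_TOKEN"
```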

JobRun vs Job

Concept   Scope
JobRun    Parent batch; tracks overall progress across many files
Job       Single document extraction within a run (or standalone)

Standalone jobs (single upload via POST /api/jobs) have no parentRunId. Batch jobs always belong to a JobRun.