Documents
Every uploaded or imported file is registered as a Document record. Documents persist independently of jobs, so you can re-run extraction with a different schema without re-uploading the file.
Get document metadata
curl http://localhost:4000/api/documents/DOCUMENT_ID \
  -H "Authorization: Bearer YOUR_TOKEN"
List extractions for a document
View all extraction results across jobs for a single document:
curl http://localhost:4000/api/documents/DOCUMENT_ID/extractions \
  -H "Authorization: Bearer YOUR_TOKEN"
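The two read endpoints above can also be called from a script. The following is a minimal Python sketch using only the standard library. The helper names (`build_get`, `document_request`, `extractions_request`, `fetch_json`) are hypothetical, the base URL is taken from the curl examples, and the assumption that responses are JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl examples; adjust for your deployment.
BASE_URL = "http://localhost:4000"


def build_get(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def document_request(document_id: str, token: str) -> urllib.request.Request:
    """Request for GET /api/documents/:documentId (document metadata)."""
    return build_get(f"/api/documents/{quote(document_id, safe='')}", token)


def extractions_request(document_id: str, token: str) -> urllib.request.Request:
    """Request for GET /api/documents/:documentId/extractions."""
    return build_get(
        f"/api/documents/{quote(document_id, safe='')}/extractions", token
    )


def fetch_json(request: urllib.request.Request):
    """Send the request and decode the body (assumes a JSON response)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

To run it against a live server, pass either request to `fetch_json`, for example `fetch_json(document_request("DOCUMENT_ID", "YOUR_TOKEN"))`. The shape of the returned data is not documented here, so inspect it before relying on specific fields.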
Re-run extraction
Create a new extraction job from an existing document:
curl -X POST http://localhost:4000/api/documents/DOCUMENT_ID/extractions \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "EXTRACTION",
    "schemaId": "NEW_SCHEMA_ID"
  }'
Alternatively, use POST /api/jobs/from-document/:documentId — see Jobs.
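As an illustration, the re-run request to POST /api/documents/:documentId/extractions can be built in Python with the standard library. This is a sketch under assumptions: `rerun_extraction_request` is a hypothetical helper, the base URL comes from the curl example, and the payload mirrors the JSON body shown there. The response format is not documented on this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl example; adjust for your deployment.
BASE_URL = "http://localhost:4000"


def rerun_extraction_request(
    document_id: str, schema_id: str, token: str
) -> urllib.request.Request:
    """Build the POST that creates a new extraction job for an existing document."""
    # Same body as the curl example: job type plus the schema to apply.
    payload = {"type": "EXTRACTION", "schemaId": schema_id}
    return urllib.request.Request(
        f"{BASE_URL}/api/documents/{quote(document_id, safe='')}/extractions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is a single call, `urllib.request.urlopen(rerun_extraction_request("DOCUMENT_ID", "NEW_SCHEMA_ID", "YOUR_TOKEN"))`. Building the request separately from sending it makes it easy to check the URL and body before touching the server.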
Document lifecycle
Documents are created when:
- You upload via POST /api/jobs
- Files are imported from integrations (Google Drive, Dropbox, S3, URLs)
- Presigned upload is confirmed via POST /api/jobs/confirm-upload
Imported files are downloaded asynchronously via the import queue before jobs are created.
Storage
By default, files are stored locally in ./uploads (STORAGE_DRIVER=local). In production, configure S3-compatible storage via environment variables. Static files are served at /uploads when using local storage.