API reference

REST API for uploading documents and pulling typed JSON results. Version 0.1.0. Base path /api/v1.

Authentication

Every request is authenticated with the API key issued to your project. Send it in the Authorization header:

Authorization header
Authorization: Bearer YOUR_API_KEY

Treat the key as a secret. Don't commit it to source control or ship it in client-side code. If a key is exposed, rotate it from the Pathnovo console.
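
As a rough illustration of the header above, here is a minimal Python sketch that attaches the key to a request using only the standard library. The host comes from the cURL examples in this reference; the helper name build_request and the PATHNOVO_API_KEY environment variable are assumptions made for the example, not part of the API.

```python
import os
import urllib.request

# Host as used in the cURL examples of this reference.
BASE_URL = "https://api.pathnovo.com/api/v1"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Return a Request carrying the Bearer token; nothing is sent yet."""
    request = urllib.request.Request(BASE_URL + path, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    return request


# Read the key from the environment so it never lands in source control.
api_key = os.environ.get("PATHNOVO_API_KEY", "YOUR_API_KEY")
request = build_request("/schemas", api_key)
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be read without a live key.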

Errors

Errors return a JSON body with a detail field and the relevant HTTP status:

Error response
{
  "detail": "Project not found"
}

Status  Meaning
400     Bad request. The body or query is malformed.
401     Missing or invalid API key.
403     The key does not have access to the requested project or resource.
404     Resource does not exist.
409     Conflict. Usually a duplicate upload that has been deduplicated.
413     File or archive is too large.
415     Unsupported file type.
422     Validation error. Field-level details in the response body.
429     Rate limit exceeded.
500     Server error. Safe to retry with backoff.
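
One way to act on these codes in a client is sketched below. It pulls the detail field out of the body and decides whether a retry makes sense. Treating 429 as retryable after a wait is an assumption; the table only marks 500 as safe to retry. The function names and the backoff parameters are illustrative.

```python
import json

# Assumption: 429 is retried after waiting; only 500 is documented as safe to retry.
RETRYABLE = {429, 500}


def describe_error(status: int, body: bytes) -> tuple[str, bool]:
    """Return (message, should_retry) for a non-2xx response."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        detail = ""  # body was not the documented JSON object
    message = f"HTTP {status}: {detail}" if detail else f"HTTP {status}"
    return message, status in RETRYABLE


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

A caller would sleep for each delay in turn between attempts and give up once the schedule is exhausted.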

Rate limits

Default limits: 60 requests per minute per API key for read endpoints, 30 per minute for upload endpoints. Limits are returned in the X-RateLimit-* headers on every response. If you need higher limits, contact your account team.
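
To stay under the default limits without waiting for a 429, a client can pace itself. The sketch below turns a per-minute limit into a minimum gap between calls and collects whatever X-RateLimit-* headers a response carries. The exact header suffixes are not listed in this reference, so the code matches on the prefix only; both function names are hypothetical.

```python
def min_interval(requests_per_minute: int) -> float:
    """Seconds to wait between calls to stay under a per-minute limit."""
    return 60.0 / requests_per_minute


def rate_limit_headers(headers: dict[str, str]) -> dict[str, str]:
    """Pick out every X-RateLimit-* header, matching the prefix case-insensitively."""
    return {k: v for k, v in headers.items() if k.lower().startswith("x-ratelimit-")}


READ_INTERVAL = min_interval(60)    # gap between read calls at the default limit
UPLOAD_INTERVAL = min_interval(30)  # gap between upload calls at the default limit
```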

Conventions

  • Request and response bodies are JSON. Uploads use multipart/form-data.
  • Upload endpoints return 202 Accepted with a document ID. Poll the status endpoint or subscribe to the SSE stream.
  • Timestamps are ISO 8601 UTC. IDs are UUIDv4.
  • All endpoints are versioned under /api/v1.

Documents

Upload files or import them from a URL. Each upload returns a document ID and starts a background classification + extraction job.

POST /api/v1/documents/upload

Upload a document

Upload a single file as multipart/form-data. The endpoint returns immediately with a document ID; classification and extraction run in the background.

Use this for one document at a time. The response is 202 Accepted, not 200, because the file has only been received and queued. The pipeline runs in the background and can take anywhere from 30 seconds for a small datasheet to several minutes for a multi-page P&ID.

Pathnovo deduplicates by content hash, scoped to the project. If you upload the same file (byte-for-byte) into the same project twice, the second response returns the original document_id with deduplicated set to true and no new job is queued. This makes uploads idempotent enough to retry safely.

Supported file types: PDF, PNG, JPG, TIFF, XLSX, DOCX. Max size per file is 100 MB.
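
A client-side check before uploading can save a round trip. The sketch below is one possible pre-flight test against the limits above. It assumes the file type is judged by extension and that "100 MB" means 100 MiB; neither is confirmed here, and variants such as .jpeg or .tif are deliberately left out because they are not listed. The function name is hypothetical.

```python
from pathlib import Path
from typing import Optional

SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".tiff", ".xlsx", ".docx"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB


def upload_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a reason the upload would likely be rejected, or None if it looks fine."""
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        return "unsupported file type (expect 415)"
    if size_bytes > MAX_FILE_BYTES:
        return "file too large (expect 413)"
    return None
```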

After upload, watch progress on the SSE stream or poll the status endpoint. Don't keep the upload connection open waiting for completion.

What to do next

Pass the document_id from the response to GET /documents/{id}/progress (live SSE) or GET /documents/{id} (polling) to track the job.

Don't poll the upload

Treat 202 as final. The upload connection is closed as soon as the file lands. Tracking progress on the upload socket itself will time out.

Request body

Content-Type: multipart/form-data

Field                  Type    Description
file (required)        binary  The document file (PDF, PNG, JPG, TIFF, XLSX, DOCX). Max 100 MB.
project_id (required)  UUID    Project the document belongs to. Your API key must have access to it.
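
If you are not using an HTTP library that builds multipart bodies for you, the request body can be assembled by hand. The following is a minimal standard-library sketch of that encoding, using the file and project_id field names from the table above. The helper name encode_multipart and the application/octet-stream part type are assumptions for illustration.

```python
import uuid


def encode_multipart(fields: dict[str, str], file_field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data.

    Returns (body, content_type); send content_type as the Content-Type header.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The file part: headers first, then the raw bytes, then the closing boundary.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'.encode()
    )
    parts.append(data)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and content type can then go into a POST request alongside the Authorization header.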

Responses

202  Accepted, processing started · UploadResponse
{
  "document_id": "UUID",
  "status": "queued",
  "deduplicated": false
}
400  Bad request
401  Missing or invalid API key
403  API key does not have access to this project
413  File too large
415  Unsupported file type
429  Rate limit exceeded
500  Internal server error
cURL
curl -X POST "https://api.pathnovo.com/api/v1/documents/upload" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY" \
  -F "file=@/path/to/file" \
  -F "project_id=<value>"
Response · 202
{
  "document_id": "UUID",
  "status": "queued",
  "deduplicated": false
}
POST /api/v1/documents/batch/upload

Upload a ZIP of documents

Upload a ZIP archive. Every file inside is unpacked and treated as a separate document. Each gets its own document_id and runs through the pipeline independently.

Use this when you have a folder of related documents (a handover package, a vendor deliverable, a turnaround set) and want to push them all in one request. Pathnovo extracts the archive on its side, then queues each file as a normal upload.

The response includes the count and an array of UploadResponse objects, one per file, in the order they appeared in the archive. Track each document_id individually after that.

Nested folders inside the ZIP are flattened. The archive must not exceed 1 GB. For very large sets, split the files into multiple ZIPs and call this endpoint once per archive.

What gets unpacked

Files with unsupported extensions (.bak, .tmp, etc.) are silently ignored. Hidden files whose names start with `.` (such as .DS_Store) are skipped.
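
Since unsupported and hidden files are dropped silently, it can be easier to filter on your side so the response count matches what you sent. The sketch below builds an in-memory ZIP that keeps only supported extensions and stores every file under its bare name. The function name is hypothetical, and the sketch does not handle two files that share a name after flattening.

```python
import io
import zipfile
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".tiff", ".xlsx", ".docx"}


def build_archive(files: dict[str, bytes]) -> bytes:
    """Zip only files the batch endpoint would process, stored under flat names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in files.items():
            name = PurePosixPath(path).name  # drop folders; the server flattens them anyway
            if name.startswith(".") or PurePosixPath(name).suffix.lower() not in SUPPORTED_EXTENSIONS:
                continue  # hidden or unsupported: the server would skip it
            archive.writestr(name, data)
    return buffer.getvalue()
```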

Tip

If your archive has more than ~50 files, processing happens in parallel server-side, so the order of completion may not match the order in the response.

Request body

Content-Type: multipart/form-data

Field                  Type    Description
zip_file (required)    binary  ZIP archive of documents. Max 1 GB. Nested folders are flattened.
project_id (required)  UUID    Project the documents belong to.

Responses

202  Accepted, processing started · BatchUploadResponse
{
  "total_files": 12,
  "documents": [
    { "document_id": "UUID", "status": "queued", "deduplicated": false }
  ]
}
400  Bad request
401  Missing or invalid API key
403  API key does not have access to this project
413  Archive too large
429  Rate limit exceeded
500  Internal server error
cURL
curl -X POST "https://api.pathnovo.com/api/v1/documents/batch/upload" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY" \
  -F "zip_file=@/path/to/file" \
  -F "project_id=<value>"
Response · 202
{
  "total_files": 12,
  "documents": [
    { "document_id": "UUID", "status": "queued", "deduplicated": false }
  ]
}
POST /api/v1/documents/import-url

Import from a URL

Hand Pathnovo a URL and we'll fetch the file ourselves, then run the standard pipeline. Useful when files live in S3, SharePoint, or an internal share that's reachable over the public internet.

Pathnovo's importer follows redirects, supports standard authentication on the URL (basic auth and query-string tokens), and verifies the file type before queuing the job. The download happens once; the file is then stored in our document store like any other upload.

For private files, generate a short-lived pre-signed URL on your side (e.g. an S3 presigned URL with a 15-minute TTL). The URL only needs to be valid long enough for us to fetch the file, which usually takes a few seconds.

URLs must be reachable from our infrastructure

We cannot pull from inside your VPC or behind a corporate firewall. If the URL is private and you can't expose it, use POST /documents/upload instead and stream the bytes directly.

Tip

The same content-hash deduplication that applies to /documents/upload applies here. Re-importing a URL whose file is byte-for-byte identical into the same project returns the original document_id.

Request body

Model: UrlImportRequest · Content-Type: application/json

UrlImportRequest.json
{
  "url": "https://example.com/spec.pdf",
  "project_id": "UUID"
}

Responses

202  Accepted, download started · UploadResponse
{
  "document_id": "UUID",
  "status": "queued",
  "deduplicated": false
}
400  Bad request
401  Missing or invalid API key
403  API key does not have access to this project
422  Validation error
429  Rate limit exceeded
500  Internal server error
cURL
curl -X POST "https://api.pathnovo.com/api/v1/documents/import-url" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://example.com/spec.pdf",
  "project_id": "UUID"
}'
Response · 202
{
  "document_id": "UUID",
  "status": "queued",
  "deduplicated": false
}
GET /api/v1/documents/{document_id}

Get document status

Read the current state of a document. Use this when you want a single-point-in-time check, for example from a cron job or when refreshing a UI page.

The status field moves through these values in order:

  • queued: uploaded, waiting for a worker
  • classifying: the classifier is running
  • extracting: the matching pipeline is pulling fields
  • extracted: done, result available

A job that errors ends in failed instead; see error_detail on the extraction job.

The response also includes the original filename, mime type, and three timestamps: when it was uploaded, when classification finished, and when extraction finished. The classification and extraction timestamps are null until those steps complete.

If you need live updates rather than snapshots, prefer the SSE progress stream on /documents/{id}/progress so you don't burn rate limit on a tight poll loop.

Polling cadence

Most documents finish within 60 seconds. A 5-second poll interval is plenty. Don't poll faster than once per second; the status almost never changes that quickly and you'll hit rate limits.
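
A polling loop along these lines might look like the sketch below. It takes the status lookup as a callable so the HTTP call stays out of the loop, waits 5 seconds between checks, and stops on a terminal status. The 600-second timeout and the function name are assumptions chosen for the example.

```python
import time
from typing import Callable

TERMINAL = {"extracted", "failed"}


def wait_for_document(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the document reaches a terminal status or `timeout` seconds of waiting elapse."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

In real use, fetch_status would call GET /documents/{id} and return the status field from the response.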

Path parameters

Name         Type  Description
document_id  UUID  Document ID returned by an upload endpoint.

Responses

200  OK · DocumentStatusResponse
{
  "id": "UUID",
  "project_id": "UUID",
  "original_filename": "spec.pdf",
  "mime_type": "application/pdf",
  "status": "extracted",
  "uploaded_at": "2026-04-25T08:00:00Z",
  "classified_at": "2026-04-25T08:00:14Z",
  "extracted_at": "2026-04-25T08:01:02Z"
}
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/documents/{document_id}" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "id": "UUID",
  "project_id": "UUID",
  "original_filename": "spec.pdf",
  "mime_type": "application/pdf",
  "status": "extracted",
  "uploaded_at": "2026-04-25T08:00:00Z",
  "classified_at": "2026-04-25T08:00:14Z",
  "extracted_at": "2026-04-25T08:01:02Z"
}
GET /api/v1/documents/{document_id}/progress

Stream live progress (SSE)

A Server-Sent Events stream that emits an event every time the document moves to a new stage or the extractor reports a percentage. The connection closes automatically when the job hits a terminal state (extracted or failed).

Each event is a JSON object with a `status` field and an optional `progress` field (0-100) for stages that report incremental progress. Browsers can subscribe with the standard EventSource API; on the server side use any HTTP client that supports streaming.

If the connection drops, reconnect with the same URL. Pathnovo will replay the latest known state immediately so you never miss the terminal event. There's no event ID; reconnection is stateless.

If the document already completed before you connected, you'll get one event with the final status, then the connection closes. This makes it safe to subscribe even after a job is done.
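
On the server side, consuming the stream comes down to reading lines and decoding each event. The sketch below assumes standard SSE framing, with each event's JSON carried in `data:` lines and events separated by a blank line; the wire format is not spelled out in this reference, so treat that as an assumption. It stops after the first terminal status. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator

TERMINAL = {"extracted", "failed"}


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event, stopping after a terminal status."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            data_lines.append(line[5:].lstrip(" "))
        elif line == "" and data_lines:  # a blank line ends the event
            event = json.loads("\n".join(data_lines))
            data_lines = []
            yield event
            if event.get("status") in TERMINAL:
                return
        # Any other line (comments, other SSE fields) is ignored.
```

Feed it the decoded lines of the streaming response body. After a dropped connection, calling it again on a fresh connection is enough, since reconnection is stateless.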

When to use SSE vs polling

SSE is better for live UIs (the user is staring at a progress bar). Polling /documents/{id} is better for batch jobs or async workflows where you only need to check in occasionally.

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Responses

200  text/event-stream
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/documents/{document_id}/progress" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
// No body

Classification

Read the document type Pathnovo assigned to a file, or override it manually if needed.

GET /api/v1/classification/documents/{document_id}

Get classification

Returns the document type Pathnovo assigned to this file, the confidence score, the method (auto or manual), and the bucket the type belongs to.

Confidence is an integer from 0 to 100. In practice, anything above 90 is rock-solid, 70 to 90 is usually right but worth a glance from a human reviewer, and below 70 is worth manual confirmation. The classifier surfaces low-confidence calls in our own UI for review; you can do the same on your side.
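
If you build a review queue on your side, the thresholds above translate into a small routing function. The sketch below is one reading of them: above 90 is accepted, 70 through 90 gets a spot check, and anything lower goes to manual review. The tier names are made up for the example.

```python
def review_tier(confidence: int) -> str:
    """Map a 0-100 confidence score to a review action."""
    if confidence > 90:
        return "accept"
    if confidence >= 70:
        return "spot-check"
    return "manual-review"
```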

Bucket is a coarse grouping (drawings, datasheets, registers, certificates) that lets you filter without knowing the exact doc type. Useful for dashboards.

If the classification was overridden manually, is_manual_override is true and confidence reads 100. The original auto-classification is still recorded in our system but not exposed here.

Tip

Run this right after the document hits status 'extracted'. The extraction result already encodes the doc type implicitly, but this endpoint gives you the confidence number which is what you want for review queues.

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Responses

200  OK · ClassificationResponse
{
  "id": "UUID",
  "document_id": "UUID",
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "bucket": "drawings",
  "confidence": 96,
  "method": "auto",
  "is_manual_override": false,
  "created_at": "2026-04-25T08:00:14Z"
}
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/classification/documents/{document_id}" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "id": "UUID",
  "document_id": "UUID",
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "bucket": "drawings",
  "confidence": 96,
  "method": "auto",
  "is_manual_override": false,
  "created_at": "2026-04-25T08:00:14Z"
}
PATCH /api/v1/classification/documents/{document_id}

Override classification

Manually set the document type. This re-queues extraction with the new schema and returns the updated classification record.

Use this when the auto-classifier got it wrong and the result you got back from /extraction/documents/{id}/result is using the wrong shape. Common case: a vendor datasheet that visually looks like a P&ID, or two doc types that share the same template.

When you call this endpoint, Pathnovo deletes the previous extraction result, looks up the schema for the new doc_type_name, and queues a fresh extraction job. The original document file is reused, no upload needed. Watch the new job through the standard progress endpoints.

The reason field is stored on our side for your audit trail. We don't use it to retrain anything automatically; if you want classifier improvements based on overrides, talk to your integration engineer.

The previous result is replaced

If you've already pulled the extraction result and stored it on your side, snapshot it before overriding. The previous result is deleted as soon as the new job is queued.

Info

doc_type_name must match exactly one of the names returned by GET /schemas. Case-sensitive.
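
Because the match is exact and case-sensitive, it can help to validate the name against the schema list before sending the override. The sketch below does that against a list shaped like the GET /schemas response and adds a hint when only the casing differs. The function name is hypothetical.

```python
def check_doc_type(name: str, schemas: list[dict]) -> str:
    """Return `name` if it exactly matches a listed type; raise with a hint otherwise."""
    names = [item["doc_type_name"] for item in schemas]
    if name in names:
        return name
    # Look for a name that differs only in letter case to build a helpful message.
    near = [n for n in names if n.lower() == name.lower()]
    hint = f" (did you mean {near[0]!r}?)" if near else ""
    raise ValueError(f"unknown doc_type_name {name!r}{hint}")
```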

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Request body

Model: ManualOverrideRequest · Content-Type: application/json

ManualOverrideRequest.json
{
  "doc_type_name": "P&ID",
  "reason": "auto-classified as isometric, but it is a P&ID"
}

Responses

200  OK · ClassificationResponse
{
  "id": "UUID",
  "document_id": "UUID",
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "bucket": "drawings",
  "confidence": 100,
  "method": "manual",
  "is_manual_override": true,
  "created_at": "2026-04-25T08:00:14Z"
}
400  Bad request
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
422  Validation error
500  Internal server error
cURL
curl -X PATCH "https://api.pathnovo.com/api/v1/classification/documents/{document_id}" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "doc_type_name": "P&ID",
  "reason": "auto-classified as isometric, but it is a P&ID"
}'
Response · 200
{
  "id": "UUID",
  "document_id": "UUID",
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "bucket": "drawings",
  "confidence": 100,
  "method": "manual",
  "is_manual_override": true,
  "created_at": "2026-04-25T08:00:14Z"
}

Extraction

Track extraction jobs and pull the typed JSON result for a document.

GET /api/v1/extraction/jobs/{job_id}/status

Get job status

Returns the lightweight status record for one extraction job. Use this when you already hold a job_id and only need to know if it's done.

The status field can be queued, running, completed, or failed. resolved_scope tells you which config layer was used to extract this document (default, org, or project) so you can debug surprising results during a rollout of new prompts.

Compared to GET /extraction/documents/{id}/jobs (which returns all jobs for the document including the full result), this endpoint is cheap and small. Reach for it inside tight loops.

Path parameters

Name    Type  Description
job_id  UUID  Extraction job ID. Returned by /extraction/documents/{id}/jobs.

Responses

200  OK · ExtractionJobStatusResponse
{
  "id": "UUID",
  "status": "completed",
  "resolved_scope": "project",
  "started_at": "2026-04-25T08:00:14Z",
  "completed_at": "2026-04-25T08:01:02Z"
}
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/extraction/jobs/{job_id}/status" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "id": "UUID",
  "status": "completed",
  "resolved_scope": "project",
  "started_at": "2026-04-25T08:00:14Z",
  "completed_at": "2026-04-25T08:01:02Z"
}
GET /api/v1/extraction/documents/{document_id}/jobs

List jobs for a document

Returns every extraction job that has run for this document, newest first, including the full result and any error detail.

A document can have more than one job in two cases: classification was overridden (so we re-extracted with the new schema), or the first extraction failed and was retried. This endpoint shows the full history, so you can compare results across attempts.

Each job record includes the resolved_scope (default / org / project) used at the time. If you change project-level config and re-extract, you'll see the new scope on the latest job.
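
When working with the job history, a common need is to pick the most recent attempt that actually finished. The sketch below relies on the newest-first ordering described above and returns the first job whose status is completed. The function name is hypothetical.

```python
from typing import Optional


def latest_completed(jobs: list[dict]) -> Optional[dict]:
    """Return the newest job with status 'completed', or None.

    Relies on the documented newest-first ordering of the list.
    """
    return next((job for job in jobs if job["status"] == "completed"), None)
```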

Tip

If you only care about the latest successful result, hit /extraction/documents/{id}/result instead. This endpoint is for when you need history.

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Responses

200  OK · Array<ExtractionJobResponse>
[
  {
    "id": "UUID",
    "document_id": "UUID",
    "status": "completed",
    "resolved_scope": "project",
    "result": { "title_block": { "...": "..." } },
    "error_detail": null,
    "started_at": "2026-04-25T08:00:14Z",
    "completed_at": "2026-04-25T08:01:02Z",
    "created_at": "2026-04-25T08:00:00Z"
  }
]
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/extraction/documents/{document_id}/jobs" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
[
  {
    "id": "UUID",
    "document_id": "UUID",
    "status": "completed",
    "resolved_scope": "project",
    "result": { "title_block": { "...": "..." } },
    "error_detail": null,
    "started_at": "2026-04-25T08:00:14Z",
    "completed_at": "2026-04-25T08:01:02Z",
    "created_at": "2026-04-25T08:00:00Z"
  }
]
GET /api/v1/extraction/documents/{document_id}/result

Get extraction result

Returns the latest completed extraction as typed JSON. The shape depends on the document type; look up the schema for that type if you need to know the exact fields.

This is the endpoint you'll call most often. After a document reaches the extracted status, hit this to pull the structured data and load it into your downstream system.

Every result includes the embedded title_block header (28 fields shared across every doc type) plus a body that depends on the doc type. For example, a P&ID returns lines, instruments, equipment, valves; a mill certificate returns chemical composition, mechanical tests, dimensions.

If the document hasn't finished extracting yet, the response is 202 Accepted with an empty body. Don't treat that as a result; wait for the job to complete.
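
In client code, the 200 versus 202 distinction is easy to get wrong if you only check for a 2xx status. A minimal sketch of handling both cases is shown below; the function name is hypothetical, and it expects the caller to pass in the status code and raw body from whatever HTTP client is in use.

```python
import json
from typing import Optional


def parse_result(status_code: int, body: bytes) -> Optional[dict]:
    """Return the extraction result, or None while extraction is still running."""
    if status_code == 202:
        return None  # accepted but not finished; the body is empty
    if status_code == 200:
        return json.loads(body)
    raise RuntimeError(f"unexpected status {status_code}")
```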

Look up the schema

To know the exact fields you'll get back, find the document type in /schemas and use the version returned. The schema includes every field with its type.

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Responses

200  Typed JSON for this document type
202  Extraction still in progress
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/extraction/documents/{document_id}/result" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
// No body
GET /api/v1/extraction/documents/{document_id}/status

Get extraction status

Compact status flags for the latest extraction attempt on this document, without the full result payload.

Use this when you need a quick yes/no on whether the latest extraction is ready, and you don't want to pull the full result. Common pattern: poll this from a worker that's deciding whether to fan out the result-fetch job.

Path parameters

Name         Type  Description
document_id  UUID  Document ID.

Responses

200  OK
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/extraction/documents/{document_id}/status" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
// No body

Schemas

List the document types Pathnovo can extract and pull the JSON schema for any one of them.

GET /api/v1/schemas

List supported document types

List every document type Pathnovo can extract, with the schema version and field count.

Responses

200  OK · Array<SchemaListItem>
[
  {
    "doc_type_id": "UUID",
    "doc_type_name": "P&ID",
    "bucket": "drawings",
    "version": "1.4.0",
    "field_count": 78
  }
]
401  Missing or invalid API key
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/schemas" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
[
  {
    "doc_type_id": "UUID",
    "doc_type_name": "P&ID",
    "bucket": "drawings",
    "version": "1.4.0",
    "field_count": 78
  }
]
GET /api/v1/schemas/{doc_type_id}

Get a schema

Get the JSON schema for a specific document type, including the embedded title_block header.

Path parameters

Name         Type  Description
doc_type_id  UUID  Document type ID.

Responses

200  OK · SchemaDetailResponse
{
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "version": "1.4.0",
  "schema": {
    "title_block": "<common>",
    "lines": [
      { "tag": "string", "size_in": "number", "service": "string" }
    ]
  }
}
401  Missing or invalid API key
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/schemas/{doc_type_id}" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "doc_type_id": "UUID",
  "doc_type_name": "P&ID",
  "version": "1.4.0",
  "schema": {
    "title_block": "<common>",
    "lines": [
      { "tag": "string", "size_in": "number", "service": "string" }
    ]
  }
}
GET /api/v1/extraction/document-types

List document type IDs

List the raw document type records used by the classifier. Use this to map document type names to IDs.

Responses

200  OK · Array<DocumentTypeResponse>
[
  {
    "id": "UUID",
    "name": "P&ID",
    "bucket": "drawings",
    "description": "Piping and instrumentation diagram"
  }
]
401  Missing or invalid API key
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/extraction/document-types" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
[
  {
    "id": "UUID",
    "name": "P&ID",
    "bucket": "drawings",
    "description": "Piping and instrumentation diagram"
  }
]

Analytics

Project-level usage and accuracy metrics. Useful for dashboards and billing reconciliation.

GET /api/v1/analytics/projects/{project_id}/overview

Project overview

Summary counts for a project. Documents uploaded, classified, extracted, failed, plus pages processed for billing.

Path parameters

Name        Type  Description
project_id  UUID  Project ID.

Responses

200  OK · ProjectAnalyticsOverviewResponse
{
  "project_id": "UUID",
  "documents_uploaded": 1284,
  "documents_classified": 1280,
  "documents_extracted": 1271,
  "documents_failed": 9,
  "by_status": {
    "queued": 0,
    "classifying": 4,
    "extracting": 13,
    "extracted": 1271,
    "failed": 9
  },
  "pages_processed": 8420,
  "billable_pages": 8420
}
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/analytics/projects/{project_id}/overview" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "project_id": "UUID",
  "documents_uploaded": 1284,
  "documents_classified": 1280,
  "documents_extracted": 1271,
  "documents_failed": 9,
  "by_status": {
    "queued": 0,
    "classifying": 4,
    "extracting": 13,
    "extracted": 1271,
    "failed": 9
  },
  "pages_processed": 8420,
  "billable_pages": 8420
}
GET /api/v1/analytics/projects/{project_id}/classification

Classification accuracy

Accuracy and method breakdown for a project. Shows the auto vs manual split, average confidence, and counts by document type.

Path parameters

Name        Type  Description
project_id  UUID  Project ID.

Responses

200  OK · ClassificationAccuracyResponse
{
  "total_classified": 1280,
  "auto_classified": 1267,
  "manual_overrides": 13,
  "avg_confidence": 94.6,
  "by_method": { "auto": 1267, "manual": 13 },
  "by_bucket": { "drawings": 612, "datasheets": 318, "registers": 350 },
  "by_doc_type": { "P&ID": 184, "Isometric": 230, "Pressure Vessel Datasheet": 88 }
}
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/analytics/projects/{project_id}/classification" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "total_classified": 1280,
  "auto_classified": 1267,
  "manual_overrides": 13,
  "avg_confidence": 94.6,
  "by_method": { "auto": 1267, "manual": 13 },
  "by_bucket": { "drawings": 612, "datasheets": 318, "registers": 350 },
  "by_doc_type": { "P&ID": 184, "Isometric": 230, "Pressure Vessel Datasheet": 88 }
}
GET /api/v1/analytics/projects/{project_id}/throughput

Extraction throughput

Daily document and page counts for a project. Use the from and to query params to set a range. Defaults to the last 30 days.

Path parameters

Name        Type  Description
project_id  UUID  Project ID.

Query parameters

Name  Type  Description
from  date  ISO date, inclusive. Defaults to 30 days ago.
to    date  ISO date, inclusive. Defaults to today.
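
For a custom range, the two query parameters are appended to the URL. The sketch below builds that URL with the standard library and leaves out any bound you don't pass, so the server defaults apply. The host comes from the cURL examples in this reference; the function name is hypothetical.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.pathnovo.com/api/v1"


def throughput_url(project_id: str, start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Build the throughput URL; omitted bounds fall back to the server defaults."""
    params = {}
    if start is not None:
        params["from"] = start.isoformat()
    if end is not None:
        params["to"] = end.isoformat()
    url = f"{BASE_URL}/analytics/projects/{project_id}/throughput"
    return f"{url}?{urlencode(params)}" if params else url
```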

Responses

200  OK · ExtractionThroughputResponse
{
  "from": "2026-04-01",
  "to": "2026-04-25",
  "buckets": [
    { "date": "2026-04-01", "uploaded": 42, "extracted": 41, "pages": 268 },
    { "date": "2026-04-02", "uploaded": 60, "extracted": 60, "pages": 401 }
  ],
  "totals": { "uploaded": 1284, "extracted": 1271, "pages": 8420 }
}
400  Bad request
401  Missing or invalid API key
403  API key does not have access to this project
404  Resource not found
500  Internal server error
cURL
curl -X GET "https://api.pathnovo.com/api/v1/analytics/projects/{project_id}/throughput" \
  -H "Authorization: Bearer $PATHNOVO_API_KEY"
Response · 200
{
  "from": "2026-04-01",
  "to": "2026-04-25",
  "buckets": [
    { "date": "2026-04-01", "uploaded": 42, "extracted": 41, "pages": 268 },
    { "date": "2026-04-02", "uploaded": 60, "extracted": 60, "pages": 401 }
  ],
  "totals": { "uploaded": 1284, "extracted": 1271, "pages": 8420 }
}