Common title_block

The 28-field header applied to every document schema in this project.

All PDF schemas share this block. Inside each per-document schema it appears as "title_block": "<common>", and the extractor expands it inline when parsing. Every document therefore carries the same baseline metadata (document number, revision, project, supplier, MPC review state, equipment tags, and so on), which makes cross-document joins straightforward.
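The inline expansion described above could look roughly like the following. This is a minimal sketch, not the extractor's actual code: the function name expand_common, the truncated COMMON_TITLE_BLOCK constant, and the "line_items" field in the sample schema are all assumptions made for illustration.

```python
import copy

# Hypothetical stand-in for the contents of title_block.json,
# truncated to three of the 28 fields for brevity.
COMMON_TITLE_BLOCK = {
    "document_number": "str",
    "revision": "str",
    "equipment_tag_numbers": ["str"],
}


def expand_common(schema, common=COMMON_TITLE_BLOCK):
    """Return a copy of `schema` with the "<common>" placeholder expanded.

    The input schema is left untouched so it can be reused.
    """
    expanded = copy.deepcopy(schema)
    if expanded.get("title_block") == "<common>":
        # Deep-copy the shared block so schemas never alias each other.
        expanded["title_block"] = copy.deepcopy(common)
    return expanded


# Hypothetical per-document schema; "line_items" is an invented field.
datasheet_schema = {"title_block": "<common>", "line_items": ["str"]}
expanded = expand_common(datasheet_schema)
```

Copying the shared block, rather than referencing it, means a later change to one expanded schema cannot leak into another.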

Schema

title_block.json
{
  "document_number": "str",
  "document_title": "str",
  "revision": "str",
  "revision_date": "str",
  "revision_description": "str",
  "project_code": "str",
  "area_code": "str",
  "procurement_package_id": "str",
  "sdrl_code": "str",
  "sequence_number": "str",
  "supplier_vendor_name": "str",
  "supplier_vendor_doc_no": "str",
  "purchase_order_title": "str",
  "purchase_order_no": "str",
  "equipment_tag_numbers": ["str"],
  "customer": "str",
  "end_user": "str",
  "project_name": "str",
  "plant_location": "str",
  "job_number": "str",
  "mpc_review_code": "int",
  "mpc_review_date": "str",
  "mpc_reviewer": "str",
  "sheet_of": "str",
  "size": "str",
  "scale": "str",
  "security_code": "str",
  "language": "str"
}
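The type descriptors in the schema ("str", "int", and ["str"] for a list of strings) can be checked mechanically against an extracted record. The sketch below assumes those semantics and treats a missing field as an error; the real extractor may be more lenient, for example by allowing nulls. The function name type_errors and the sample values are hypothetical.

```python
# Assumed mapping from the schema's type descriptors to Python types.
TYPE_NAMES = {"str": str, "int": int}


def _matches(value, type_name):
    """True if `value` has the Python type named by `type_name`."""
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(value, TYPE_NAMES[type_name]) and not isinstance(value, bool)


def type_errors(record, schema):
    """Return the names of fields that are missing or have the wrong type."""
    errors = []
    for field, descriptor in schema.items():
        value = record.get(field)
        if isinstance(descriptor, list):
            # ["str"] is read as "list whose items are all strings".
            ok = isinstance(value, list) and all(
                _matches(item, descriptor[0]) for item in value
            )
        else:
            ok = _matches(value, descriptor)
        if not ok:
            errors.append(field)
    return errors


# Three fields taken from the schema above; the record values are invented.
schema_subset = {
    "document_number": "str",
    "mpc_review_code": "int",
    "equipment_tag_numbers": ["str"],
}
record = {
    "document_number": "DOC-001",
    "mpc_review_code": "2",  # a string, but the schema asks for an int
    "equipment_tag_numbers": ["P-101A"],
}
bad_fields = type_errors(record, schema_subset)  # ["mpc_review_code"]
```

mpc_review_code is the only integer field in the block, so it is the one most likely to surface as a mistyped string after extraction.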

Using this block

  • equipment_tag_numbers is the primary cross-document join key. Because the field is a list, documents are matched on any tag they share.
  • procurement_package_id groups every document belonging to a single purchased package.
  • mpc_review_* fields track client review cycles and are surfaced in the extractor's audit UI.
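The first two bullets above can be sketched as a pair of small indexes over extracted title blocks. This is an illustration only: the function names index_by_tag and group_by_package, and every document number, package ID, and tag in the sample data, are invented.

```python
from collections import defaultdict


def index_by_tag(title_blocks):
    """Map each equipment tag to the document numbers that mention it."""
    index = defaultdict(list)
    for block in title_blocks:
        for tag in block.get("equipment_tag_numbers", []):
            index[tag].append(block["document_number"])
    return dict(index)


def group_by_package(title_blocks):
    """Map each procurement_package_id to its document numbers."""
    groups = defaultdict(list)
    for block in title_blocks:
        groups[block["procurement_package_id"]].append(block["document_number"])
    return dict(groups)


# Invented sample title blocks, reduced to the fields used here.
docs = [
    {"document_number": "DOC-001", "procurement_package_id": "PKG-A",
     "equipment_tag_numbers": ["P-101A", "P-101B"]},
    {"document_number": "DOC-002", "procurement_package_id": "PKG-A",
     "equipment_tag_numbers": ["P-101A"]},
    {"document_number": "DOC-003", "procurement_package_id": "PKG-B",
     "equipment_tag_numbers": ["V-200"]},
]

by_tag = index_by_tag(docs)
by_package = group_by_package(docs)
```

Here by_tag["P-101A"] lists both DOC-001 and DOC-002, which is the cross-document join; by_package["PKG-A"] lists the same two documents because they belong to the same purchased package.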

Head back to the schemas overview to see how each document class embeds this header.