Common title_block
The 28-field header applied to every document schema in this project.
All PDF schemas share this block. Inside each per-document schema, it appears as "title_block": "<common>" — the extractor expands it inline when parsing. Every document therefore carries the same baseline metadata (document number, revision, project, supplier, MPC review state, equipment tags, etc.), which makes cross-document joins straightforward.
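The inline expansion described above can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the names `COMMON_TITLE_BLOCK`, `expand_title_block`, and `datasheet_schema` are hypothetical, the common block is abbreviated to four of its 28 fields, and the `design_pressure` field is invented for the example.

```python
import copy

# Assumption: abbreviated stand-in for the shared block; the real one has 28 fields.
COMMON_TITLE_BLOCK = {
    "document_number": "str",
    "revision": "str",
    "equipment_tag_numbers": ["str"],
    "mpc_review_code": "int",
}


def expand_title_block(schema, common=COMMON_TITLE_BLOCK):
    """Return a copy of `schema` with the "<common>" placeholder expanded inline."""
    expanded = copy.deepcopy(schema)
    if expanded.get("title_block") == "<common>":
        # Deep-copy so later edits to one schema never leak into the shared block.
        expanded["title_block"] = copy.deepcopy(common)
    return expanded


# Hypothetical per-document schema that embeds the shared header by reference.
datasheet_schema = {"title_block": "<common>", "design_pressure": "str"}
expanded_schema = expand_title_block(datasheet_schema)
```

Returning a copy rather than mutating the input keeps the per-document schema files reusable as written, with the placeholder intact.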
Schema
{
"document_number": "str",
"document_title": "str",
"revision": "str",
"revision_date": "str",
"revision_description": "str",
"project_code": "str",
"area_code": "str",
"procurement_package_id": "str",
"sdrl_code": "str",
"sequence_number": "str",
"supplier_vendor_name": "str",
"supplier_vendor_doc_no": "str",
"purchase_order_title": "str",
"purchase_order_no": "str",
"equipment_tag_numbers": ["str"],
"customer": "str",
"end_user": "str",
"project_name": "str",
"plant_location": "str",
"job_number": "str",
"mpc_review_code": "int",
"mpc_review_date": "str",
"mpc_reviewer": "str",
"sheet_of": "str",
"size": "str",
"scale": "str",
"security_code": "str",
"language": "str"
}
Using this block
equipment_tag_numbers is the primary cross-document join key.
procurement_package_id groups every document belonging to a single purchased package.
mpc_review_* fields track client review cycles and are surfaced in the extractor's audit UI.
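As a rough sketch of how these keys might be used downstream, the snippet below indexes extracted documents by equipment tag and by procurement package. It assumes each extracted document is a dict with a populated title_block; the helper names `index_by_tag` and `group_by_package`, and every sample value (document numbers, tags, package IDs), are made up for illustration.

```python
from collections import defaultdict


def index_by_tag(documents):
    """Map each equipment tag to the document numbers that reference it."""
    index = defaultdict(list)
    for doc in documents:
        block = doc["title_block"]
        for tag in block.get("equipment_tag_numbers", []):
            index[tag].append(block["document_number"])
    return dict(index)


def group_by_package(documents):
    """Map each procurement package ID to the document numbers it contains."""
    groups = defaultdict(list)
    for doc in documents:
        block = doc["title_block"]
        groups[block["procurement_package_id"]].append(block["document_number"])
    return dict(groups)


# Invented sample records; only the fields used by the two helpers are shown.
docs = [
    {"title_block": {"document_number": "DOC-001",
                     "procurement_package_id": "PKG-01",
                     "equipment_tag_numbers": ["P-101A", "P-101B"]}},
    {"title_block": {"document_number": "DOC-002",
                     "procurement_package_id": "PKG-01",
                     "equipment_tag_numbers": ["P-101A"]}},
    {"title_block": {"document_number": "DOC-003",
                     "procurement_package_id": "PKG-02",
                     "equipment_tag_numbers": []}},
]

tag_index = index_by_tag(docs)
package_index = group_by_package(docs)
```

Because equipment_tag_numbers is a list, a single document can appear under several tags, while a document with no tags is still reachable through its package ID.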
Head back to the schemas overview to see how each document class embeds this header.
