// ANNUAL REPORTS
Annual Report Table Extractor
Extract every table from a company annual report into clean, structured rows and columns.
// EXAMPLE INPUT
$ extract annual_report_FY24.pdf
// EXAMPLE OUTPUT
Document: Annual Report FY24
Pages scanned: 284
Tables detected: 63
- Balance Sheet p.118
- Profit & Loss p.119
- Cash Flow p.121
- Segment Revenue p.144
- Notes 1-42 p.130-218
Output: tables.xlsx (63 sheets) + sources.json
// EXTRACTION LOGIC
Layout-aware table detection runs across each page; multi-page tables are stitched on matching column headers. Header rows, units, currency, and footnote markers are preserved.
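The stitching step can be illustrated with a minimal sketch. It is not the product's implementation: the `TableFragment` structure, the `stitch_fragments` helper, and the rule that a continuation must sit on the next consecutive page are all assumptions made for illustration, and the sample rows are placeholders.

```python
from dataclasses import dataclass


@dataclass
class TableFragment:
    """One table detected on a single page (hypothetical structure)."""
    page: int
    header: list
    rows: list


def normalize_header(header):
    """Compare headers case-insensitively, ignoring extra whitespace."""
    return tuple(" ".join(cell.split()).lower() for cell in header)


def stitch_fragments(fragments):
    """Merge fragments on consecutive pages that share the same header.

    Assumption: a fragment continues the previous table only when it is on
    the very next page and its normalized header matches.
    """
    stitched = []
    for frag in sorted(fragments, key=lambda f: f.page):
        last = stitched[-1] if stitched else None
        if (
            last is not None
            and frag.page == last["pages"][-1] + 1
            and normalize_header(frag.header) == normalize_header(last["header"])
        ):
            # Same table continued on the next page: append rows only.
            last["rows"].extend(frag.rows)
            last["pages"].append(frag.page)
        else:
            # New table: keep its header once and start collecting rows.
            stitched.append(
                {"header": list(frag.header), "rows": list(frag.rows), "pages": [frag.page]}
            )
    return stitched


# Placeholder fragments: two pages of one note, then an unrelated table.
fragments = [
    TableFragment(131, ["Particulars", "FY24", "FY23"], [["Row B", "2", "1"]]),
    TableFragment(130, ["particulars ", "FY24", "FY23"], [["Row A", "4", "3"]]),
    TableFragment(132, ["Segment", "Revenue"], [["Segment X", "10"]]),
]
tables = stitch_fragments(fragments)
```

Here the fragments on pages 130 and 131 end up in one table, while the page 132 fragment starts a new one because its header differs.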
// SOURCE-LINKED OUTPUT
Every cell in the output Excel workbook carries a source reference (PDF page, table index, row/column coordinates), so any value can be traced back to its location in the original document.
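As a rough sketch, one source-linked record could be modelled as below, using the field names from the schema in this section. The field types, the `CellRecord` class name, the `to_sources_json` helper, and the sample values are assumptions for illustration; the actual layout of sources.json may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CellRecord:
    """One extracted cell plus its source reference (types are assumed)."""
    file: str
    page: int
    table_id: str
    row_id: str
    cell_id: str
    label: str
    value: float
    unit: str
    period: str


def to_sources_json(records):
    """Serialize records to a JSON array, keeping symbols such as ₹ readable."""
    return json.dumps([asdict(r) for r in records], indent=2, ensure_ascii=False)


# Placeholder record; the identifiers and the value are illustrative only.
record = CellRecord(
    file="annual_report_FY24.pdf",
    page=118,
    table_id="t1",
    row_id="r1",
    cell_id="r1c1",
    label="Example line item",
    value=100.0,
    unit="₹ Cr",
    period="FY24",
)
payload = to_sources_json([record])
```

Keeping `file`, `page`, and `table_id` on every record means a single row is enough to locate the value in the PDF without consulting any other table.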
{ file, page, table_id, row_id, cell_id, label, value, unit, period }
// FAQ
Does it handle multi-page tables?
Yes. Tables that continue across pages are stitched into a single sheet by matching column headers and units.
Are units and currencies preserved?
Units (₹ Cr, ₹ Mn, USD Mn, %, bps) are kept in a dedicated metadata column so they are not lost during normalization.
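One way to picture the dedicated unit column is the sketch below: the printed figure is converted to a number while the unit travels alongside it unchanged. The `parse_amount` and `normalize_row` helpers are hypothetical, and treating parentheses as a negative sign is an assumption based on common accounting notation, not a documented behaviour of the tool.

```python
def parse_amount(text):
    """Convert a printed figure such as '1,234.50' or '(45)' to a float.

    Assumption: thousands separators are commas and parentheses mark a
    negative amount.
    """
    cleaned = text.strip().replace(",", "")
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    number = float(cleaned)
    return -number if negative else number


def normalize_row(label, raw_value, unit):
    """Return a row whose numeric value and unit live in separate fields."""
    return {"label": label, "value": parse_amount(raw_value), "unit": unit}


# Placeholder rows; labels and figures are illustrative only.
rows = [
    normalize_row("Example income", "1,234.50", "₹ Cr"),
    normalize_row("Example expense", "(45)", "USD Mn"),
]
```

Because the unit is never folded into the number, rows reported in ₹ Cr and USD Mn can sit in the same sheet without being confused or silently rescaled.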
What output formats are supported?
Excel (.xlsx) with one sheet per table, plus CSV and JSON. A sources.json file maps each cell back to its PDF coordinates.
// RELATED TOOLS
Financial Statements
Financial Statement Extractor
Pull the three core financial statements — Balance Sheet, P&L, and Cash Flow — into a clean, comparable workbook.
Financial Statements
Notes to Accounts Extractor
Pull individual notes from the Notes-to-Accounts section into structured tables and text blocks.
PDF to Excel
PDF Table to Excel Converter
Convert any table inside a PDF into a clean Excel sheet, preserving headers, merged cells, and units.
// EARLY ACCESS
Get early access to the Annual Report Table Extractor
Paper Data is currently in private beta. Request access to start converting your financial documents into source-linked tables.
