Researchers have developed the Page-Object Table Transformer (POTATR), a lightweight image-to-graph model for accurate, efficient table extraction from large-scale documents. With only 29 million parameters, POTATR achieves a GriTSCon score of 0.964 on the PubTables-v2 Single Pages benchmark, outperforming larger models while being over 130 times faster and approximately 300 times cheaper. Its output is spatially grounded, which allows visual verification and geometric text assignment. For scanned documents it can be paired with external OCR, and techniques such as cross-page merging extend it to full-document table extraction. The code and models are slated for public release.
POTATR: Innovative Lightweight Model Achieves Superior Page-Level Table Extraction
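To make "geometric text assignment" concrete, here is a minimal sketch of one common way to do it: each word box (from a PDF text layer or an external OCR engine) is placed in the table cell that covers most of its area. This is not POTATR's code, and the summary does not describe the model's output format. The `Box` class, the `(row, column)` cell keys, the `assign_words_to_cells` function, and the 0.5 overlap threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def intersection(self, other: "Box") -> float:
        # Overlap width and height; negative values mean no overlap.
        w = min(self.x1, other.x1) - max(self.x0, other.x0)
        h = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(0.0, w) * max(0.0, h)


def assign_words_to_cells(cells, words, min_overlap=0.5):
    """Assign each word to the cell covering the largest share of its box.

    cells: dict mapping a cell id (here an assumed (row, col) tuple) to a Box.
    words: list of (text, Box) pairs from a text layer or OCR.
    min_overlap: assumed threshold; words covered less than this are dropped.
    """
    assigned = {cell_id: [] for cell_id in cells}
    for text, wbox in words:
        if wbox.area() == 0:
            continue  # skip degenerate boxes
        best_id, best_frac = None, 0.0
        for cell_id, cbox in cells.items():
            frac = wbox.intersection(cbox) / wbox.area()
            if frac > best_frac:
                best_id, best_frac = cell_id, frac
        if best_id is not None and best_frac >= min_overlap:
            assigned[best_id].append((wbox.y0, wbox.x0, text))
    # Naive reading order: sort by top edge, then left edge.
    return {
        cell_id: " ".join(text for _, _, text in sorted(items))
        for cell_id, items in assigned.items()
    }


cells = {(0, 0): Box(0, 0, 100, 20), (0, 1): Box(100, 0, 200, 20)}
words = [
    ("Score", Box(60, 5, 95, 15)),
    ("Model", Box(10, 5, 50, 15)),
    ("0.964", Box(110, 5, 150, 15)),
    ("stray", Box(300, 5, 340, 15)),  # lies outside every cell
]
result = assign_words_to_cells(cells, words)
print(result)
```

Because cell boxes are predicted in page coordinates, the same boxes can be drawn over the page image for visual verification. A word that falls outside every cell, like `stray` here, is simply left unassigned.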
