A hierarchically structured multi-task form understanding benchmark.
A dataset for the document understanding community.
The benchmark covers five tasks: word to text-line merging, text-line to entity merging, entity category classification, item table localization, and entity-based full-document hierarchical structure recovery.
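
To make the granularity levels behind these tasks concrete, here is a minimal Python sketch of how words, text lines, and entities might nest. It is an illustration only, not the benchmark's actual annotation schema: the class names, the `(x0, y0, x1, y1)` box convention, the category labels ("header", "question", "answer"), and the helpers `union_box`, `depth`, and `build_example` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical box convention: (x0, y0, x1, y1) in pixel coordinates.
Box = Tuple[int, int, int, int]


def union_box(boxes: List[Box]) -> Box:
    """Smallest box enclosing all given boxes."""
    xs0, ys0, xs1, ys1 = zip(*boxes)
    return (min(xs0), min(ys0), max(xs1), max(ys1))


@dataclass
class Word:
    text: str
    box: Box


@dataclass
class TextLine:
    # Result of word to text-line merging.
    words: List[Word]

    def text(self) -> str:
        return " ".join(w.text for w in self.words)

    def box(self) -> Box:
        return union_box([w.box for w in self.words])


@dataclass
class Entity:
    # Result of text-line to entity merging, plus a category label
    # (the label names used below are assumed, not the official set).
    lines: List[TextLine]
    category: str
    # Child entities encode the document hierarchy.
    children: List["Entity"] = field(default_factory=list)

    def text(self) -> str:
        return " ".join(line.text() for line in self.lines)


def depth(entity: Entity) -> int:
    """Number of levels in the hierarchy rooted at this entity."""
    return 1 + max((depth(c) for c in entity.children), default=0)


def build_example() -> Entity:
    """Build a toy form: a header with one question/answer pair."""
    answer = Entity(
        [TextLine([Word("Jane", (70, 40, 105, 55)),
                   Word("Doe", (110, 40, 140, 55))])],
        "answer",
    )
    question = Entity(
        [TextLine([Word("Name:", (10, 40, 60, 55))])],
        "question",
        [answer],
    )
    return Entity(
        [TextLine([Word("Applicant", (10, 10, 90, 30))])],
        "header",
        [question],
    )
```

In this sketch, hierarchical structure recovery corresponds to predicting the `children` links, and a region such as an item table could be localized with `union_box` over the boxes of its member entities. Both readings are simplifications chosen for the example.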