SAIL: Sample-Centric In-Context Learning for Document Information Extraction
Authors: Jinyu Zhang, Zhiyuan You, Jize Wang, Xinyi Le
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments): 4.1 Datasets, Metrics, and Details; 4.2 Results on DIE Benchmarks; 4.3 Comparison with Multi-modal LLMs; 4.4 Ablation Studies |
| Researcher Affiliation | Academia | Jinyu Zhang¹*, Zhiyuan You²*, Jize Wang¹, Xinyi Le¹ (¹Shanghai Jiao Tong University; ²The Chinese University of Hong Kong) |
| Pseudocode | No | The paper includes illustrations of the framework (Figure 2) but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/sky-goldfish/SAIL |
| Open Datasets | Yes | FUNSD (Jaume, Ekenel, and Thiran 2019) is a dataset for understanding the content of tables in scanned documents. ... SROIE (Huang et al. 2019) is another scanned receipt understanding dataset... CORD (Park et al. 2019) is a receipt understanding dataset... |
| Dataset Splits | Yes | FUNSD (Jaume, Ekenel, and Thiran 2019) is a dataset for understanding the content of tables in scanned documents. It contains 149 tables and 7,411 entities in the training set, and 50 tables and 2,332 entities in the test set. ... SROIE (Huang et al. 2019) is another scanned receipt understanding dataset, containing 626 receipts in the training set and 347 in the test set. ... CORD (Park et al. 2019) is a receipt understanding dataset that contains 800 training data, 100 test data, and 100 validation data. |
| Hardware Specification | No | The paper mentions using specific LLM APIs (GPT-3.5, GPT-4o) and a specific version of an open-source model (chatglm3-6b-32k) but does not provide details on the hardware used to run experiments or host these models/APIs. |
| Software Dependencies | No | The paper mentions using ChatGLM3 (chatglm3-6b-32k version), GPT-3.5 (gpt-3.5-turbo API version), GPT-4 (gpt-4o API version), and Sentence-BERT, but does not provide version numbers for ancillary software dependencies such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | For GPT-3.5 and GPT-4o, we set the temperature parameter to 0 to enhance the reproducibility. In our experiments, for each test document, we select four textually similar documents and four layout-similar documents as examples due to the limitation of prompt token number. Furthermore, for each filtered test entity, we choose four textually similar entity examples. |
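The setup row above describes picking the top-4 most similar documents (textually and by layout) and top-4 similar entities as in-context examples. A minimal sketch of that top-k retrieval step, assuming precomputed embeddings (the paper uses Sentence-BERT for textual similarity; the function name and the plain-NumPy cosine computation here are illustrative assumptions, not the authors' code):

```python
import numpy as np

def top_k_similar(query_emb, candidate_embs, k=4):
    """Return indices of the k candidates most cosine-similar to the query.

    query_emb: shape (d,) embedding of the test document or entity.
    candidate_embs: shape (n, d) embeddings of the training-set examples.
    """
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = c @ q                   # one cosine score per candidate
    return np.argsort(-sims)[:k]   # indices of the k highest-scoring examples
```

In the reported setup this selection would run with k=4 three times per test document: once over text embeddings of training documents, once over layout representations, and once over entity-level text embeddings for each filtered test entity.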