Information Lattice Learning

Authors: Haizi Yu, James A. Evans, Lav R. Varshney

JAIR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present applications in knowledge discovery, using ILL to distill music theory from scores and chemical laws from molecules and further revealing connections between them. We show ILL's efficacy and interpretability on benchmarks and assessments, as well as a demonstration of ILL-enhanced classifiers achieving human-level digit recognition using only one or a few MNIST training examples (1–10 per class)."
Researcher Affiliation | Academia | Haizi Yu (EMAIL) and James A. Evans (EMAIL), Knowledge Lab, University of Chicago, 1155 E 60th Street, Chicago, IL 60637 USA; Lav R. Varshney (EMAIL), Coordinated Science Lab, University of Illinois at Urbana-Champaign, 1308 W Main Street, Urbana, IL 61801 USA
Pseudocode | Yes | Algorithm 1: Add_partition(Pτ, P): adds a tagged partition Pτ to a partition poset (P, ⪯)
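The paper's Algorithm 1 is not reproduced in this report. As a rough illustration of the operation it names, here is a minimal sketch of inserting a tagged partition into a poset of partitions ordered by refinement; the set-of-frozensets representation, function names, and returned edge list are all assumptions, not the paper's code:

```python
# Hypothetical sketch of an Add_partition-style operation: insert a tagged
# partition into a partition poset ordered by refinement (P refines Q iff
# every block of P is contained in some block of Q).

def refines(p, q):
    """True if partition p refines partition q."""
    return all(any(block <= other for other in q) for block in p)

def add_partition(tagged, poset):
    """Add (tag, partition) to poset (a dict: tag -> partition).

    Returns the list of (finer_tag, coarser_tag) refinement relations
    induced by the new element.
    """
    tag, part = tagged
    edges = []
    for other_tag, other in poset.items():
        if refines(part, other):
            edges.append((tag, other_tag))
        if refines(other, part):
            edges.append((other_tag, tag))
    poset[tag] = part
    return edges
```

For example, adding a coarse partition {{0,1},{2,3}} to a poset already containing the finer {{0},{1},{2,3}} records a single edge from the finer to the coarser partition.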
Open Source Code | No | The paper does not provide an explicit statement of, or a link to, its own source code implementation. While it references external tools (e.g., Harmonia by Illiac Software, Inc.), it does not offer its own code for the methodology described.
Open Datasets | Yes | "We present applications in knowledge discovery, using ILL to distill music theory from scores and chemical laws from molecules and further revealing connections between them. We show ILL's efficacy and interpretability on benchmarks and assessments, as well as a demonstration of ILL-enhanced classifiers achieving human-level digit recognition using only one or a few MNIST training examples (1–10 per class). ... Signals are probability distributions of chords encoded as vectors of MIDI keys. Figure 8a shows such a signal: the frequency distribution of two-note chords extracted from the soprano and bass parts of Bach's C-score chorales (Illiac Software, Inc., 2020) ... Signals are Boolean-valued functions indicating the presence of compound formulae encoded as vectors of atomic numbers in a molecule database. Figure 8b shows a signal attained by collecting two-element compounds from the Materials Project database (Jain et al., 2013)."
Dataset Splits | Yes | "We test the performance of our ILL-enhanced Nearest-Neighbor and TextCaps, as well as the vanilla Nearest-Neighbor using pixel-wise Euclidean distance as baseline, in the regime of only a few training examples per class. With the training size growing from 1 image per class, we run the three models on the same training set and collect their prediction accuracies on the entire MNIST test set. ... In Figure 15, training examples are selected as the first k (k = 1, 2, ..., 20) images in the training set per class, which may be viewed as a random sample. One may carefully select (by hand or by algorithm) a training subset comprising distinct prototype digits to mimic what humans might naturally do: observe more but select less to memorize. Using one such selected training subset consisting of only 51 images in total (i.e., 5 per class), ILL-enhanced Nearest-Neighbor can still achieve 90% test accuracy"
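The vanilla baseline quoted above, a nearest-neighbor classifier under pixel-wise Euclidean distance with a few training images per class, can be sketched as follows. MNIST loading is omitted and the function name and toy arrays are illustrative assumptions:

```python
# Minimal sketch of the vanilla Nearest-Neighbor baseline: label each test
# image by the class of its closest training image under pixel-wise
# Euclidean (L2) distance on raw pixel vectors.
import numpy as np

def nearest_neighbor_predict(train_images, train_labels, test_images):
    """train_images: (n, d) floats; train_labels: (n,); test_images: (m, d)."""
    preds = []
    for x in test_images:
        dists = np.linalg.norm(train_images - x, axis=1)  # distance to each exemplar
        preds.append(train_labels[np.argmin(dists)])      # nearest neighbor's label
    return np.array(preds)
```

With k images per class, `train_images` simply holds those 10k flattened digits; the few-shot regime in the quote corresponds to k between 1 and 20.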
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only mentions general computing contexts like "running on the same training and test sets" without specifying the underlying hardware.
Software Dependencies | No | The paper mentions "python numpy.nan" in Appendix F as part of the instructions for an assignment, but it does not specify version numbers for Python, NumPy, or any other significant software libraries or frameworks used for the main experimental work. This is not sufficient to meet the requirement for specific versioned software dependencies.
Experiment Setup | Yes | "For these two illustrations, we fix the same priors F, S in (8)–(9), thus the same lattice. We fix the same parameters: the ϵ-path is 0.2 < 3.2 < 6.2 < ··· (tip: a small initial offset, e.g., 0.2, is used to achieve nearly-deterministic rules) and γ is 20% of the initial signal gap. This fixed setting is used to show generality and for comparison."