Harmonic Loss Trains Interpretable AI Models
Authors: David D. Baek, Ziming Liu, Riya Tyagi, Max Tegmark
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first validate the performance of harmonic models across algorithmic, vision, and language datasets. Through extensive experiments, we demonstrate that models trained with harmonic loss perform better than standard models by: (a) enhancing interpretability (i.e. geometry of representations), (b) requiring less data for generalization, and (c) reducing grokking. |
| Researcher Affiliation | Academia | David D. Baek EMAIL Massachusetts Institute of Technology Ziming Liu EMAIL Massachusetts Institute of Technology Riya Tyagi EMAIL Massachusetts Institute of Technology Max Tegmark EMAIL Massachusetts Institute of Technology |
| Pseudocode | No | The paper describes methods using mathematical formulas and textual descriptions, for example, in Section 3 titled 'Harmonic Loss', but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or provide a link to a code repository for its methodology. |
| Open Datasets | Yes | We demonstrate the performance of harmonic models on the vision task of MNIST digit classification. We pre-train a GPT-2 small model (128M, based on nanoGPT) on OpenWebText. Harmonic loss slightly outperforms cross-entropy loss on the ImageNet benchmark. We evaluate two tasks, CoLA (linguistic acceptability) (Warstadt et al., 2018) and SST-2 (sentence sentiment classification) (Socher et al., 2013). |
| Dataset Splits | No | The paper implies the existence of splits: 'test accuracy as a function of Train Fraction' for the algorithmic datasets, 'validation losses' for GPT-2, a 'validation dataset' for SST-2 and CoLA, and 'Val Acc' for ImageNet. However, it does not state split percentages, sample counts, or split methodology for any experiment, so reproducing the splits requires external knowledge of the standard benchmark splits or assumptions for the custom ones. |
| Hardware Specification | Yes | We use 8 V100 GPUs, choose block size 1024, batch size 480 blocks. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and that the GPT-2 model is 'based on nanoGPT', but does not provide version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We trained the MLP models for 7000 epochs and the transformers for 10000 epochs. For all four models, we used the AdamW optimizer with a learning rate of 2×10⁻³, a weight decay of 10⁻², and an L2 regularization on the embeddings with strength 0.01. For MNIST: the models were trained with a batch size of 64, a learning rate of 0.001, for 10 epochs. For GPT-2: we use 8 V100 GPUs, block size 1024, and a batch size of 480 blocks. We use the Adam optimizer with β₁ = 0.9, β₂ = 0.95. For the harmonic loss, we choose n = √768 ≈ 28. We use a linear warmup learning-rate schedule for 2k (1k) steps up to a maximum learning rate of 6×10⁻⁴ (6×10⁻³), and a cosine decay schedule from 2k to 10k steps, ending at a learning rate of 3×10⁻⁵ (3×10⁻⁴). |
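The Pseudocode row notes that the paper defines harmonic loss only through formulas in Section 3. As a rough aid, here is a minimal NumPy sketch of the idea described there: class logits are Euclidean distances from the input representation to each class weight vector, and a "harmonic max" (inverse distance raised to the power n, normalized) replaces softmax. The function name and exact numerical conventions here are assumptions, not the authors' released code.

```python
import numpy as np

def harmonic_loss(x, W, target, n=1.0, eps=1e-12):
    """Sketch of harmonic loss for a single example.

    x      : (d,) input representation
    W      : (C, d) class weight vectors
    target : index of the true class
    n      : harmonic exponent (the paper scales it with model width)
    """
    # Logit for class i is the Euclidean distance d_i = ||w_i - x||,
    # so smaller distance means a more likely class.
    d = np.linalg.norm(W - x, axis=1)
    # "Harmonic max": p_i = d_i^{-n} / sum_j d_j^{-n}, in place of softmax.
    inv = (d + eps) ** (-n)
    p = inv / inv.sum()
    # Negative log-likelihood of the true class.
    return -np.log(p[target] + eps)
```

Classes whose weight vectors sit closer to the input get higher probability, which is what makes the learned weight vectors directly interpretable as class prototypes.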
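The GPT-2 learning-rate schedule quoted in the Experiment Setup row (linear warmup for 2k steps to a peak rate, then cosine decay from 2k to 10k steps down to a floor) can be sketched as follows. The function name and endpoint conventions (e.g., the exact value at step 0) are assumptions; only the warmup length, decay window, and peak/floor rates come from the paper.

```python
import math

def lr_at_step(step, warmup_steps=2000, total_steps=10000,
               max_lr=6e-4, min_lr=3e-5):
    """Linear warmup to max_lr, then cosine decay to min_lr.

    Defaults follow the cross-entropy run quoted in the paper; the
    harmonic-loss run reportedly used 1k warmup steps and 10x the rates.
    """
    if step < warmup_steps:
        # Linear ramp from ~0 up to max_lr over the warmup window.
        return max_lr * (step + 1) / warmup_steps
    # Cosine decay: t goes 0 -> 1 over [warmup_steps, total_steps].
    t = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * t))
```

The cosine term starts at max_lr (cos 0 = 1) exactly where the warmup ends, so the schedule is continuous at the transition.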