The Complexity of Learning Sparse Superposed Features with Feedback

Authors: Akash Kumar

ICML 2025

Reproducibility (Variable: Result — LLM Response)
Research Type: Experimental — "We validate our theoretical findings through experiments on two distinct applications: feature recovery from Recursive Feature Machines and dictionary extraction from sparse autoencoders trained on Large Language Models."
Researcher Affiliation: Academia — "Department of Computer Science & Engineering, University of California, San Diego, USA. Correspondence to: Akash Kumar <EMAIL>."
Pseudocode: Yes — Algorithm 1 (Model of feature learning with feedback; Given: representation space V ⊆ ℝ^p, feature family M_F), Algorithm 2 (Feature learning with sampled representations; Given: representation space V ⊆ ℝ^p, distribution over representations D_V, feature family M_F), Algorithm 3 (Optimization via gradient descent).
Open Source Code: Yes — https://github.com/akashkumar-d/learnsparsefeatureswithfeedback.git
Open Datasets: Yes — "We validate our theoretical findings through experiments on two distinct applications: feature recovery from Recursive Feature Machines and dictionary extraction from sparse autoencoders trained on Large Language Models. ... dictionaries from trained sparse autoencoders on Pythia-70M (Biderman et al., 2023) and Board Game Models (Karvonen et al., 2024)"
Dataset Splits: No — "Inputs x ∈ ℝ^10 are sampled from a Gaussian distribution N(0, 0.5 I_10). We train an RFM classifier on 5000 training samples to obtain Φ, and the teaching agent has access to this feature matrix for generating feedback." The paper describes the training samples used to generate the target feature matrix, but it does not specify train/test/validation splits for evaluating the feedback-learning process itself.
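The data-generation step quoted above can be sketched as follows. This is a minimal reconstruction of the sampling only; the RFM classifier, its labels, and the resulting feature matrix Φ are not reproduced, and the random seed is an arbitrary choice.

```python
import numpy as np

# Sample n_train inputs x in R^10 from N(0, 0.5 * I_10), as quoted above.
# The seed and variable names are illustrative assumptions.
rng = np.random.default_rng(0)

d, n_train = 10, 5000          # input dimension and training-set size from the quote
cov = 0.5 * np.eye(d)          # covariance 0.5 * I_10
X_train = rng.multivariate_normal(np.zeros(d), cov, size=n_train)
```

An RFM classifier would then be fit on `X_train` (with task labels the report does not specify) to produce the target feature matrix used by the teaching agent.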
Hardware Specification: No — The paper does not describe the hardware used to run its experiments, such as specific GPU models, CPU models, or cloud resources.
Software Dependencies: No — "We utilize the cvxpy package to solve constraints... We use the publicly available repository for dictionary learning via sparse autoencoders on neural network activations (Marks et al., 2024a)." The paper names the cvxpy package and the Adam optimizer but gives no version numbers, and it points to a repository without pinning the specific software versions used for its methodology.
Experiment Setup: Yes — "Algorithm 3: Optimization via Gradient Descent ... L_reg(U) = λ‖U‖²_F, where B represents the batch of samples, λ = 10⁻⁴ is the regularization coefficient, and y = e_1 is the fixed unit vector. ... Update U using the Adam optimizer with gradient clipping. 4. Enforce fixed entries in U after each update (the entry U[0, 0] is fixed to 1)."
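The update step quoted above can be sketched as follows. Only the regularizer L_reg(U) = λ‖U‖²_F with λ = 10⁻⁴, gradient clipping, and the fixed entry U[0, 0] = 1 come from the paper's description; the task loss, the problem dimensions, and the hand-rolled Adam update are placeholder assumptions (the paper presumably uses a library optimizer).

```python
import numpy as np

rng = np.random.default_rng(0)
p, k = 5, 5
U = rng.standard_normal((p, k))
U[0, 0] = 1.0                      # fixed entry, per the quoted setup
U_init = U.copy()

lam = 1e-4                         # regularization coefficient lambda from the quote
lr, beta1, beta2, eps = 1e-2, 0.9, 0.999, 1e-8
clip_norm = 1.0                    # clipping threshold (assumption)
m, v = np.zeros_like(U), np.zeros_like(U)

target = np.eye(p, k)              # placeholder target for an illustrative quadratic loss

for t in range(1, 301):
    # gradient of the placeholder loss ||U - target||_F^2 plus the L2 regularizer
    grad = 2.0 * (U - target) + 2.0 * lam * U
    # gradient clipping by global norm
    gnorm = np.linalg.norm(grad)
    if gnorm > clip_norm:
        grad *= clip_norm / gnorm
    # minimal Adam update with bias correction
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    U -= lr * m_hat / (np.sqrt(v_hat) + eps)
    # enforce fixed entries in U after each update
    U[0, 0] = 1.0
```

Re-imposing `U[0, 0] = 1.0` after every optimizer step is the simplest way to honor a fixed entry without building the constraint into the parametrization.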