The Complexity of Learning Sparse Superposed Features with Feedback
Authors: Akash Kumar
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical findings through experiments on two distinct applications: feature recovery from Recursive Feature Machines and dictionary extraction from sparse autoencoders trained on Large Language Models. |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, University of California, San Diego, USA. Correspondence to: Akash Kumar <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Model of Feature Learning with Feedback): Given representation space V ⊆ R^p, feature family M_F. Algorithm 2 (Feature Learning with Sampled Representations): Given representation space V ⊆ R^p, distribution over representations D_V, feature family M_F. Algorithm 3: Optimization via Gradient Descent. |
| Open Source Code | Yes | https://github.com/akashkumar-d/learnsparsefeatureswithfeedback.git |
| Open Datasets | Yes | We validate our theoretical findings through experiments on two distinct applications: feature recovery from Recursive Feature Machines and dictionary extraction from sparse autoencoders trained on Large Language Models. ... dictionaries from trained sparse autoencoders on Pythia-70M (Biderman et al., 2023) and Board Game Models (Karvonen et al., 2024) |
| Dataset Splits | No | Inputs x ∈ R^10 are sampled from a Gaussian distribution N(0, 0.5·I_10). We train an RFM classifier on 5,000 training samples to obtain Φ, and the teaching agent has access to this feature matrix for generating feedback. The paper mentions training samples for generating the target feature matrix but does not provide specific train/test/validation splits for the evaluation of the feedback learning process itself. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU models, CPU models, or cloud resources. |
| Software Dependencies | No | We utilize the cvxpy package to solve constraints... We use the publicly available repository for dictionary learning via sparse autoencoders on neural network activations (Marks et al., 2024a). The paper mentions 'cvxpy package' and 'Adam optimizer' but does not specify their version numbers. It also refers to a repository but not specific software versions used for the paper's methodology. |
| Experiment Setup | Yes | Algorithm 3: Optimization via Gradient Descent ... L_reg(U) = λ‖U‖_F², where B represents the batch of samples, λ = 10⁻⁴ is the regularization coefficient, and y = e_1 is the fixed unit vector. ... Update U using the Adam optimizer with gradient clipping. 4. Enforce fixed entries in U after each update (e.g., U[0,0] is fixed to 1). |
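The experiment-setup excerpt describes a regularized objective L_reg(U) = λ‖U‖_F² with λ = 10⁻⁴, Adam updates with gradient clipping, and fixed entries of U re-imposed after every step. The sketch below illustrates that update loop in simplified form: plain gradient descent stands in for Adam, and `feedback_loss_grad` is a hypothetical stand-in for the paper's actual feedback loss, so this is an illustration of the loop structure, not the paper's implementation.

```python
import numpy as np

def feedback_loss_grad(U, Phi):
    """Hypothetical stand-in gradient: pulls U toward a target feature matrix Phi."""
    return U - Phi

def train_step(U, Phi, lr=1e-2, lam=1e-4, clip_norm=1.0):
    # Gradient of the loss plus the regularizer lam * ||U||_F^2.
    grad = feedback_loss_grad(U, Phi) + 2 * lam * U
    # Gradient clipping by global norm.
    norm = np.linalg.norm(grad)
    if norm > clip_norm:
        grad *= clip_norm / norm
    U = U - lr * grad
    # Re-impose the fixed entry after the update, as in the paper's step 4.
    U[0, 0] = 1.0
    return U

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 4))
Phi = rng.normal(size=(4, 4))
for _ in range(500):
    U = train_step(U, Phi)
print(U[0, 0])  # the fixed entry remains exactly 1.0
```

Re-imposing the constraint after each unconstrained step is a simple form of projected gradient descent; the same pattern works unchanged if the plain update is swapped for an Adam optimizer.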