Contextures: Representations from Contexts
Authors: Runtian Zhai, Kai Yang, Burak Varıcı, Che-Ping Tsai, J. Zico Kolter, Pradeep Kumar Ravikumar
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We substantiate this extrapolation with an experiment on the abalone dataset from OpenML. We compare two d-dimensional representations... We now conduct an experiment that examines τ_d on two datasets. First, we use the abalone dataset... Second, we use the MNIST dataset... In Table 1 we report the correlation between τ and err_d over all 140 contexts from the 4 types on 28 classification (Cls) and regression (Reg) datasets from OpenML. |
| Researcher Affiliation | Academia | 1Carnegie Mellon University, Pittsburgh, PA, USA 2Peking University, Beijing, China. |
| Pseudocode | No | The paper describes a procedure in Appendix D, stating 'This can be efficiently done with the following procedure: (i) Train an encoder Φ... (ii) Estimate the covariance matrix... (iii) Estimate BΦ... (iv) Solve the generalized eigenvalue problem...', but it is not formatted as a distinct pseudocode or algorithm block with a specific label. |
| Open Source Code | Yes | The code for this paper can be found at https://colab.research.google.com/drive/1GdJ0Yn-PKiKfkZIwUuon3WpTpbNWEtAO?usp=sharing. |
| Open Datasets | Yes | We substantiate this extrapolation with an experiment on the abalone dataset from OpenML. We use the MNIST dataset. In Table 1 we report the correlation between τ and err_d over all 140 contexts from the 4 types on 28 classification (Cls) and regression (Reg) datasets from OpenML (Vanschoren et al., 2013). |
| Dataset Splits | Yes | We use the abalone dataset from Open ML, and split the dataset into a pretrain set, a downstream train set and a downstream test set by 70%-15%-15%. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware used to run its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions the 'AdamW optimizer (Kingma & Ba, 2015; Loshchilov & Hutter, 2017)' and 'VICReg (Bardes et al., 2022)' but does not provide specific version numbers for these or any other software components such as programming languages or libraries. |
| Experiment Setup | Yes | The embedding dimension is set to d = 128. For the second encoder, we train a fully-connected neural network with Tanh activation and skip connections for a sufficient number of steps with full-batch AdamW... For each width and depth, we run the experiments 15 times with different random initializations. We set the output dimension of the neural network to be d_1 = 512. The downstream linear predictor is fit via ridge regression. Hyperparameter grid search is conducted at both the encoder-learning and downstream stages. We choose β = 1 and d_0 = 512 in our experiments. |
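The evaluation pipeline reported above (a 70%-15%-15% pretrain / downstream-train / downstream-test split, an encoder, and a ridge-regression linear probe with β = 1) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the encoder here is a hypothetical fixed random Tanh layer (the paper trains a neural network on the pretrain split), and synthetic data stands in for the OpenML abalone dataset.

```python
# Sketch of the downstream evaluation protocol described in the paper:
# 70%-15%-15% split, encode, then fit a ridge-regression linear probe.
# Synthetic data and a random encoder are placeholders for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 1000, 8, 128  # samples, raw features, embedding dim (d = 128 as in the paper)
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

# 70%-15%-15% split into pretrain / downstream train / downstream test.
idx = rng.permutation(n)
n_pre, n_tr = int(0.70 * n), int(0.15 * n)
pre, tr, te = idx[:n_pre], idx[n_pre:n_pre + n_tr], idx[n_pre + n_tr:]

# Hypothetical encoder: a fixed random linear map with Tanh activation.
# (The paper pretrains an encoder on the pretrain split; this is a stand-in.)
W = rng.normal(size=(p, d)) / np.sqrt(p)
def encode(X):
    return np.tanh(X @ W)

# Downstream linear probe: ridge regression with penalty beta = 1,
# solved in closed form on the downstream train split.
beta = 1.0
Z_tr, Z_te = encode(X[tr]), encode(X[te])
w = np.linalg.solve(Z_tr.T @ Z_tr + beta * np.eye(d), Z_tr.T @ y[tr])
test_mse = np.mean((Z_te @ w - y[te]) ** 2)
print(f"downstream test MSE: {test_mse:.4f}")
```

In the actual experiments the probe's hyperparameters are chosen by grid search at both the encoder-learning and downstream stages; this sketch fixes β = 1 for brevity.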