Contextures: Representations from Contexts

Authors: Runtian Zhai, Kai Yang, Burak Varıcı, Che-Ping Tsai, J. Zico Kolter, Pradeep Kumar Ravikumar

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We substantiate this extrapolation with an experiment on the abalone dataset from OpenML. We compare two d-dimensional representations... We now conduct an experiment that examines τd on two datasets. First, we use the abalone dataset... Second, we use the MNIST dataset... In Table 1 we report the correlation between τ and errd over all 140 contexts from the 4 types on 28 classification (Cls) and regression (Reg) datasets from OpenML."
Researcher Affiliation | Academia | "1Carnegie Mellon University, Pittsburgh, PA, USA; 2Peking University, Beijing, China."
Pseudocode | No | The paper describes a procedure in Appendix D, stating 'This can be efficiently done with the following procedure: (i) Train an encoder Φ... (ii) Estimate the covariance matrix... (iii) Estimate BΦ... (iv) Solve the generalized eigenvalue problem...', but it is not formatted as a distinct pseudocode or algorithm block with a specific label.
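After the encoder is trained, the quoted procedure reduces to estimating a covariance matrix and solving a generalized eigenvalue problem. A minimal sketch of steps (ii)-(iv), assuming encoder outputs are available as a matrix `phi` and treating `b_phi` as a hypothetical stand-in for the paper's BΦ estimate (step (i), training the encoder, and the paper's actual BΦ estimator in step (iii) are omitted):

```python
import numpy as np
from scipy.linalg import eigh


def covariance_and_eig(phi, b_phi):
    """Steps (ii)-(iv) of the procedure quoted above.

    phi   : (n, d) matrix of encoder outputs on a sample
    b_phi : (d, d) symmetric matrix standing in for the B_Phi estimate
            (hypothetical placeholder, not the paper's estimator)
    """
    n = phi.shape[0]
    cov = phi.T @ phi / n  # (ii) estimate the covariance matrix
    # (iv) solve the generalized eigenvalue problem  b_phi v = s * cov v
    eigvals, eigvecs = eigh(b_phi, cov)
    return eigvals, eigvecs
```

`scipy.linalg.eigh(a, b)` handles the generalized problem directly when the covariance estimate is positive definite, which is the typical case for n much larger than d.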
Open Source Code | Yes | "The code for this paper can be found at https://colab.research.google.com/drive/1GdJ0Yn-PKiKfkZIwUuon3WpTpbNWEtAO?usp=sharing."
Open Datasets | Yes | "We substantiate this extrapolation with an experiment on the abalone dataset from OpenML. We use the MNIST dataset. In Table 1 we report the correlation between τ and errd over all 140 contexts from the 4 types on 28 classification (Cls) and regression (Reg) datasets from OpenML (Vanschoren et al., 2013)."
Dataset Splits | Yes | "We use the abalone dataset from OpenML, and split the dataset into a pretrain set, a downstream train set and a downstream test set by 70%-15%-15%."
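The 70%-15%-15% split described above amounts to a simple index partition; a minimal sketch (the seed and function name are illustrative, not taken from the paper):

```python
import numpy as np


def three_way_split(n, seed=0):
    """70%-15%-15% pretrain / downstream-train / downstream-test index
    split, mirroring the abalone setup quoted above.  The seed is
    arbitrary; the paper does not specify one."""
    idx = np.random.default_rng(seed).permutation(n)
    n_pre, n_tr = int(0.70 * n), int(0.15 * n)
    return idx[:n_pre], idx[n_pre:n_pre + n_tr], idx[n_pre + n_tr:]
```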
Hardware Specification | No | The paper does not explicitly describe any specific hardware used to run its experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions the 'AdamW optimizer (Kingma & Ba, 2015; Loshchilov & Hutter, 2017)' and 'VICReg (Bardes et al., 2022)' but does not provide specific version numbers for these or any other software components such as programming languages or libraries.
Experiment Setup | Yes | "The embedding dimension is set to d = 128. For the second encoder, we train a fully-connected neural network with Tanh activation and skip connections for a sufficient number of steps with full-batch AdamW... For each width and depth, we run the experiments 15 times with different random initializations. We set the output dimension of the neural network to be d1 = 512. The downstream linear predictor is fit via ridge regression. Hyperparameter grid search is conducted at both encoder learning and downstream stages. We choose β = 1 and d0 = 512 in our experiments."
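The downstream stage quoted above fits a linear predictor on the learned representation via ridge regression, which has a closed form. A minimal sketch, assuming pretrained representations `phi_train` and targets `y_train`; the regularization strength `lam` here is a placeholder, since the paper selects such hyperparameters by grid search:

```python
import numpy as np


def fit_ridge(phi_train, y_train, lam=1.0):
    """Closed-form ridge regression for a downstream linear predictor
    on pretrained representations: solves
    (Phi^T Phi + lam * I) w = Phi^T y."""
    d = phi_train.shape[1]
    gram = phi_train.T @ phi_train + lam * np.eye(d)
    return np.linalg.solve(gram, phi_train.T @ y_train)
```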