Towards a learning theory of representation alignment
Authors: Francesco Maria Gabriele Insulla, Shuo Huang, Lorenzo Rosasco
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we propose a learning-theoretic perspective to representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is to relate the properties of stitching to the kernel alignment of the underlying representation. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem. [...] (b) We provide a generalization error bound of linear stitching with the kernel alignment of the underlying representation. |
| Researcher Affiliation | Academia | Francesco Insulla: Institute of Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA (EMAIL); Shuo Huang: Istituto Italiano di Tecnologia, Genoa, GE 16163, Italy (EMAIL); Lorenzo Rosasco: MaLGa Center, DIBRIS, Università di Genova, Genoa, GE 16146, Italy; CBMM, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Istituto Italiano di Tecnologia, Genoa, GE 16163, Italy (EMAIL) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It primarily consists of mathematical definitions, theorems, and proofs. |
| Open Source Code | No | The paper does not contain any explicit statements about the release of source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper is theoretical in nature and does not describe or utilize specific datasets with access information for its own contributions. It mentions "diverse datasets" in the context of large AI models, but not for its own experimental validation. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments that would require dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe experimental implementations or the hardware used to perform them. |
| Software Dependencies | No | The paper is theoretical and does not provide details about specific software dependencies or their version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical concepts and proofs, thus it does not include details on experimental setup, hyperparameters, or training configurations. |