Towards a learning theory of representation alignment

Authors: Francesco Maria Gabriele Insulla, Shuo Huang, Lorenzo Rosasco

Venue: ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we propose a learning-theoretic perspective to representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is to relate the properties of stitching to the kernel alignment of the underlying representation. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem. (b) We provide a generalization error bound of linear stitching with the kernel alignment of the underlying representation.
Researcher Affiliation | Academia | Francesco Insulla (Institute of Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA; EMAIL). Shuo Huang (Istituto Italiano di Tecnologia, Genoa, GE 16163, Italy; EMAIL). Lorenzo Rosasco (MaLGa Center, DIBRIS, Università di Genova, Genoa, GE 16146, Italy; CBMM, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Istituto Italiano di Tecnologia, Genoa, GE 16163, Italy; EMAIL).
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It primarily consists of mathematical definitions, theorems, and proofs.
Open Source Code | No | The paper does not contain any explicit statements about the release of source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | No | The paper is theoretical in nature and does not describe or utilize specific datasets with access information for its own contributions. It mentions "diverse datasets" in the context of large AI models, but not for its own experimental validation.
Dataset Splits | No | The paper is theoretical and does not describe experiments that would require dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe experimental implementations or the hardware used to perform them.
Software Dependencies | No | The paper is theoretical and does not provide details about specific software dependencies or their version numbers.
Experiment Setup | No | The paper is theoretical and focuses on mathematical concepts and proofs, thus it does not include details on experimental setup, hyperparameters, or training configurations.
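
The Research Type entry refers to the kernel alignment of two representations. As a rough illustration only, the sketch below computes the standard empirical kernel alignment (the normalized Frobenius inner product of two Gram matrices) with linear kernels. This is an assumption about what a finite-sample version might look like, not the paper's own definition or code (the paper releases none). The helper names `gram`, `frobenius_inner`, and `kernel_alignment`, and the toy feature vectors, are hypothetical.

```python
import math


def gram(features):
    """Linear-kernel Gram matrix: K[i][j] = <f_i, f_j> for a list of feature vectors."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]


def frobenius_inner(k1, k2):
    """Frobenius inner product: the sum of entrywise products of two matrices."""
    return sum(a * b for row1, row2 in zip(k1, k2) for a, b in zip(row1, row2))


def kernel_alignment(feats_a, feats_b):
    """Empirical kernel alignment between two representations of the same inputs.

    Assumes linear kernels and the uncentered normalized Frobenius inner product;
    the paper's definition may differ.
    """
    k_a, k_b = gram(feats_a), gram(feats_b)
    norm = math.sqrt(frobenius_inner(k_a, k_a) * frobenius_inner(k_b, k_b))
    return frobenius_inner(k_a, k_b) / norm


# Toy representations of the same three inputs (hypothetical data).
rep_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# rep_a rotated by 90 degrees: pairwise inner products are unchanged.
rep_rotated = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
# A one-dimensional representation with a different similarity structure.
rep_other = [[1.0], [1.0], [-1.0]]

print(kernel_alignment(rep_a, rep_rotated))
print(kernel_alignment(rep_a, rep_other))
```

Under these assumptions the score depends only on the pairwise inner products of the features, so a rotation of one representation leaves it unchanged, while representations that induce different similarity structures score lower.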