reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Efficient Contrastive PAC Learning

Authors: Jie Shen

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We first show that the problem of contrastive PAC learning of linear representations is intractable to solve in general. We then show that it can be relaxed to a semi-definite program when the distance between contrastive samples is measured by the ℓ2-norm. We then establish generalization guarantees based on Rademacher complexity, and connect it to PAC guarantees under certain contrastive large-margin conditions. To the best of our knowledge, this is the first efficient PAC learning algorithm for contrastive learning.
Researcher Affiliation	Academia	Jie Shen, Department of Computer Science, Stevens Institute of Technology
Pseudocode	No	The paper describes algorithms and methods using mathematical formulations and textual descriptions, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository or mention code in supplementary materials.
Open Datasets	No	The paper is theoretical and does not conduct experiments on specific datasets. It refers to a generic 'data distribution D' and 'contrastive samples' but does not specify any publicly available datasets used for empirical evaluation.
Dataset Splits	No	The paper is theoretical and does not conduct experiments with specific datasets; it therefore does not provide details on dataset splits (training/validation/test).
Hardware Specification	No	The paper is theoretical and does not report any experimental results that would require specific hardware for computation. Therefore, no hardware specifications are mentioned.
Software Dependencies	No	The paper is theoretical and does not detail any experimental implementation, so no software dependencies with version numbers are provided.
Experiment Setup	No	The paper is theoretical and does not conduct practical experiments. Therefore, it does not provide details on experimental setup such as hyperparameters or training configurations.