InfoNCE is variational inference in a recognition parameterised model
Authors: Laurence Aitchison, Stoil Krasimirov Ganev
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | (Section 5, Experimental results) Our primary results are theoretical: in connecting the ELBO/log marginal likelihood and mutual information, and in showing that the InfoNCE objective with a restricted choice of fθ makes more sense as a bound on the log marginal likelihood than on the MI. At the same time, our approach encourages a different way of thinking about how to set up contrastive SSL methods, in terms of Bayesian priors. As an example, we considered a task in which the goal was to extract the locations of three moving balls, based on videos of these balls bouncing around in a square (Fig. 1A; Appendix D). |
| Researcher Affiliation | Academia | Laurence Aitchison (EMAIL), University of Bristol; Stoil Ganev (EMAIL), University of Bristol |
| Pseudocode | No | The paper describes methods in text and mathematical formulations but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | No | The dataset is synthetically generated by the authors rather than drawn from a public benchmark, and no download link is provided: "We generated 900 images in a single continuous video with a resolution of 256 × 256 pixels. The three balls had a diameter of 32 pixels." |
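Since the paper does not release the generator, reproducing this dataset requires writing one from the stated specification (900 frames, 256 × 256 pixels, three balls of diameter 32 bouncing in a square). A minimal NumPy sketch is below; the initial speeds, reflection rule, and rendering are assumptions, not details from the paper:

```python
import numpy as np

def make_bouncing_balls(n_frames=900, size=256, n_balls=3, diameter=32, seed=0):
    """Hypothetical reconstruction of the paper's synthetic bouncing-ball
    video: n_frames binary frames of n_balls discs reflecting off the walls
    of a size x size square. Speeds and dynamics are illustrative guesses."""
    rng = np.random.default_rng(seed)
    r = diameter / 2
    pos = rng.uniform(r, size - r, size=(n_balls, 2))   # ball centres (x, y)
    vel = rng.uniform(-4, 4, size=(n_balls, 2))         # pixels/frame (assumed)
    yy, xx = np.mgrid[0:size, 0:size]
    frames = np.zeros((n_frames, size, size), dtype=np.uint8)
    for t in range(n_frames):
        pos += vel
        for d in range(2):  # reflect any ball that crossed a wall
            hit = (pos[:, d] < r) | (pos[:, d] > size - r)
            vel[hit, d] *= -1
            pos[:, d] = np.clip(pos[:, d], r, size - r)
        for x, y in pos:    # rasterise each disc into the frame
            frames[t][(xx - x) ** 2 + (yy - y) ** 2 <= r ** 2] = 255
    return frames

video = make_bouncing_balls()
```

This yields a `(900, 256, 256)` uint8 array, one continuous video as described in the quoted text.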
| Dataset Splits | No | The paper describes how training batches were constructed using random pairs of consecutive frames and random negative examples, but it does not specify explicit training, validation, or test dataset splits in terms of percentages or sample counts for the overall dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper describes the use of neural networks but does not provide specific ancillary software details (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The training itself was performed by using stochastic gradient descent with a learning rate of 0.005 over the course of 30 epochs. The batches were made of 30 random pairs of consecutive frames. |
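The setup row specifies batches of 30 consecutive-frame pairs trained with the InfoNCE objective; the other pairs in a batch serve as negatives. A minimal NumPy sketch of that loss is below — the embedding dimension, temperature, and encoder are illustrative assumptions, and the paper's optimiser settings (SGD, learning rate 0.005, 30 epochs) are noted only in comments:

```python
import numpy as np

def info_nce_loss(z, z_pos, temperature=1.0):
    """InfoNCE for one batch: row i of `z` embeds a frame, row i of `z_pos`
    embeds its consecutive frame (the positive); all other rows of `z_pos`
    act as negatives. Temperature is an illustrative choice."""
    scores = z @ z_pos.T / temperature              # (B, B) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.diag(log_softmax).mean()             # -log p(positive | anchor)

# Hypothetical batch of 30 consecutive-frame pairs, 8-dim embeddings;
# training would take SGD steps on this loss (lr 0.005, 30 epochs per the paper).
rng = np.random.default_rng(0)
z = rng.normal(size=(30, 8))
z_pos = z + 0.1 * rng.normal(size=(30, 8))  # positives: nearby embeddings
loss = info_nce_loss(z, z_pos)
```

Because each positive here is close to its anchor, the loss comes out well below the chance level of log 30.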