Multi-class Gaussian Process Classification with Noisy Inputs
Authors: Carlos Villacampa-Calvo, Bryan Zaldívar, Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have evaluated the proposed methods by carrying out several experiments, involving synthetic and real data. These include several data sets from the UCI repository, the MNIST data set and a data set coming from astrophysics. The results obtained show that, although the classification error is similar across methods, the predictive distribution of the proposed methods is better, in terms of the test log-likelihood, than the predictive distribution of a classifier based on GPs that ignores input noise. |
| Researcher Affiliation | Academia | Carlos Villacampa-Calvo EMAIL Computer Science Department, Universidad Autónoma de Madrid, 28049, Madrid, Spain Bryan Zaldívar EMAIL Theoretical Physics Department, Universidad Autónoma de Madrid, 28049, Madrid, Spain Instituto de Física Teórica, 28049, Madrid, Spain Eduardo C. Garrido-Merchán EMAIL Computer Science Department, Universidad Autónoma de Madrid, 28049, Madrid, Spain Daniel Hernández-Lobato EMAIL Computer Science Department, Universidad Autónoma de Madrid, 28049, Madrid, Spain |
| Pseudocode | No | The paper describes the methods and procedures using prose and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code to reproduce all the experiments carried out is available online at https://github.com/cvillacampa/GPInputNoise. |
| Open Datasets | Yes | We have evaluated the proposed methods by carrying out several experiments, involving synthetic and real data. These include several data sets from the UCI repository, the MNIST data set and a data set coming from astrophysics. ... data sets extracted from the UCI repository (Dua and Graff, 2017). ... the well-known MNIST data set (LeCun et al., 1998). ... from astrophysics, dealing with measurements, by the Fermi-LAT instrument operated by NASA, of point-like sources of photons in the gamma ray energy range all over the sky (https://fermi.gsfc.nasa.gov/). ... Such catalogue of sources is fully public and can be downloaded from Collaboration (2019). |
| Dataset Splits | Yes | For each of these data sets, we consider 100 splits into train and test, containing 90% and 10% of the data respectively. ... The MNIST data set ... 60,000 training instances ... The test set has 10,000 data instances. ... We randomly split the Waveform data set into 100 points for the initial training set, 500 points for testing and 400 points for validation. ... For this, we have generated 100 splits of the data into training and test partitions with 90% and 10% of the instances, respectively. |
| Hardware Specification | Yes | All the computations are sped-up by using a TESLA P100 GPU for training. |
| Software Dependencies | No | All the methods described have been implemented in Tensorflow (Abadi et al., 2015). The paper mentions TensorFlow but does not provide a specific version number. |
| Experiment Setup | Yes | For the optimization of the ELBO we have used the ADAM optimizer with learning rate equal to 0.01, the number of epochs has been set to 750 and the mini-batch size to 50 (Kingma and Ba, 2015). ... In NIMGPNN the neural network has 50 hidden units and one hidden layer. The activation function is set to be ReLU. Finally, the number of Monte Carlo samples used to approximate the predictive distribution in NIMGP and NIMGPNN is set to 300. ... The mini-batch size is set to 200 and the number of training epochs is set to 350. ... In the case of NIMGPNN the neural network has 250 units and two hidden layers. ... All hyper-parameters, including the GP amplitude parameter, the length-scales and the level of additive Gaussian noise have been tuned by maximizing the ELBO. |
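For readers checking the optimizer settings quoted above, here is a minimal pure-Python sketch of the Adam update rule (Kingma and Ba, 2015) with the paper's learning rate of 0.01. The default moment decay rates `b1`, `b2` and the stabilizer `eps` are the standard values from the Adam paper, assumed rather than stated in this paper; the quadratic objective is an illustrative stand-in, not the paper's ELBO.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m, v are the running first and second moment estimates;
    t is the 1-based step count used for bias correction.
    """
    m = b1 * m + (1 - b1) * grad           # update biased first moment
    v = b2 * v + (1 - b2) * grad ** 2      # update biased second moment
    m_hat = m / (1 - b1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 1.0,
# running 750 steps to mirror the paper's 750 epochs.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 751):
    x, m, v = adam_step(x, 2 * x, m, v, t)
print(x)
```

With consistent gradients the effective step size is bounded by roughly `lr`, which is why the iterate walks from 1.0 toward 0 in increments of about 0.01 before settling; in the paper this same update is applied to the variational and hyper-parameters of the ELBO via TensorFlow rather than by hand.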