Relax and penalize: a new bilevel approach to mixed-binary hyperparameter optimization

Authors: Sara Venturini, Marianna De Santis, Jordan Patracone, Martin Schmidt, Francesco Rinaldi, Saverio Salzo

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the performance of our approach for two specific machine learning problems, i.e., the estimation of the group-sparsity structure in regression problems and the data distillation problem. The reported results show that our method is competitive with state-of-the-art approaches based on relaxation and rounding. ... Numerical experiments are reported in Section 5 and show how the relax and penalize method compares with state-of-the-art approaches based on relaxation and rounding.
Researcher Affiliation Academia Sara Venturini EMAIL MIT Senseable City Lab Massachusetts Institute of Technology; Marianna De Santis EMAIL Department of Information Engineering University of Florence; Jordan Patracone EMAIL Inria, Laboratoire Hubert Curien Université Jean Monnet Saint-Etienne; Martin Schmidt EMAIL Department of Mathematics Trier University; Francesco Rinaldi EMAIL Department of Mathematics University of Padova; Saverio Salzo EMAIL DIAG, Sapienza University of Rome and Italian Institute of Technology
Pseudocode Yes Algorithm 1: Penalty method ... Algorithm 2: Hypergradient computation (reverse mode)
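The paper's Algorithm 2 computes hypergradients in reverse mode, i.e., it differentiates the upper-level (validation) loss through the unrolled inner gradient steps. As a hedged illustration of that general technique only (not the paper's exact algorithm), here is a minimal sketch for a ridge-regularized lower level, where `lam` plays the role of the continuous hyperparameter; the function name and problem instance are our own, hypothetical choices:

```python
import numpy as np

def reverse_hypergradient(X, y, Xv, yv, lam, eta=0.01, T=200):
    """Reverse-mode hypergradient of a validation loss w.r.t. the ridge
    penalty lam, obtained by backpropagating through T inner steps.

    Lower level: g(w, lam) = 0.5*||X w - y||^2 + 0.5*lam*||w||^2
    Upper level: f(w)      = 0.5*||Xv w - yv||^2
    Returns the final inner iterate w_T and df(w_T)/dlam.
    """
    n, d = X.shape
    ws = [np.zeros(d)]                         # stored trajectory w_0, ..., w_T
    for _ in range(T):                         # forward pass: inner gradient descent
        w = ws[-1]
        grad = X.T @ (X @ w - y) + lam * w
        ws.append(w - eta * grad)
    wT = ws[-1]

    alpha = Xv.T @ (Xv @ wT - yv)              # alpha_T = grad of f at w_T
    dlam = 0.0
    H = X.T @ X                                # data part of the inner Hessian
    for t in range(T - 1, -1, -1):             # backward sweep over the trajectory
        dlam += alpha @ (-eta * ws[t])         # direct dep.: dw_{t+1}/dlam = -eta*w_t
        alpha = alpha - eta * (H @ alpha + lam * alpha)  # transposed Jacobian step
    return wT, dlam
```

Because the inner map here is linear in w, the transposed-Jacobian step is exact; the resulting hypergradient can be checked against a finite-difference estimate of the validation loss.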
Open Source Code Yes The code is available on the GitHub page: https://github.com/saraventurini/Relax-and-penalize
Open Datasets Yes First, music (Bertin-Mahieux, 2011), a dataset with song features from 1922 to 2011 used to predict the release year based on 90 attributes, including timbre averages and covariances. Second, blog (Buza, 2014), a dataset containing features from blog posts, focused on predicting the number of comments received in the next 24 hours using various attributes.
Dataset Splits Yes First, music (Bertin-Mahieux, 2011)... It consists of 463 715 training samples, with the first 231 857 used for training the lower level and the remaining 231 857 reserved for testing the weights afterward. Additionally, 51 630 validation samples were utilized for the upper level. ... Second, blog (Buza, 2014)... It comprises 52 397 training and 7624 validation samples, with the first 1089 used for training the lower level and the remaining 6535 set aside for testing the weights afterward.
Hardware Specification No No specific hardware details (e.g., GPU/CPU models, memory) are mentioned in the paper.
Software Dependencies No The paper mentions using SAGA (Defazio et al., 2014) as a method, but does not provide specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes In Section C.2... We select ε0 = 10^5, ... η = 10^−3, q = 500 inner iterations, and 0.99 η/λ as the inner step size. ... The step size is set to T/0.025 for θ and it is multiplied by the preconditioner c = 10^−4 for λ. The hyperparameters θ and λ are projected onto the unit simplex (Δ^{L−1})^P and the box [10^−3, 1], respectively, and they are initialized to λ0 = 10^−1 and θ0 = P_Θ(L^−1 1_{PL} + N(0_{PL}, 0.1 L^−1 I_{PL})). ... In Section D.3... We initialize ε0 = 10^9 for both datasets... we set the regularization parameter to s = 10^2. For the upper-level problem, we use a batch of size 600 for computing-time reasons, we perform 100 inner iterations for each problem (Pk), and we set the step size to 10^−3 for music and to 10^−5 for blog.
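The setup above projects θ onto a unit simplex and λ onto a box. As a minimal sketch of these two standard projections (the sort-based Euclidean simplex projection and a clip for the box; function names are ours and not taken from the paper's code):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the unit simplex
    {x : x >= 0, sum(x) = 1}, via the classical sort-and-threshold rule."""
    u = np.sort(v)[::-1]                       # sort entries in decreasing order
    cssv = np.cumsum(u) - 1.0                  # shifted cumulative sums
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - cssv / idx > 0)[0][-1]  # last index with positive gap
    theta = cssv[rho] / (rho + 1.0)            # threshold to subtract
    return np.maximum(v - theta, 0.0)

def project_box(v, lo, hi):
    """Projection onto the box [lo, hi] is a componentwise clip."""
    return np.clip(v, lo, hi)
```

For example, `project_box(lam, 1e-3, 1.0)` keeps λ in the box [10^−3, 1] quoted above, and `project_simplex` returns a nonnegative vector summing to one.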