Joint Continuous and Discrete Model Selection via Submodularity
Authors: Jonathan Bunton, Paulo Tabuada
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we numerically validate our theoretical results with several proof-of-concept examples with synthetic and real-world data, comparing against state-of-the-art algorithms. |
| Researcher Affiliation | Academia | Jonathan Bunton EMAIL Paulo Tabuada EMAIL Department of Electrical & Computer Engineering University of California, Los Angeles 420 Westwood Plaza, Box 951594 Los Angeles, CA 90095-1554, USA |
| Pseudocode | No | The paper describes algorithms conceptually and refers to existing algorithms (e.g., 'projected subgradient descent', 'minimum-norm point algorithm') but does not present any pseudocode or structured algorithm blocks explicitly within its text. |
| Open Source Code | No | The paper discusses various algorithms used for comparison and their implementations, such as 'IBM's CPLEX 12.8 constrained quadratic program solver in MATLAB' and 'the minimum-norm point algorithm from Fujishige and Isotani (2011) as implemented in MATLAB by Krause (2010)'. However, it does not explicitly state that the authors' own code for the methodology described in the paper is available or provide a link to a repository. |
| Open Datasets | Yes | We create this scenario with real retail sales data collected from a UK-based online retail store available in the UCI Machine Learning Repository (Dua and Graff, 2017; Chen et al., 2012). |
| Dataset Splits | No | The paper mentions using 'synthetic and real-world data' and refers to 'random problem instances' and 'noisy measurements'. For the retail sales data, it states 'we can use historical data to build a local linear approximation'. However, it does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | The experiments were all run on a laptop with an AMD Ryzen 9 4900HS CPU and 16GB of RAM. |
| Software Dependencies | Yes | In our implementation, we use the Pairwise Frank-Wolfe algorithm to solve this convex problem, with all relevant results plotted in blue and labeled Cont Submodular. The projected subgradient method is known to provide approximation guarantees even in the non-submodular case (El Halabi and Jegelka, 2020), but as shown in Section 4.2, amounts to a specific choice of algorithms in our theory. The algorithm operates by solving an equivalent convex optimization problem, in particular, minimizing the Lovász extension of g + H over [0, 1]^n using projected subgradient descent. To implement this approach, we use IBM's CPLEX 12.8 constrained quadratic program solver in MATLAB to evaluate the function H (as expressed in (16)) and use Polyak's rule for updating the step size. The relevant results are plotted in red, and labeled PGD + CPLEX in figures. Our approach is agnostic to the choice of convex optimization and submodular set function minimization routines, so we also use CPLEX to evaluate H. To highlight the utility of an algorithm-agnostic approach, we also implement an active-set method for fast non-negative quadratic programming to evaluate H (Bro and De Jong, 1997). For the submodular set function minimization algorithm, we use the minimum-norm point algorithm from Fujishige and Isotani (2011) as implemented in MATLAB by Krause (2010), coupled with the semi-gradient lattice pruning strategy proposed by Iyer et al. (2013), which has quadratic complexity and drastically reduces the problem size. |
| Experiment Setup | Yes | In our examples, we consider the domain [0, 1]^n ⊂ R^n_{≥0} and set the discretization level to k = 51 unless otherwise specified. ... we set the regularization strength to λ = 0.05 so that both the functions f and g play nontrivial roles in the combined objective function. ... We then let µ = 0.8 in (56) and λ = 0.05 in (57) so that the overall problem's cost function has nontrivial contributions from both the smoothness-promoting function and the sparsity-inducing regularizer. ... The various methods are given identical cost functions to minimize, and are run until either convergence to suboptimality below 10^-4 or a maximum of 100 iterations. |
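The quoted baseline (minimizing the Lovász extension over [0, 1]^n with projected subgradient descent and Polyak's step-size rule) can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the function names, the starting point, and the toy cut function below are assumptions for demonstration, and the Polyak rule here assumes the optimal value `f_star` is known (0 for the toy example).

```python
import numpy as np

def lovasz_extension_grad(F, x):
    """Value and a subgradient of the Lovász extension of set function F at x in [0,1]^n,
    computed via the standard greedy ordering of coordinates."""
    n = len(x)
    order = np.argsort(-x)              # sort coordinates in decreasing order of x
    g = np.zeros(n)
    S = set()
    prev = F(frozenset(S))
    for i in order:
        S.add(int(i))
        cur = F(frozenset(S))
        g[i] = cur - prev               # marginal gain along the greedy chain
        prev = cur
    return float(np.dot(g, x)), g

def minimize_lovasz(F, n, iters=100, f_star=0.0):
    """Projected subgradient descent over [0,1]^n with Polyak's step-size rule
    (assumes the optimal value f_star is known; hypothetical helper, not from the paper)."""
    x = np.linspace(0.8, 0.2, n)        # arbitrary asymmetric starting point
    best_val, best_x = np.inf, x.copy()
    for _ in range(iters):
        val, g = lovasz_extension_grad(F, x)
        if val < best_val:
            best_val, best_x = val, x.copy()
        norm2 = float(np.dot(g, g))
        if norm2 == 0.0:
            break
        step = (val - f_star) / norm2   # Polyak's rule
        x = np.clip(x - step * g, 0.0, 1.0)   # projection onto the box [0,1]^n
    return best_x, best_val

# Toy example (illustrative only): the cut function of a 3-node path graph,
# a classic submodular set function whose minimum value is 0.
edges = [(0, 1), (1, 2)]
def cut(S):
    return sum((u in S) != (v in S) for u, v in edges)

best_x, best_val = minimize_lovasz(cut, 3)
```

Rounding `best_x` at any threshold in (0, 1) recovers a minimizing set; the paper's pipeline additionally couples such routines with CPLEX (to evaluate H) and minimum-norm-point submodular minimization, which this sketch omits.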