Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery

Authors: Han Liu, Lie Wang, Tuo Zhao

JMLR 2015

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts.
Researcher Affiliation Academia Han Liu EMAIL Department of Operations Research and Financial Engineering, Princeton University, NJ 08544, USA Lie Wang EMAIL Department of Mathematics, Massachusetts Institute of Technology, Cambridge MA 02139, USA Tuo Zhao EMAIL Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
Pseudocode No 4.2 Smoothed Proximal Gradient Algorithm. We then present a brief derivation of the smoothed proximal gradient algorithm for solving (25). We first define three sequences of auxiliary variables {A(t)}, {V(t)}, and {H(t)} with A(0) = H(0) = V(0) = B(0), a sequence of weights {θ_t = 2/(t + 1)}, and a nonincreasing sequence of step sizes {η_t}. At the t-th iteration, we take V(t) = (1 − θ_t) B(t−1) + θ_t A(t−1). Let H̃(t) = V(t) − η_t G_µ(V(t)). When R(H) = ||H||_*, we take H(t) = Σ_j max{ψ_j(H̃(t)) − η_t λ, 0} u_j v_j^T, where u_j and v_j are the left and right singular vectors of H̃(t) corresponding to the j-th largest singular value ψ_j(H̃(t)). When R(H) = ||H||_{1,2}, we take H(t)_j = H̃(t)_j · max{1 − η_t λ / ||H̃(t)_j||_2, 0}. See Liu et al. (2009a) and Liu and Ye (2010) for more details on other choices of p in the L_{1,p} norm. To ensure that the objective function value is nonincreasing, we choose B(t) = argmin_{B ∈ {H(t), B(t−1)}} ||Y − XB||_µ + λ R(B). For simplicity, we can set {η_t} as a constant sequence, e.g., η_t = µ/γ for t = 1, 2, .... In practice, we can use backtracking line search to adjust η_t and boost the performance. At last, we take A(t) = B(t−1) + (1/θ_t)(H(t) − B(t−1)). Given a stopping precision ε, the algorithm stops when max{||B(t) − B(t−1)||_F, ||H(t) − H(t−1)||_F} ≤ ε.
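The iteration described in that row can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation: the caller supplies `grad`, a stand-in for the gradient G_µ of the smoothed loss (the demo below substitutes a plain least-squares gradient, which is an assumption), and only the L_{1,2} (row-wise group soft-thresholding) proximal step is shown.

```python
import numpy as np

def row_soft_threshold(H, tau):
    """Proximal step for the L1,2 regularizer: scale row j of H
    by max(1 - tau / ||H_j||_2, 0), zeroing out weak rows."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return H * scale

def smoothed_prox_grad(X, Y, lam, grad, eta, n_iter=200, eps=1e-6):
    """Skeleton of the accelerated scheme: V, H-tilde, monotone step, A update.
    `grad(B)` stands in for G_mu; the objective below uses the unsmoothed
    L2,1 loss for the monotone check (a simplification of ||Y - XB||_mu)."""
    d, m = X.shape[1], Y.shape[1]
    B = A = H = np.zeros((d, m))
    obj = lambda B: (np.sum(np.linalg.norm(Y - X @ B, axis=0))
                     + lam * np.sum(np.linalg.norm(B, axis=1)))
    for t in range(1, n_iter + 1):
        theta = 2.0 / (t + 1)                       # theta_t = 2 / (t + 1)
        V = (1 - theta) * B + theta * A             # V(t)
        H_new = row_soft_threshold(V - eta * grad(V), eta * lam)  # H(t)
        B_new = H_new if obj(H_new) <= obj(B) else B  # monotone choice of B(t)
        A = B + (H_new - B) / theta                 # A(t), uses B(t-1)
        stop = max(np.linalg.norm(B_new - B), np.linalg.norm(H_new - H)) <= eps
        B, H = B_new, H_new
        if stop:
            break
    return B

# Tiny demo on synthetic row-sparse data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 10))
B0 = np.zeros((10, 3)); B0[:3] = 1.0
Y = X @ B0 + 0.1 * rng.standard_normal((20, 3))
B_hat = smoothed_prox_grad(
    X, Y, lam=1.0,
    grad=lambda B: X.T @ (X @ B - Y),         # least-squares stand-in for G_mu
    eta=1.0 / np.linalg.norm(X, 2) ** 2)      # constant step, like eta_t = mu/gamma
```

Because of the monotone step, the objective at the returned `B_hat` can never exceed the objective at the zero initialization, regardless of which surrogate gradient is supplied.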
Open Source Code Yes The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Open Datasets Yes The data are obtained from Mitchell et al. (2008) and contain an fMRI image data set and a text data set.
Dataset Splits Yes We generate training data sets of 400 samples for the low rank setting and 200 samples for joint sparsity setting. In addition, we generate validation sets (400 samples for the low rank setting and 200 samples for the joint sparsity setting) for the regularization parameter selection, and testing sets (10,000 samples for both settings) to evaluate the prediction accuracy.
Hardware Specification Yes All simulations are implemented by MATLAB using a PC with Intel Core i5 3.3GHz CPU and 16GB memory.
Software Dependencies No All simulations are implemented by MATLAB using a PC with Intel Core i5 3.3GHz CPU and 16GB memory. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Experiment Setup Yes In numerical experiments, we set σ_max = 1, 2, and 4 to illustrate the tuning insensitivity of CMR. The regularization parameter λ of both CMR and OMR is chosen over a grid Λ = {2^{40/4} λ_0, 2^{39/4} λ_0, ..., 2^{−17/4} λ_0, 2^{−18/4} λ_0}, where λ_0 = ||X||_2 (√d + √m) and λ_0 = √(·) for the low rank and joint sparsity settings, respectively. ... OMR is solved by the monotone fast proximal gradient algorithm, where we set the stopping precision ε = 10^{−4}. CMR is solved by the proposed smoothed proximal gradient algorithm, where we set the stopping precision ε = 10^{−4} and the smoothing parameter µ = 10^{−4}.
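The regularization grid in that row is a geometric sequence with ratio 2^{1/4}, running from 2^{40/4} λ_0 down to 2^{−18/4} λ_0. A minimal sketch, assuming λ_0 = 1 as a stand-in for the setting-specific base value:

```python
# Grid Lambda = {2^(40/4) * lam0, 2^(39/4) * lam0, ..., 2^(-18/4) * lam0}:
# 59 candidates, largest first, each 2^(1/4) times the next.
lam0 = 1.0  # placeholder; the paper scales lam0 per setting
grid = [2 ** (k / 4) * lam0 for k in range(40, -19, -1)]
```

A validation set is then used to pick one λ from this grid, so the grid deliberately spans many orders of magnitude (from 2^10 λ_0 down to 2^{-4.5} λ_0).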