reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Structured Low-Rank Tensors for Generalized Linear Models

Authors: Batoul Ahmad Taki, Anand Sarwate, Waheed U. Bajwa

TMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, numerical experiments on synthetic datasets demonstrate the efficacy of the proposed LSR tensor model for three regression types (linear, logistic and Poisson). Experiments on a collection of medical imaging datasets demonstrate the usefulness of the LSR model over other tensor models (Tucker and CP) on real, imbalanced data with limited available samples.
Researcher Affiliation	Academia	Batoul Taki, Department of Electrical and Computer Engineering, Rutgers University-New Brunswick; Anand D. Sarwate, Rutgers University-New Brunswick; Waheed U. Bajwa, Rutgers University-New Brunswick
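Pseudocode	Yes	Algorithm 1 LSRTR: A block coordinate descent algorithm for LSR-TGLMs. 1: Input: $n$ training samples $\{X_i, y_i\}_{i=1}^{n}$, step size $\alpha$, separation rank $S$, tensor rank $(r_1, r_2, \dots, r_K)$. 2: Initialise: factor matrices $B^{0}_{(k,s)}$ for $k \in [K]$, $s \in [S]$, core tensor $G^{0}$ and $t \leftarrow 0$. 3: repeat 4: for $s' \in [S]$ do 5: for $k' \in [K]$ do 6: $\widetilde{B}^{(t)}_{(k',s')} \leftarrow \mathrm{vec}\big(B^{(t)}_{(k',s')}\big) + \alpha \sum_{i=1}^{n} \big( T(y_i) - g^{-1}\big(\langle \mathrm{vec}(B_{(k',s')}), \widetilde{x}_i \rangle\big) \big)\, \widetilde{x}_i$ 7: $B^{(t+1)}_{(k',s')} \leftarrow H\big(\widetilde{B}^{(t)}_{(k',s')}\big)$ 8: end for 9: end for 10: $\mathrm{vec}\big(G^{(t+1)}\big) \leftarrow \mathrm{vec}\big(G^{(t)}\big) + \alpha \sum_{i=1}^{n} \big( T(y_i) - g^{-1}\big(\langle \mathbf{g}, \widetilde{x}_i \rangle\big) \big)\, \widetilde{x}_i$ 11: $t \leftarrow t + 1$ 12: until convergence 13: return $\widehat{B} \leftarrow \sum_{s \in [S]} G^{(t)} \times_1 B^{(t)}_{(1,s)} \times_2 \cdots \times_K B^{(t)}_{(K,s)}$. Algorithm 2 Posterior prediction for LSR-TGLMs. Input: Estimate $\widehat{B} \in \mathbb{R}^{m_1 \times \cdots \times m_K}$ and $n_{te}$ test data points $\{X_i\}_{i=1}^{n_{te}}$. Output: Expectation $\widehat{\mu} = \big[\mathbb{E}[y_1 \mid X_1], \mathbb{E}[y_2 \mid X_2], \dots, \mathbb{E}[y_{n_{te}} \mid X_{n_{te}}]\big]$. 1: Define: $X = [X_1, X_2, \dots, X_{n_{te}}]$ 2: Compute $\widehat{\mu}$ for input $X$ as: $\widehat{\mu} = g^{-1}\big(\langle \widehat{B}, X \rangle\big)$ 3: return $\widehat{\mu}$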
Open Source Code	No	We also note that though BCD algorithms such as LSRTR are popular amongst tensor-structured regression works and have been shown to be effective in practice (Zhou et al., 2013; Tan et al., 2013; Li et al., 2018; Zhang & Jiang, 2016), BCD algorithms in prior tensor-structured GLM works do not explicitly exploit the coefficient tensor's Kronecker structure that appears upon vectorization. Keeping a common core tensor and maintaining the LSR matrix structure in (14) allows us to explore other parameter estimation algorithms that exploit the Kronecker matrix structure, similar to those in existing dictionary learning works (Ghassemi et al., 2020). However, we leave this for future work.
Open Datasets	Yes	We move on to investigating our approach on medical imaging data. ... Here we study three datasets; ABIDE Autism (Craddock et al., 2013; Lodhi & Bajwa, 2020), ADHD200 (Bellec et al., 2017) and Vessel MNIST 3D (Yang et al., 2021b).
Dataset Splits	Yes	As for the autistic-control split, we choose 40 autistic and 40 control samples uniformly at random for training, and 14 test subjects the same way. ... The training dataset consists of 762 samples, 280 of which are labelled as ADHD (hyperactive/impulsive, inattentive and combined) and 482 of which are labelled as typical (control). The testing dataset consists of 197 samples, where 76 and 93 are labelled as ADHD and control, respectively. ... The training dataset consists of 1335 samples, 150 of which are labelled as aneurysm and 1185 of which are labelled as healthy. The testing dataset consists of 382 samples, where 43 and 339 are labelled as aneurysm and healthy, respectively.
Hardware Specification	No	The paper does not explicitly mention specific hardware specifications like GPU models, CPU models, or cloud computing instances used for running the experiments. It only describes the algorithms and datasets used.
Software Dependencies	No	The paper does not explicitly mention specific software dependencies with version numbers (e.g., Python version, specific library versions like PyTorch, TensorFlow, or scikit-learn).
Experiment Setup	Yes	Parameters including the Tucker rank $(r_k)_{k \in [K]}$, the separation rank S and the step size α are set for each dataset using separate validation experiments. For linear regression we consider various model sizes and ranks, i.e., $m \in \{64, 128, 256\}$, $r \in \{4, 8\}$. For logistic and Poisson regression we consider $m \in \{32, 64, 128\}$ and $r \in \{4, 8\}$. The chosen rank $(r_k)_{k \in [K]}$ for ABIDE Autism, ADHD200 and Vessel MNIST 3D are (6, 6), (6, 6, 6) and (3, 3, 3), respectively. The chosen LSR rank for LSRTR for ABIDE Autism, ADHD200 and Vessel MNIST 3D are S = 2, S = 3 and S = 2, respectively.
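
Illustrative sketch (not from the paper, which releases no code): the final step of Algorithm 1 assembles the coefficient tensor as a sum over S Tucker-style terms that share one core tensor, and Algorithm 2 then applies the inverse link to the inner product of that tensor with each test sample. The NumPy code below shows one way this could look. The function names `mode_product`, `lsr_tensor` and `predict_mean`, the random inputs, and the logistic inverse link are all assumptions made for illustration; the shapes reuse the ABIDE Autism setting quoted above (rank (6, 6), S = 2) with a hypothetical 20 x 20 input size.

```python
import numpy as np


def mode_product(tensor, matrix, mode):
    """Mode-`mode` product: apply `matrix` (m x r) along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, 0)            # target axis to the front
    out = np.tensordot(matrix, moved, axes=(1, 0))  # contract r with that axis
    return np.moveaxis(out, 0, mode)                # restore the axis order


def lsr_tensor(core, factors):
    """Sum over s of core x_1 B_(1,s) x_2 ... x_K B_(K,s).

    core: array of shape (r_1, ..., r_K), shared by all S terms.
    factors: list of length S; factors[s][k] has shape (m_k, r_k).
    """
    total = None
    for factors_s in factors:
        term = core
        for k, mat in enumerate(factors_s):
            term = mode_product(term, mat, k)
        total = term if total is None else total + term
    return total


def predict_mean(b_hat, x_stack, inverse_link):
    """mu_i = g^{-1}(<b_hat, X_i>) for x_stack of shape (n_te, m_1, ..., m_K)."""
    inner = np.tensordot(x_stack, b_hat, axes=b_hat.ndim)  # one scalar per sample
    return inverse_link(inner)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ranks, dims, sep_rank = (6, 6), (20, 20), 2     # dims is a made-up size
    core = rng.standard_normal(ranks)
    factors = [
        [rng.standard_normal((m, r)) for m, r in zip(dims, ranks)]
        for _ in range(sep_rank)
    ]
    b_hat = lsr_tensor(core, factors)
    x_test = rng.standard_normal((5,) + dims)
    # Logistic inverse link, assumed here as the binary-classification case.
    mu = predict_mean(b_hat, x_test, lambda z: 1.0 / (1.0 + np.exp(-z)))
    print(b_hat.shape, mu.shape)
```

The sketch omits the factor updates and the projection step H of Algorithm 1, so it only illustrates how an estimate is assembled and used for prediction, not how it is fitted.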