Permuted and Augmented Stick-Breaking Bayesian Multinomial Regression
Authors: Quan Zhang, Mingyuan Zhou
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental results in Section 6 and conclude the paper in Section 7. Example results demonstrate their attractive properties and performance. Table 1: Comparison of the classification error rates (%) of paSB-MLR, paSB-robit, paSB-MSVM, MSRs with various K and T (columns 5 to 8), MSR with data transformation (DT-MSR), L2-MLR, SVM, and AMM. |
| Researcher Affiliation | Academia | Quan Zhang EMAIL Mingyuan Zhou EMAIL Department of Information, Risk, and Operations Management, McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA |
| Pseudocode | No | Does not contain structured pseudocode or algorithm blocks. Methods are described using mathematical formulations and descriptive text without explicit pseudocode or algorithm labels. |
| Open Source Code | No | Does not provide concrete access to source code for the methodology described in this paper. The paper mentions using third-party software packages (LIBSVM, LIBLINEAR, mcmcse) but does not state that its own implementation code is released. |
| Open Datasets | Yes | To further evaluate the performance of the proposed paSB multinomial regression models, we consider paSB multinomial logistic regression (paSB-MLR), paSB multinomial robit with κ = 6 degrees of freedom (paSB-robit), paSB multinomial support vector machine (paSB-MSVM), and MSRs. We compare their performance with those of L2-regularized multinomial logistic regression (L2-MLR), support vector machine (SVM), and adaptive multi-hyperplane machine (AMM), and consider the following benchmark multi-class classification data sets: iris, wine, glass, vehicle, waveform, segment, dna, and satimage. We also include the synthetic square data shown in Figure 2 for comparison. |
| Dataset Splits | Yes | The training and testing sets are predefined for vehicle, dna, and satimage. We divide the other data sets into training and testing as follows. For iris, wine, and glass, five random partitions are taken such that for each partition the training set accounts for 80% of the whole data set and the testing set for the remaining 20%. The classification error rate is calculated by averaging the error rates of all five random partitions. For square, waveform, and segment, only one random partition is taken, where 70% of the square data set is used as training and the remaining 30% as testing, and 10% of both the waveform and segment data sets are used as training and the remaining 90% as testing. |
| Hardware Specification | No | Does not provide specific hardware details. The paper mentions 'Texas Advanced Computing Center for computational support' but does not specify CPU/GPU models, memory, or other detailed hardware specifications for the experiments. |
| Software Dependencies | No | Does not provide specific ancillary software details with version numbers. The paper mentions using LIBSVM, R with package e1071, LIBLINEAR, and mcmcse package, but does not specify their version numbers. |
| Experiment Setup | Yes | For paSB-robit, we run 8,000 iterations and discard the first 5,000 as burn-in (this setting is unchanged for experiments in Section 6.6). For paSB-MSVM, we use the spike-and-slab prior to select the kernel bases and set 0.5 as the probability of the spike at 0... A Gaussian radial basis function (RBF) kernel is used and the kernel width is selected by 3-fold cross validation from (2^-10, 2^-9, . . . , 2^10). We run 1,000 MCMC iterations and discard the first 500 as burn-in samples. For MSR, we try both paSB and parSB with (K, T) set as (1, 1), (1, 3), (5, 1), and (5, 3). We run 10,000 MCMC iterations and discard the first 5,000 as burn-in samples. The predictive probability is calculated by averaging the Monte Carlo average predictive probabilities from paSB and parSB MSRs. An observation in the testing set is classified to the category associated with the largest predictive probability. For L2-MLR provided in the LIBLINEAR package... the regularization parameter is five-fold cross-validated on the training set from (2^-10, 2^-9, . . . , 2^15). For SVM... three-fold cross validation is adopted to tune both the regularization parameter and kernel width from (2^-10, 2^-9, . . . , 2^10). |
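The split protocol quoted above (five random 80/20 partitions with the error rate averaged across partitions) can be sketched as follows. This is a minimal sketch, not the authors' code: `five_partition_error` and the `fit_predict` callback interface are hypothetical names introduced here for illustration.

```python
import numpy as np

def five_partition_error(X, y, fit_predict, train_frac=0.8, seed=0):
    """Average test error (%) over five random train/test partitions,
    as described for the iris, wine, and glass data sets.

    `fit_predict(X_tr, y_tr, X_te)` is a hypothetical classifier
    interface: any model that returns predicted test labels will do."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for _ in range(5):
        perm = rng.permutation(n)          # one random partition
        n_tr = int(train_frac * n)
        tr, te = perm[:n_tr], perm[n_tr:]
        y_hat = fit_predict(X[tr], y[tr], X[te])
        errors.append(np.mean(y_hat != y[te]))
    return 100.0 * np.mean(errors)         # error rate in percent
```

For square, waveform, and segment the paper instead takes a single random partition with the stated train fractions (70% and 10%), which corresponds to one pass of the loop above with a different `train_frac`.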
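The prediction rule quoted above (average the Monte Carlo predictive probabilities over post-burn-in samples, then classify to the category with the largest average probability) can be sketched with a plain stick-breaking link. Note this is an assumption for illustration: the paper's permuted and augmented construction is more elaborate, and `sb_class_probs` / `predict_category` are hypothetical helper names.

```python
import numpy as np

def sb_class_probs(psi):
    """Map K-1 stick-breaking logits to K category probabilities via
    the plain stick-breaking link assumed here:
    p_k = sigmoid(psi_k) * prod_{j<k} (1 - sigmoid(psi_j)),
    with the last category taking the remaining stick."""
    v = 1.0 / (1.0 + np.exp(-np.asarray(psi, dtype=float)))  # sigmoids
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))  # stick left before each break
    p_head = remaining[:-1] * v
    return np.concatenate((p_head, [remaining[-1]]))

def predict_category(psi_samples):
    """Average predictive probabilities over MCMC samples of the logits,
    then classify to the category with the largest average probability."""
    probs = np.mean([sb_class_probs(s) for s in psi_samples], axis=0)
    return int(np.argmax(probs))
```

The probabilities returned by `sb_class_probs` sum to one by construction, so averaging them across posterior samples yields a valid Monte Carlo predictive distribution before the argmax step.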