Online Bayesian Passive-Aggressive Learning
Authors: Tianlin Shi, Jun Zhu
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 20newsgroups and a large Wikipedia multi-label dataset (with 1.1 million training documents and 0.9 million unique terms in the vocabulary) show that our approaches significantly improve time efficiency while achieving comparable accuracy with the corresponding batch algorithms. We demonstrate the efficiency and prediction accuracy of online MedLDA, online Gibbs MedLDA and their extensions on the 20Newsgroups (20NG) and a large Wikipedia dataset. |
| Researcher Affiliation | Academia | Tianlin Shi (EMAIL), Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China; Jun Zhu (EMAIL), State Key Lab of Intelligent Technology and Systems, Tsinghua National Lab for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China |
| Pseudocode | Yes | Algorithm 1 Online MedLDA — 1: Let q0(w) = N(0, σ²I), q0(φk) = Dir(γ), ∀k. 2: for t = 0, 1, ... do 3: Set q(Φ, w) = qt(Φ)qt(w). ... (Algorithm continues) Algorithm 2 Online Gibbs MedLDA — 1: Let q0(w) = N(0, σ²I), q0(φk) = Dir(γ), ∀k. 2: for t = 0, 1, ... do 3: Set q(Φ, w) = qt(Φ)qt(w). ... (Algorithm continues) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing code, nor does it include a link to a code repository for the methodology described. |
| Open Datasets | Yes | Experimental results on 20newsgroups and a large Wikipedia multi-label dataset (with 1.1 million training documents and 0.9 million unique terms in the vocabulary) show that our approaches significantly improve time efficiency while achieving comparable accuracy with the corresponding batch algorithms. We test paMedLDA^mt_ave, paMedLDA^mt_gibbs and their nonparametric extensions on a large Wiki dataset. The Wiki dataset is built from the Wikipedia set used in the PASCAL LSHC challenge 2012. |
| Dataset Splits | Yes | The training set contains 11,269 documents, with the smallest category having 376 documents and the biggest category having 599 documents. The testing set contains 7,505 documents, with the smallest and biggest categories having 259 and 399 documents respectively. (...) The training set consists of 1.1 million Wikipedia documents and the testing set consists of 5,000 documents. The vocabulary contains 917,683 unique terms. |
| Hardware Specification | Yes | All of the experiments are done on a normal computer with single-core clock rate up to 2.4 GHz. |
| Software Dependencies | No | To perform the supervised tasks, we learn a linear SVM with the topic representations using LIBSVM (Chang and Lin, 2011). The paper mentions LIBSVM but does not specify a version number. No other specific software versions are mentioned. |
| Experiment Setup | Yes | For all the LDA-based topic models, we use symmetric Dirichlet priors α = (1/K)·1 and γ = 0.45·1. For Bayes PA with Gibbs classifiers, the parameters were set at ϵ = 164, c = 1, and σ² = 1. (...) For Bayes PA with averaging classifiers, the parameters determined by cross validation are ϵ = 16, c = 500, and σ² = 10³. For reasons explained in section 7.3, we set the mini-batch size \|B\| = 1 for the averaging classifier and \|B\| = 512 for the Gibbs classifier. (...) the number of topics is set at K = 80 and the other parameters of Bayes PA are set at (I, J, β) = (1, 2, 0). |
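For context on the method the table summarizes: Bayesian PA generalizes the classical online passive-aggressive update (Crammer et al., 2006), in which the model stays unchanged ("passive") on correctly margin-classified examples and otherwise moves just far enough to satisfy the margin, capped by an aggressiveness parameter C. The sketch below shows that classical PA-I update on a toy linearly separable stream; it is an illustration of the underlying PA principle, not the paper's Bayesian/MedLDA algorithm, and the toy data and function names are our own.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One PA-I step: if the hinge loss 1 - y*<w,x> is positive,
    shift w by tau*y*x with step size tau = min(C, loss/||x||^2);
    otherwise leave w unchanged (the "passive" case)."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    if loss > 0.0:
        tau = min(C, loss / np.dot(x, x))
        w = w + tau * y * x
    return w

# Toy stream: labels given by the sign of x0 + x1.
rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(300):
    x = rng.normal(size=2)
    y = 1.0 if x[0] + x[1] > 0 else -1.0
    w = pa_update(w, x, y)
```

The paper's mini-batch setting (\|B\| = 1 vs. \|B\| = 512 in the table) corresponds to applying such an update once per incoming batch rather than per document, with the point estimate w replaced by a posterior distribution.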