Self-paced Multi-view Co-training
Authors: Fan Ma, Deyu Meng, Xuanyi Dong, Yi Yang
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on synthetic, text categorization, person re-identification, image recognition and object detection data sets substantiate the superiority of the proposed method. |
| Researcher Affiliation | Academia | Fan Ma: Centre for Artificial Intelligence, University of Technology Sydney, 15 Broadway, Ultimo NSW 2007, Australia; School of Mathematics and Statistics and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi Province, P. R. China. Deyu Meng: School of Mathematics and Statistics and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi Province, P. R. China; Macau Institute of Systems Engineering, Macau University of Science and Technology, Taipa, Macau, P. R. China. Xuanyi Dong: Centre for Artificial Intelligence, University of Technology Sydney, 15 Broadway, Ultimo NSW 2007, Australia. Yi Yang: Centre for Artificial Intelligence, University of Technology Sydney, 15 Broadway, Ultimo NSW 2007, Australia. |
| Pseudocode | Yes | Algorithm 1 Serial SPamCo Algorithm ... Algorithm 2 Parallel SPamCo Algorithm |
| Open Source Code | Yes | More details about our algorithm codes and datasets can be seen in https://github.com/Flowerfan/SPamCo. |
| Open Datasets | Yes | Experiments conducted on synthetic, text categorization, person re-identification, image recognition and object detection data sets substantiate the superiority of the proposed method. ... We also evaluate our SPamCo model for multi-view semi-supervised learning on the Reuters multilingual data set in Amini et al. (2009), which is from Reuters RCV1 and RCV2 collections. ... Experiments are conducted on Market-1501 dataset for this task. This dataset contains 32,668 detected bounding boxes with persons of 1,501 identities (Zheng et al., 2015). ... The CIFAR-10 dataset is employed. The dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. ... We evaluate our method on PASCAL VOC 2007 detection data set (Everingham et al., 2010) |
| Dataset Splits | Yes | For each class of each language, 14 and 486 documents are selected as labeled and unlabeled training instances, respectively. Thus a total number of 84 and 2916 documents are used as the labeled and unlabeled data for each language. The rest of all the instances are used as test data. ... In this experiment, 20% instances of training data are chosen as the labeled set, and the rest of the data are treated as unlabeled. ... In this experiment, 2000 and 4000 training samples are randomly selected to be taken as supervised data, respectively, and other rest training ones are taken as unsupervised instances. In both cases, the same 10000 test images are used for evaluation. ... This data set contains 10022 images annotated with bounding boxes for 20 object categories. It is officially split into 2501 training, 2510 validation, and 5011 testing images. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or specific computing platforms) used for running experiments. |
| Software Dependencies | No | The paper mentions software such as the 'scikit-learn Python module' but does not specify version numbers for any libraries or frameworks used in the experiments, which is required for reproducibility. |
| Experiment Setup | Yes | For each class of each language, 14 and 486 documents are selected as labeled and unlabeled training instances, respectively. ... The mean accuracy on the test set with two settings is displayed in Figure 3. We employ seven λ tuning strategies by setting the increment of selected unlabeled instances as 5, 10, 15, 20, 50, 100 and 500 respectively for each class in every iteration, and γ is set as 0.3 in this experiment. ... In this experiment, 20% instances of training data are chosen as the labeled set, and the rest of the data are treated as unlabeled. ... In the training phase, images are randomly horizontal flipped and cropped for data augmentation. The cross entropy loss function is used in this experiment ... The number of added unlabeled samples is proportional to the number of labeled samples. We set this proportion to 0.5 in algorithms for fair comparison. The maximum iteration round is set as 5 ... We set γ to 0.3, and iteration steps to 5 and 4 for the experiment with 2000 and 4000 labeled instances, respectively. The model in each view is trained for 300 epochs in all iterations, and the learning rate is 0.1 in the beginning and is reduced 10 times after training of 100 epochs. ... γ is set to 0.3 for leveraging predictions from all views. The maximum iteration round is set to 5 and training epochs in each round is set to 9. We empirically use the learning rate 0.001 for the first eight epochs and reduce it to 0.0001 for the last epochs. The momentum and weight decay are set as 0.9 and 0.0005, respectively. |
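The paper's pseudocode (serial and parallel SPamCo) describes co-training in which each view pseudo-labels its most confident unlabeled samples and the selection budget grows each round (the self-paced λ schedule, e.g. adding 5–500 samples per class per iteration as in the setup above). The sketch below is a minimal, illustrative Python version of that serial loop only, not the authors' implementation (see their GitHub repository for that): the nearest-centroid classifier, the margin-based confidence, the shared labeled pool, and the linear budget growth are all simplifying assumptions, and the γ view-weighting term of the real algorithm is omitted.

```python
import numpy as np

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_with_confidence(X, classes, centroids):
    """Predict labels plus a margin-based confidence score."""
    # Distance from every sample to every class centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[d.argmin(axis=1)]
    d_sorted = np.sort(d, axis=1)
    # Confidence = gap between the closest and second-closest centroid.
    conf = d_sorted[:, 1] - d_sorted[:, 0]
    return pred, conf

def spamco_sketch(X1, X2, y, labeled, add_per_round=5, rounds=5):
    """Simplified serial self-paced co-training over two views.

    `labeled` is a boolean mask over samples; true labels in `y` are
    trusted only where `labeled` is True. Each round, each view trains
    on the current pool and pseudo-labels its most confident unlabeled
    samples, with the selection budget growing per round (a crude stand-in
    for the paper's lambda schedule).
    """
    labeled = labeled.copy()
    pseudo_y = np.where(labeled, y, -1)  # -1 marks still-unlabeled samples
    views = (X1, X2)
    for r in range(rounds):
        k = add_per_round * (r + 1)  # grow the selection budget each round
        for v in range(2):
            unl = np.where(~labeled)[0]
            if unl.size == 0:
                break
            cls, cen = fit_centroids(views[v][labeled], pseudo_y[labeled])
            pred, conf = predict_with_confidence(views[v][unl], cls, cen)
            order = np.argsort(-conf)[:k]   # most confident unlabeled samples
            chosen = unl[order]
            pseudo_y[chosen] = pred[order]  # pseudo-label, then share with
            labeled[chosen] = True          # the other view's training pool
    return pseudo_y
```

On well-separated synthetic two-view data with a couple of labeled seeds per class, this loop propagates labels to the full unlabeled pool across rounds; the design point it illustrates is the one the paper stresses, namely that the number of samples admitted per iteration (λ) controls how aggressively noisy pseudo-labels enter training.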