Gradient-based Learning Methods Extended to Smooth Manifolds Applied to Automated Clustering
Authors: Alkis Koudounas, Simone Fiori
JAIR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare clustering performances of these methods and known methods from the scientific literature. The obtained results confirm that the proposed learning algorithms prove lighter in computational complexity than existing ones without detriment in clustering efficacy. ... To evaluate the performance of the GSC algorithm learnt by these gradient-based learning methods on clustering both toy datasets and real-world (pictorial) databases. ... The present section illustrates results of numerical experiments performed on two categories of datasets, namely, synthetic 2-dimensional datasets used for testing purposes (Subsection 5.1) and real-world datasets used to validate the discussed gradient-based learning algorithms (Subsection 5.2) and to compare their performances with those exhibited by closely-related clustering algorithms known from the scientific literature (Subsection 5.3). |
| Researcher Affiliation | Academia | Alkis Koudounas EMAIL Graduate School of Computer Science, Polytechnic of Turin, Turin, Italy. Simone Fiori EMAIL Department of Information Engineering, Marches Polytechnic University, Ancona, Italy. |
| Pseudocode | Yes | Algorithm 1 Stochastic Gradient Descent (SGD). ... Algorithm 2 AdaDelta. ... Algorithm 3 Adaptive Moment Estimation (AdaM). ... Algorithm 4 Spectral Clustering (SC). ... Algorithm 5 Grassmann Manifold Optimization Assisted Spectral Clustering (GSC). ... Algorithm 6 Normalized Cut (NCut). |
| Open Source Code | No | The authors wish to thank Prof. Junbin Gao (The University of Sydney) for sharing part of the computer codes to implement the GSC algorithm, as well as Dr. Ehsan Elhamifar (University of California at Berkeley) and Prof. René Vidal (The Johns Hopkins University) for sharing part of the computer codes to implement sparse clustering on pictorial data. |
| Open Datasets | Yes | The Yale B face database. ... (http://vision.ucsd.edu/~leekc/Ext_Yale_Database/Ext_Yale_B.html). The ORL face database. ... (https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html). The MNIST database. ... (http://yann.lecun.com/exdb/mnist/). ... Recursion Cellular Image Classification (https://www.kaggle.com/xhlulu/recursion-cellular-image-classification-224-jpg). ... TensorFlow Patch Camelyon Medical Images (https://www.tensorflow.org/datasets/catalog/patch_camelyon). ... CoastSat Image Classification Dataset (https://figshare.com/articles/Coast_Sat_image_classification_training_data/8868665/1). ... Images for Weather Recognition (Ajayi, 2018) (https://data.mendeley.com/datasets/4drtyfjtfy/1). ... Indoor Scenes Images (https://www.kaggle.com/itsahmad/indoor-scenes-cvpr-2019). ... Intel Image Classification (https://www.kaggle.com/puneet6060/intel-image-classification/version/2). ... TensorFlow Sun397 Image Classification Dataset (https://www.tensorflow.org/datasets/catalog/sun397). ... Architectural Heritage Elements (https://old.datahub.io/dataset/architectural-heritage-elements-image-dataset). ... Images of People Eating Food (https://data.world/crowdflower/image-classification-people-an). ... Images of Cracks in Concrete for Classification (https://data.mendeley.com/datasets/5y9wdsg2zt/2). |
| Dataset Splits | Yes | The MNIST dataset of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. ... The training set includes around 14,000 images and the testing folder has around 3,000 images ... Six subsets were constructed which consist of images of randomly selected subjects classified in K clusters for K {5, 8, 10, 12, 15, 18}. ... Each training set was built up so as to contain a total of 400 images randomly selected from the same cluster. |
| Hardware Specification | Yes | These numerical experiments were performed on a personal computer endowed with a dual-core Intel Core i5 processor, a clock frequency of 2.7GHz and an 8GB RAM, with the help of MATLAB R2017b scripts. |
| Software Dependencies | Yes | These numerical experiments were performed on a personal computer endowed with a dual-core Intel Core i5 processor, a clock frequency of 2.7GHz and an 8GB RAM, with the help of MATLAB R2017b scripts. |
| Experiment Setup | Yes | In our experiment each cluster contains 200 samples. ... We set the same value β = 0.00001 and let gradient-based learning algorithms run over 1,500 iterations. ... We set the sparsity-promotion parameter β to 0.00001 and the number of clusters K to 5, 8 and 10. ... The authors of the algorithm proposed default values of 0.9 for β1, 0.999 for β2, and 10⁻⁸ for ϵ. |
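The spectral clustering pipeline named in the pseudocode row (Algorithm 4) can be sketched as follows. This is a minimal NumPy illustration on a toy 2-D dataset of the kind used in the paper's synthetic tests, not the authors' MATLAB code; the affinity bandwidth, blob positions, and sign-based two-way split are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D blobs of 20 points each (hypothetical toy data)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])

# Gaussian affinity matrix W, zero diagonal
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 1.0 ** 2))
np.fill_diagonal(W, 0.0)

# Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

# The eigenvector for the second-smallest eigenvalue (Fiedler vector)
# is nearly constant on each blob; its sign gives a two-way partition
vals, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
labels = (vecs[:, 1] > 0).astype(int)
```

For K > 2 clusters, as in the paper's experiments, one would instead embed the points via the K smallest eigenvectors and run k-means on the embedding.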
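For reference, the AdaM update with the default values quoted in the setup row (β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸) can be sketched in NumPy. This is only an illustration of the Euclidean update rule, not the authors' manifold-adapted MATLAB implementation; the test objective and learning rate are assumptions.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaM step; beta1, beta2, eps match the quoted defaults."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise the toy objective f(x) = x^2 starting from x = 5
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.05)
```

In the paper's setting the gradient would be the Riemannian gradient of the clustering criterion on the Grassmann manifold, and the update would be followed by a retraction back onto the manifold.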