Factorization Machines with Regularization for Sparse Feature Interactions
Authors: Kyohei Atarashi, Satoshi Oyama, Masahito Kurihara
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The analysis and experimental results on synthetic and real-world datasets show the effectiveness of the proposed methods. Keywords: factorization machines, sparse regularization, feature interaction selection, feature selection, proximal algorithm |
| Researcher Affiliation | Academia | Kyohei Atarashi EMAIL Graduate School of Information Science and Technology Hokkaido University Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan Satoshi Oyama EMAIL Masahito Kurihara EMAIL Faculty of Information Science and Technology Hokkaido University Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan |
| Pseudocode | Yes | Algorithm 1 PCD algorithm for TI-sparse FMs ... Algorithm 6 PBCD algorithm for sparse FMs |
| Open Source Code | Yes | Our implementation is available at https://github.com/neonnnnn/nimfm. |
| Open Datasets | Yes | We used one regression dataset, the MovieLens 100K (ML100K) (Harper and Konstan, 2016) dataset, and three binary classification datasets, a9a, RCV1 (Chang and Lin, 2011), and Flight Delay (FD) (Ke et al., 2017). ... For the a9a and RCV1 datasets, we used feature vectors and targets that are available from the LIBSVM (Chang and Lin, 2011) datasets repository: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ For the FD dataset, we used the scripts provided in https://github.com/szilard/benchm-ml |
| Dataset Splits | Yes | We divided the 100,000 user-item scores in the dataset into sets of 64,000, 16,000, and 20,000 for training, validation, and testing, respectively. For the a9a and RCV1 datasets, ... we used 20% of the training dataset as the validation dataset and the remaining 80% as the training dataset in this experiment. |
| Hardware Specification | Yes | We implemented the existing and proposed methods in Nim and ran the experiments on an Arch Linux desktop with an Intel Core i7-4790 (3.60 GHz) CPU and 16 GB RAM. ... We ran the experiments on an Ubuntu 18.04 server with an AMD EPYC 7402P 24-Core Processor (2.80 GHz) and 128 GB RAM. |
| Software Dependencies | No | We implemented the existing and proposed methods in Nim and ran the experiments on an Arch Linux desktop with an Intel Core i7-4790 (3.60 GHz) CPU and 16 GB RAM. The FM implementation was as fast as libFM (Rendle, 2012), which is the implementation of FMs in C++. |
| Experiment Setup | Yes | For all methods, we omitted the linear term ⟨w, x⟩ since the true models do not use it. ... We set rank hyperparameter k to 30 for FM and the sparse FM methods and used N(0, 0.01^2) to initialize P. ... We used the squared loss for ℓ(·, ·). ... We initialized w to 0 and the bias term to 0. ... For the sparse FM methods (i.e., TI, CS, L21, L1, Ω-nmAPGD, and Ω-SubGD), we set them to 10^-2, 10^-1, . . . , 10^2. ... We set rank hyperparameter k to 30. We used the linear term ⟨w, x⟩ and introduced a bias (intercept) term. |
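For context on the model family the experiment setup refers to, the following is a minimal sketch of a second-order factorization machine prediction, using the standard O(nk) reformulation of the pairwise term. This is an illustrative sketch only; the function and variable names are ours and are not taken from the authors' nimfm implementation, which additionally applies the paper's sparse regularizers.

```python
def fm_predict(x, w0, w, P):
    """Second-order FM: y = w0 + <w, x> + sum_{i<j} <p_i, p_j> x_i x_j.

    The pairwise sum is computed with the standard identity
      sum_{i<j} <p_i, p_j> x_i x_j
        = 0.5 * sum_f [ (sum_i P[i][f] x_i)^2 - sum_i (P[i][f] x_i)^2 ],
    which avoids the explicit O(n^2) loop over feature pairs.
    P is an n-by-k list of factor rows (rank hyperparameter k).
    """
    n = len(x)
    k = len(P[0])
    # Bias plus linear term <w, x>.
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    # Pairwise interactions, one pass per latent dimension f.
    for f in range(k):
        s = sum(P[i][f] * x[i] for i in range(n))
        s_sq = sum((P[i][f] * x[i]) ** 2 for i in range(n))
        y += 0.5 * (s * s - s_sq)
    return y
```

The experiment setup above (k = 30, P initialized from a small Gaussian, squared loss) plugs directly into this model shape; the sparse FM variants differ only in the regularizer applied to P and w during training.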