Factorization Machines with Regularization for Sparse Feature Interactions

Authors: Kyohei Atarashi, Satoshi Oyama, Masahito Kurihara

JMLR 2021

Reproducibility Variable Result LLM Response
Research Type: Experimental
"The analysis and experimental results on synthetic and real-world datasets show the effectiveness of the proposed methods. Keywords: factorization machines, sparse regularization, feature interaction selection, feature selection, proximal algorithm"
Researcher Affiliation: Academia
"Kyohei Atarashi (EMAIL), Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan. Satoshi Oyama (EMAIL), Masahito Kurihara (EMAIL), Faculty of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan"
Pseudocode: Yes
"Algorithm 1: PCD algorithm for TI-sparse FMs ... Algorithm 6: PBCD algorithm for sparse FMs"
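The PCD/PBCD algorithms listed above are proximal (block) coordinate descent methods. As a generic illustration only, and not the paper's exact coordinate update, the proximal operator of an L1-type penalty is the soft-thresholding function; the sketch below (Python, with an illustrative function name) shows how it produces exact zeros, which is what yields sparse feature interactions:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding).

    Each coordinate is shrunk toward zero by t; coordinates with
    magnitude below t become exactly zero, which is what produces
    sparsity in proximal (block) coordinate descent.
    """
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

v = np.array([1.5, -0.3, 0.0, 2.0])
shrunk = soft_threshold(v, 0.5)  # coordinates below 0.5 in magnitude are zeroed
```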
Open Source Code: Yes
"Our implementation is available at https://github.com/neonnnnn/nimfm."
Open Datasets: Yes
"We used one regression dataset, the MovieLens 100K (ML100K) (Harper and Konstan, 2016) dataset, and three binary classification datasets, a9a, RCV1 (Chang and Lin, 2011), and Flight Delay (FD) (Ke et al., 2017). ... For the a9a and RCV1 datasets, we used feature vectors and targets that are available from the LIBSVM (Chang and Lin, 2011) datasets repository (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/). For the FD dataset, we used the scripts provided in https://github.com/szilard/benchm-ml"
Dataset Splits: Yes
"We divided the 100,000 user-item scores in the dataset into sets of 64,000, 16,000, and 20,000 for training, validation, and testing, respectively. For the a9a and RCV1 datasets, ... we used 20% of the training dataset as the validation dataset and the remaining 80% as the training dataset in this experiment."
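A minimal sketch of the ML100K split described above (64,000 / 16,000 / 20,000 for train/validation/test); the random seed and the use of a uniform permutation are assumptions, since the quoted text does not specify how the split was drawn:

```python
import numpy as np

n = 100_000                      # total user-item scores in ML100K
rng = np.random.default_rng(0)   # seed is an assumption, not from the paper
idx = rng.permutation(n)

train_idx = idx[:64_000]         # 64,000 training examples
valid_idx = idx[64_000:80_000]   # 16,000 validation examples
test_idx = idx[80_000:]          # 20,000 test examples
```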
Hardware Specification: Yes
"We implemented the existing and proposed methods in Nim and ran the experiments on an Arch Linux desktop with an Intel Core i7-4790 (3.60 GHz) CPU and 16 GB RAM. ... We ran the experiments on an Ubuntu 18.04 server with an AMD EPYC 7402P 24-core processor (2.80 GHz) and 128 GB RAM."
Software Dependencies: No
"We implemented the existing and proposed methods in Nim and ran the experiments on an Arch Linux desktop with an Intel Core i7-4790 (3.60 GHz) CPU and 16 GB RAM. The FM implementation was as fast as libFM (Rendle, 2012), which is the implementation of FMs in C++."
Experiment Setup: Yes
"For all methods, we omitted the linear term ⟨w, x⟩ since the true models do not use it. ... We set the rank hyperparameter k to 30 for FM and the sparse FM methods and used N(0, 0.01^2) to initialize P. ... We used the squared loss for ℓ(·, ·). ... We initialized w to 0 and the bias term to 0. ... For the sparse FM methods (i.e., TI, CS, L21, L1, Ω_nm-APGD, and Ω-SubGD), we set them to 10^{-2}, 10^{-1}, ..., 10^{2}. ... We set the rank hyperparameter k to 30. We used the linear term ⟨w, x⟩ and introduced a bias (intercept) term."
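A minimal sketch of a second-order FM prediction consistent with the setup quoted above (rank k = 30, factor matrix P initialized from N(0, 0.01^2), w and the bias initialized to 0, squared loss); the feature dimension d, the seed, and the function name are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)     # seed is an assumption
d, k = 20, 30                      # d is illustrative; k = 30 as in the paper
w = np.zeros(d)                    # linear weights, initialized to 0
b = 0.0                            # bias (intercept) term, initialized to 0
P = rng.normal(0.0, 0.01, (d, k))  # factor matrix, entries ~ N(0, 0.01^2)

def fm_predict(x):
    # Second-order FM: b + <w, x> + sum_{j < j'} <p_j, p_j'> x_j x_j',
    # computed in O(dk) via 0.5 * sum_f ((x @ P)_f^2 - (x^2 @ P^2)_f).
    interactions = 0.5 * (np.square(x @ P) - np.square(x) @ np.square(P)).sum()
    return b + w @ x + interactions

x = rng.random(d)
y_pred = fm_predict(x)
loss = 0.5 * (y_pred - 1.0) ** 2   # squared loss against an illustrative target y = 1
```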