Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices

Authors: Calypso Herrera, Florian Krach, Anastasis Kratsios, Pierre Ruyssen, Josef Teichmann

TMLR 2023 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that Denise matches state-of-the-art performance in terms of decomposition quality, while being approximately 2000× faster than the state-of-the-art, principal component pursuit (PCP), and 200× faster than the current speed-optimized method, fast PCP. In this section we provide numerical results of Denise. We first train Denise with the supervised loss function on a synthetic training dataset and evaluate it on a synthetic test dataset."
Researcher Affiliation | Collaboration | "Calypso Herrera (EMAIL), Department of Mathematics, ETH Zürich; Florian Krach (EMAIL), Department of Mathematics, ETH Zürich; Anastasis Kratsios (EMAIL), Department of Mathematics, McMaster University; Pierre Ruyssen (EMAIL), Google Brain, Google Zürich; Josef Teichmann (EMAIL), Department of Mathematics, ETH Zürich"
Pseudocode | Yes | "A schematic version of these supervised and unsupervised training schemes is given in the pseudo-Algorithm 1." (Algorithm 1: Training of Denise)
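To make the quoted scheme concrete, here is a minimal numpy sketch of Denise's forward pass. The only structural facts taken from the paper are that a network maps the input matrix M to a factor U and returns L = UUᵀ (positive semidefinite and of rank at most k by construction) together with S = M − L; the single linear layer, the weight shapes, and the name `denise_forward` are illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

def denise_forward(M, weights, k=3):
    """Sketch of Denise's forward pass: map the lower-triangular part of M
    through a (stand-in) linear layer to a factor U, then return
    L = U @ U.T (PSD, rank <= k) and S = M - L."""
    n = M.shape[0]
    idx = np.tril_indices(n)
    x = M[idx]                      # vectorized lower triangle, length n(n+1)/2
    u = weights @ x                 # hypothetical single linear layer
    U = u.reshape(n, k)
    L = U @ U.T                     # PSD by construction, rank <= k
    S = M - L                       # the decomposition M = L + S holds exactly
    return L, S

# Usage on a random symmetric 20x20 matrix with random weights
rng = np.random.default_rng(0)
n, k = 20, 3
M = rng.standard_normal((n, n)); M = (M + M.T) / 2
W = 0.1 * rng.standard_normal((n * k, n * (n + 1) // 2))
L, S = denise_forward(M, W, k)
```

Whatever the network's internals, the factorization L = UUᵀ guarantees the PSD and low-rank constraints without any projection step, which is the architectural point the pseudo-algorithm trains.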
Open Source Code | Yes | "The source code is available at https://github.com/DeepRPCA/Denise."
Open Datasets | No | "We create a synthetic dataset in order to train Denise using the Monte Carlo approximation (7) of the supervised loss function (3). ... We consider a real-world dataset of about 1,000 20-by-20 correlation matrices of daily stock returns (on closing prices), for consecutive trading days, shifted every 5 days, between 1989 and 2019. The considered stocks belong to the S&P 500 and have been sorted by the GICS sectors."
Dataset Splits | Yes | "We create a synthetic dataset consisting of 10 million matrices for the training set. ... We create a synthetic test dataset consisting of 10,000 matrices for each of the test settings. ... The first 77% of the data is used as training set and the remaining 23% as test set."
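The synthetic samples described above can be sketched as follows: each training matrix is a sum M = L0 + S0 of a rank-k0 PSD part and a sparse symmetric part. The exact sampling distributions and the mechanism keeping M well-conditioned in the paper are not reproduced here; the Gaussian entries and the Bernoulli sparsity mask below are assumptions for illustration only.

```python
import numpy as np

def make_sample(n=20, k0=3, s0=0.95, rng=None):
    """Sketch of one synthetic example M = L0 + S0:
    L0 = U @ U.T is PSD with rank k0; S0 is a sparse symmetric matrix
    whose entries are zero with probability s0 (sparsity level)."""
    rng = rng or np.random.default_rng()
    U = rng.standard_normal((n, k0))
    L0 = U @ U.T                             # PSD, rank k0 (almost surely)
    mask = rng.random((n, n)) > s0           # keep ~(1 - s0) of the entries
    S0 = rng.standard_normal((n, n)) * mask
    S0 = np.triu(S0, 1); S0 = S0 + S0.T      # symmetric, zero diagonal
    return L0 + S0, L0, S0

M, L0, S0 = make_sample(rng=np.random.default_rng(1))
```

Sampling 10 million such pairs (M, L0, S0) would give a training set of the size quoted above, with the ground-truth decomposition available for the supervised loss.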
Hardware Specification | Yes | "In this setting, we trained our model using 16 Google Cloud TPU-v2 hardware accelerators. ... A machine with 2 Intel Xeon CPU E5-2697 v2 (12 cores) 2.70 GHz and 256 GiB of RAM."
Software Dependencies | No | "To implement Denise, we used the machine learning framework TensorFlow (Abadi et al., 2015) with Keras APIs (Chollet et al., 2015). All algorithms are implemented as part of the LRSLibrary for MATLAB (Sobral et al., 2015; Bouwmans et al., 2016)."
Experiment Setup | Yes | "All results were similar, hence we only present those using size n = 20, sparsity s0 = 0.95 and rank k0 = 3 in the training set. ... Training took around 8 hours (90 epochs). ... we empirically determined λ in order to reach the same rank. In particular, with λ = 0.56/√n for the synthetic dataset and λ = 0.64/√n for the real dataset, we approximately obtain a rank of 3 for the matrices L."
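Checking that a recovered matrix L "approximately" has rank 3 requires a numerical rank measure, since floating-point eigenvalues are never exactly zero. The paper does not state how rank was measured; the relative eigenvalue threshold below is one common choice, sketched here as an assumption.

```python
import numpy as np

def numerical_rank(L, tol=1e-6):
    """Numerical rank of a symmetric PSD matrix: count the eigenvalues
    exceeding tol times the largest eigenvalue. The tolerance is an
    assumption, not a value from the paper."""
    ev = np.linalg.eigvalsh(L)      # eigenvalues in ascending order
    return int(np.sum(ev > tol * ev[-1]))

# A rank-3 PSD matrix recovers rank 3 under this measure
rng = np.random.default_rng(2)
U = rng.standard_normal((20, 3))
print(numerical_rank(U @ U.T))  # -> 3
```

Under such a measure, λ can be tuned (here to 0.56/√n and 0.64/√n) until the PCP baselines return matrices L of the same numerical rank as Denise, making the decomposition-quality comparison fair.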