Invariant Subspace Decomposition

Authors: Margherita Lazzaretto, Jonas Peters, Niklas Pfister

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We propose a practical estimation procedure, which automatically infers the decomposition using tools from approximate joint matrix diagonalization. Furthermore, we provide finite sample guarantees for the proposed estimator and demonstrate empirically that it indeed improves on approaches that do not use the additional invariant structure. [...] Finally, in Section 5, we illustrate ISD based on numerical experiments, both on simulated and on real world data, and validate the theoretical results presented in the paper."
Researcher Affiliation | Academia | Margherita Lazzaretto (EMAIL), Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark; Jonas Peters (EMAIL), Seminar for Statistics, ETH Zürich, Zürich, Switzerland; Niklas Pfister (EMAIL), Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
Pseudocode | Yes | "Appendix B. ISD estimation algorithm. We provide here the pseudocode summarizing the ISD procedure described in Section 4. The algorithm includes the estimation of the intercept, that is, it considers for all t ∈ N the model Y_t = γ⁰_{0,t} + X_t^⊤ γ⁰_t + ϵ_t (25) ... Algorithm 2 ISD: estimation"
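The quoted model (25) is a time-varying linear model with an intercept. As a minimal illustration of the per-window regression such a procedure builds on, the sketch below fits an intercept plus coefficients by ordinary least squares on a single time window. All function and variable names here are illustrative, not the paper's notation.

```python
import numpy as np

def fit_window(X, Y):
    """OLS fit of Y_t = g0 + X_t^T g + eps_t on one time window.

    Illustrative sketch of a per-window regression with intercept,
    in the spirit of model (25); names are not the paper's notation.
    """
    n = X.shape[0]
    Z = np.hstack([np.ones((n, 1)), X])           # prepend intercept column
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return theta[0], theta[1:]                    # (intercept, coefficients)

# toy check: recover a known linear model from simulated data
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
gamma = np.full(3, 0.2)                           # constant entries 0.2, echoing the setup
Y = 1.5 + X @ gamma + 0.1 * rng.normal(size=500)
b0, b = fit_window(X, Y)
```

With 500 observations and small noise, the fitted intercept and coefficients land close to the values used to generate the data.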
Open Source Code | Yes | "The code for the presented experiments is available at https://github.com/mlazzaretto/Invariant-Subspace-Decomposition. The implementation of the uwedge algorithm is taken from the Python package https://github.com/sweichwald/coroICA-python developed by Pfister et al. (2019b)."
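For readers unfamiliar with joint matrix diagonalization, the sketch below illustrates the idea in the exact (noise-free) case: symmetric matrices that commute can be simultaneously diagonalized by eigendecomposing a single generic linear combination. This is not the uwedge algorithm from coroICA-python, which handles the approximate, noisy setting; all names and the fixed weights are illustrative.

```python
import numpy as np

def joint_diagonalizer(mats, weights):
    """Return an orthogonal V whose columns diagonalize every matrix in
    `mats`, assuming the matrices are symmetric and commute exactly.

    Eigenvectors of a generic linear combination with distinct
    eigenvalues are common eigenvectors of all the matrices.
    Illustrative only; NOT the uwedge algorithm.
    """
    M = sum(w * A for w, A in zip(weights, mats))
    _, V = np.linalg.eigh(M)                      # orthonormal eigenvectors
    return V

# two matrices sharing the eigenbasis of a random rotation U
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)))
A = U @ np.diag([1.0, 2.0, 3.0, 4.0]) @ U.T
B = U @ np.diag([4.0, 1.0, 2.0, 3.0]) @ U.T
V = joint_diagonalizer([A, B], weights=[1.0, 0.37])
off_diag = lambda M: M - np.diag(np.diag(M))      # residual off-diagonal part
```

Applying `V` to both `A` and `B` leaves only negligible off-diagonal mass, which is the property the approximate joint diagonalization step aims for under noise.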
Open Datasets | Yes | "The data is taken from the smooth_polarizers experiment in the lt_walks_v1 dataset, available at https://github.com/juangamella/causal-chamber."
Dataset Splits | Yes | "We consider n ∈ {500, 1000, 2500, 4000, 6000}, and repeat the experiment 20 times for each n. ... We then consider a separate time window of 250 observations in which the value of the time-varying coefficients ... (we refer to this window as test data). ... In the same setting, we now fix the size of the historical dataset to n = 6000 ... and consider a test dataset in which the time-varying coefficients ... undergo two shifts and take values 0.5 and 2 on two consecutive time windows, each containing 1000 observations. ... take as adaptation data a rolling window of length m contained in the test data and shifting by one time point at a time. We repeat the simulation 20 times for different sizes of the adaptation window, m ∈ {1.5p, 2p, 5p, 10p}. ... The available dataset contains 8000 observations ... The historical dataset contains the first 7000 observations, and the test dataset the remaining 1000."
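The rolling adaptation window described above (length m, shifting by one time point at a time over the test data) can be sketched as a simple index generator. The helper and its parameters are illustrative, not from the paper's code; the example uses m = 30 (e.g. m = 3p with p = 10) over a test stretch of 1000 observations.

```python
import numpy as np

def rolling_windows(n_test, m):
    """Yield index arrays for an adaptation window of length m rolling
    one time point at a time over a test stretch of n_test observations.
    Illustrative helper, not taken from the paper's code."""
    for end in range(m, n_test + 1):
        yield np.arange(end - m, end)

# e.g. a 1000-observation test window and m = 30
windows = list(rolling_windows(n_test=1000, m=30))
```

Each window reuses all but one index of its predecessor, so per-window estimates can be updated cheaply if desired.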
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models or memory details).
Software Dependencies | No | The paper mentions the Python package https://github.com/sweichwald/coroICA-python but does not specify its version, nor does it list specific version numbers for other key software components or programming languages used for the experiments.
Experiment Setup | Yes | "We sample a random orthogonal matrix U, and sample the covariates X_t from a normal distribution with zero mean and covariance matrix U Σ_t U^⊤, where Σ_t is a block-diagonal matrix with four blocks of dimensions 2, 4, 3 and 1, and random entries that change 10 times in the observed time horizon n. We take as true time-varying parameter the rotation by U of the parameter with constant entries equal to 0.2 ... The noise terms ϵ_t are sampled i.i.d. from a normal distribution with zero mean and variance σ²_{ϵ,t} = 0.64. ... To compute β̂_inv, we use K = 25 equally distributed windows of length n/8. ... We estimate δ̂_res,t and γ̂_OLS,t on a rolling adaptation window of size m = 3p. ... In the simulations, we select the threshold λ in (19) by cross-validation. In more detail, we define the grid of possible thresholds λ by ... We then split the historical data into L = 10 disjoint blocks of observations, and for all folds ℓ ∈ {1, ..., L}, we compute an estimate for the invariant component ... we then consider a rolling window of length d = 2p and the observation at t immediately following the rolling window: ... tse = 1."
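The covariate design quoted above (a block-diagonal covariance with blocks of sizes 2, 4, 3 and 1, rotated by a random orthogonal U) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' generator, and it draws a single static Σ rather than the 10 changing regimes described in the paper.

```python
import numpy as np

def random_orthogonal(p, rng):
    """Random orthogonal matrix via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.normal(size=(p, p)))
    return Q * np.sign(np.diag(R))                # fix column signs

def block_diag_cov(block_dims, rng):
    """Random symmetric positive-definite block-diagonal covariance."""
    p = sum(block_dims)
    Sigma = np.zeros((p, p))
    i = 0
    for d in block_dims:
        A = rng.normal(size=(d, d))
        Sigma[i:i + d, i:i + d] = A @ A.T + d * np.eye(d)   # SPD block
        i += d
    return Sigma

# X_t ~ N(0, U Sigma U^T) with blocks of sizes 2, 4, 3 and 1 (p = 10)
rng = np.random.default_rng(0)
U = random_orthogonal(10, rng)
Sigma = block_diag_cov([2, 4, 3, 1], rng)
X = rng.multivariate_normal(np.zeros(10), U @ Sigma @ U.T, size=500)
```

The rotation by U hides the block structure in the observed coordinates, which is what makes the invariant subspaces nontrivial to recover.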