reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Nonlinear Sequence Embedding by Monotone Variational Inequality

Authors: Jonathan Y. Zhou, Yao Xie

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show the competitive performance of our method on real-world time-series data with baselines and demonstrate its effectiveness for symbolic text modeling and RNA sequence clustering. ... In our real-data experiments, the iteration cost is typically dominated by the cost of computing the gradient, which can be mitigated by stochastic approximation. ... Section 4 EXPERIMENTS We first illustrate parameter recovery using synthetic univariate time-series in Section 4.1. ... Section 4.2 describes benchmarks using real-world time-series data from the UCR Time Series Classification Archive (Dau et al., 2018). We report classification and runtime performance against a number of baselines. Section 4.3 provides two illustrations on embedding of real-world sequence data.
Researcher Affiliation	Academia	Jonathan Y. Zhou, Yao Xie School of Industrial & Systems Engineering Georgia Institute of Technology Atlanta, GA 30332 EMAIL, EMAIL
Pseudocode	Yes	We detail an extragradient scheme with backtracking for nuclear norm constrained VI in Algorithm 1 of Appendix A, which addresses the following general problem ... Algorithm 1 Extragradient Method with Backtracking for Nuclear Norm constrained VI
Open Source Code	Yes	The implementation is available at https://github.com/XSpace2013/LowRankTimeSeriesRecovery.
Open Datasets	Yes	Section 4.2 describes benchmarks using real-world time-series data from the UCR Time Series Classification Archive (Dau et al., 2018). ... a series of excerpts taken either from the works of Lewis Carroll or abstracts scraped from arXiv (Carroll, 1865; 1871; Kaggle Team, 2020). ... apply our method to the clustering of gene sequences for strains of Influenza A and Dengue viruses (Sayers et al., 2022). ... We retrieved the raw text of Alice's Adventures in Wonderland and Through the Looking Glass from Project Gutenberg. For the paper abstracts, we used the training portion of the ML-ArXiv-Papers dataset. ... The Influenza A virus genome data (n = 949) is acquired from the NCBI Influenza Virus Resource (Bao et al., 2008). ... We consider n = 1562 full Dengue virus genomes downloaded from the NCBI Virus Variation Resource (Hatcher et al., 2017).
Dataset Splits	Yes	Each dataset using its default train/test split. ... We split our data into testing and training splits according to those given by the UCR repository.
Hardware Specification	Yes	We evaluated all experiments and illustrations using a cluster with 24 core Intel Xeon Gold 6226 CPU (2.7 GHZ) processors, and NVIDIA Tesla V100 Graphics coprocessors (16 GB VRAM), and 384 GB of RAM.
Software Dependencies	No	We implement Algorithm 1, and associated subroutines (evaluation of the monotone field Ψ, as defined in Equation (8), incremental simplex/nuclear ball projection), using the Julia programming language. The implementation is available at https://github.com/XSpace2013/LowRankTimeSeriesRecovery.
Experiment Setup	Yes	We embed the data without supervision by solving (8) using the extragradient scheme given in Algorithm 1 of Appendix A with a look-back length of d = 20, running the algorithm for 256 steps using a linear link function. The value of λ is selected via a two-step process: first, bisection identifies when the solution becomes rank-one, and then a grid search refines the choice for rank-constrained parameters. ... we perform cross-validated grid search (based on k = 5 folds) across KNNs with k ∈ {2^i | i ∈ [0, 4]} neighbors or SVMs with RBF kernels with penalty values c ∈ {2^i | i ∈ [-10, 15]}. ... To find the embedding, we run Algorithm 1 for 256 iterations.
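
The hyperparameter grids and the bisection step quoted in the Experiment Setup row can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' Julia implementation. It assumes integer exponents, and it assumes the SVM exponent range is -10 to 15 (the sign appears to have been lost in extraction). The names `power_of_two_grid` and `bisect_boundary` are hypothetical, and the predicate passed to `bisect_boundary` stands in for a check such as "the solution at this λ is rank-one", which is not implemented here.

```python
from typing import Callable, List


def power_of_two_grid(lo: int, hi: int) -> List[float]:
    """Return [2**lo, ..., 2**hi] for integer exponents lo..hi inclusive."""
    return [2.0 ** i for i in range(lo, hi + 1)]


# Grids as quoted in the Experiment Setup row (exponent ranges are assumptions
# about how the extracted text should be read).
KNN_NEIGHBORS = [int(v) for v in power_of_two_grid(0, 4)]  # 1, 2, 4, 8, 16
SVM_PENALTIES = power_of_two_grid(-10, 15)                 # 2^-10 ... 2^15


def bisect_boundary(
    pred: Callable[[float], bool],
    lo: float,
    hi: float,
    tol: float = 1e-6,
) -> float:
    """Locate the point in [lo, hi] where a monotone predicate changes value.

    Assumes pred takes different values at lo and hi and flips exactly once
    in between. In the setting described above, pred would be a hypothetical
    check of whether the solution for a given lambda is rank-one.
    """
    p_lo = pred(lo)
    if p_lo == pred(hi):
        raise ValueError("predicate must differ at the two endpoints")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pred(mid) == p_lo:
            lo = mid  # boundary lies to the right of mid
        else:
            hi = mid  # boundary lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Toy stand-in predicate: "rank-one" whenever the parameter is at most 3.
    print(KNN_NEIGHBORS)
    print(len(SVM_PENALTIES))
    print(bisect_boundary(lambda x: x <= 3.0, 0.0, 10.0))
```

In practice each grid value would be scored with 5-fold cross-validation, as the quoted setup describes. The sketch only enumerates the candidates and shows the bisection logic on a toy predicate.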