Manifold Coordinates with Physical Meaning
Authors: Samson J. Koelle, Hanyu Zhang, Marina Meila, Yu-Chia Chen
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the ability of Manifold Lasso to identify explanations of manifolds and their embedding coordinates in both toy and scientific manifold learning problems. Section 7.1 describes the general experimental procedure, while Section 7.2 describes some specific adjustments to this protocol necessary for analyzing molecular dynamics (MD) data. Sections 7.3.1–7.4 describe our experimental results. |
| Researcher Affiliation | Academia | Samson J. Koelle (1), Hanyu Zhang (1), Marina Meilă (1,2), Yu-Chia Chen (2). (1) Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA; (2) Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA |
| Pseudocode | Yes | Manifold Lasso(Dataset D, dictionary G, embedding coordinates φ(D), intrinsic dimension d, kernel bandwidth ε_N, neighborhood cutoff size r_N, regularization parameter λ). 1: Construct N_i for i = 1:n, where i′ ∈ N_i iff ‖ξ_i′ − ξ_i‖ ≤ r_N, and local data matrices Ξ_{1:n}; 2: Construct kernel matrix and Laplacian K, L ← Laplacian(N_{1:n}, Ξ_{1:n}, ε_N); 3: [Optionally compute embedding: φ(ξ_{1:n}) ← EmbeddingAlg(D, N_{1:n}, m, …)]; 4: for j = 1, 2, …, p do; 5: compute ∇_ξ g_j(ξ_i) for i = 1, …, n; 6: compute ζ_j² by (11) and normalize ∇_ξ g_j(ξ_i) ← (1/ζ_j) ∇_ξ g_j(ξ_i) for i = 1, …, n; 7: end for; 8: for i = 1, 2, …, n do; 9: compute basis T_i^M ← LocalPCA(Ξ_i, K_{i,N_i}, d); 10: project X_i ← (T_i^M)ᵀ ∇_ξ g_{1:p}(ξ_i); 11: compute Y_i ← PullBackDPhi(Ξ_i, Φ_i, T_i^M, L_{i,N_i}, d); 12: end for; 13: compute ζ_k² ← (1/n) Σ_{i=1}^{n} y_{ik}² (i.e., (10)) for k = 1, …, m and normalize Y_i ← Y_i diag{1/ζ_{1:m}} for i = 1, …, n; 14: β̂ ← GroupLasso(X_{1:n}, Y_{1:n}, λ); 15: Output Ŝ = supp β̂ |
| Open Source Code | Yes | Code to run experiments is available at https://github.com/sjkoelle/montlake. |
| Open Datasets | Yes | Our MD data are quantum simulations from Chmiela et al. (2017). (Full reference: S. Chmiela, A. Tkatchenko, H. Sauceda, I. Poltavsky, K. T. Schütt, and K.-R. Müller. Machine learning of accurate energy-conserving molecular force fields. Science Advances, March 2017.) |
| Dataset Splits | No | Manifold Lasso is applied to a uniformly random subset of size n = |I|, and this process is repeated ω times. The paper describes using subsamples for analysis but does not specify distinct training, validation, and test splits for model evaluation in the conventional sense. |
| Hardware Specification | No | The paper mentions 'supercomputer time' in Section 7.4.2 but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'automatic differentiation (Paszke et al., 2019)' which refers to PyTorch, and cites a 'Python implementation of elastic-net regularized generalized linear models' (Jas et al., 2020), but it does not provide specific version numbers for PyTorch, Python, or any other software library or tool used in the experiments. |
| Experiment Setup | Yes | Manifold Lasso(Dataset D, dictionary G, embedding coordinates φ(D), intrinsic dimension d, kernel bandwidth ε_N, neighborhood cutoff size r_N, regularization parameter λ)... The regularization parameter λ ranges over [0, λ_max] as described in Section 4.8. Table 1 also specifies parameters such as ε_N, m, N (sample size for manifold embedding), n (subsample size for Manifold Lasso), p (dictionary size), and ω (number of independent repetitions) for different experiments. |
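The final step of the pseudocode above (line 14, GroupLasso) selects the dictionary functions that explain the embedding coordinates. As a rough illustration of what that step computes, the sketch below implements group lasso via proximal gradient descent (ISTA) with block soft-thresholding in NumPy. This is a hypothetical minimal reimplementation for intuition only: the function name, step-size rule, and iteration count are assumptions, and the paper's actual experiments use an existing elastic-net/GLM solver (Jas et al., 2020) rather than this code.

```python
import numpy as np

def group_lasso(X, Y, lam, n_iter=500):
    """Minimize 0.5/n * ||Y - X @ beta||_F^2 + lam * sum_j ||beta_j||_2.

    X : (n, p) stacked projected dictionary gradients (design matrix)
    Y : (n, m) stacked pulled-back embedding differentials (responses)
    Each dictionary function g_j corresponds to one group: row j of beta,
    so either all m coefficients of g_j survive or the whole row is zeroed.
    """
    n, p = X.shape
    m = Y.shape[1]
    beta = np.zeros((p, m))
    # step size from the Lipschitz constant of the smooth part's gradient
    lr = n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - Y) / n          # gradient of the LS term
        z = beta - lr * grad                      # gradient step
        # block soft-thresholding: shrink each row (group) toward zero
        norms = np.linalg.norm(z, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
        beta = scale * z                          # proximal step
    return beta
```

For a synthetic check, if Y is generated from only two of five dictionary gradients, the recovered support (rows of β̂ with nonzero norm) should contain exactly those two groups for a moderate λ, mirroring the paper's Ŝ = supp β̂ output.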