Finer Metagenomic Reconstruction via Biodiversity Optimization
Authors: Simon Foucart, David Koslicki
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6 (Numerical Experiments): The purpose here is a proof-of-concept. Figure 1 displays the support size of uniformly randomly (normalized) vectors x versus the percentage of successful recoveries by each algorithm averaged over 200 replicates. Figure 3 demonstrates that when a high percentage of vectors are recovered, the procedure (IRWLP) takes less execution time than Quikr. |
| Researcher Affiliation | Academia | Simon Foucart, Department of Mathematics, Texas A&M University, College Station, TX 77843; David Koslicki, Departments of Computer Science and Engineering, Biology, and the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 |
| Pseudocode | No | The paper describes mathematical formulations and optimization problems, such as (Min Div) and (IRWLP), but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All numerical experiments can be reproduced via the GitHub repository: https://github.com/dkoslicki/MinimizeBiologicalDiversity |
| Open Datasets | Yes | We utilized the Greengenes 97% OTU database [7], where [7] is T. Z. DeSantis, P. Hugenholtz, N. Larsen, M. Rojas, E. L. Brodie, K. Keller, T. Huber, D. Dalevi, P. Hu, and G. L. Andersen. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol., 72(7):5069–5072, 2006. |
| Dataset Splits | No | The paper describes running 200 replicates or simulations and defines success criteria, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types, memory details) used for running the experiments. |
| Software Dependencies | Yes | We utilized MATLAB's fmincon nonlinear optimizer [19] with the sqp algorithm to solve the optimization (Min Div), where [19] is The MathWorks, Inc. MATLAB and Statistics Toolbox Release 2019a. Natick, Massachusetts, United States. |
| Experiment Setup | Yes | In equation (IRWLP), we set q = 0.01 and ε = 10⁻⁵ and terminated the iterative procedure if the change in ℓ1 norm was less than 10⁻³ or if the number of iterations exceeded 25. For the Quikr optimization procedure, we set λ = 10,000. We used k = 3 to form a 64 × 192 k-mer matrix A. We selected k = 4 to form a 256 × 768 k-mer matrix A. We considered the cases when h = 4, 6, and 13 and set q = 0.01 in each of them. |
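
The row counts quoted in the experiment setup follow from the DNA alphabet: there are 4^k possible k-mers, so k = 3 gives 64 rows and k = 4 gives 256 rows, with one column per reference sequence. The following is a minimal sketch of how such a k-mer matrix could be assembled. The function names (`kmer_index`, `kmer_profile`, `kmer_matrix`) are hypothetical, and normalizing each column to sum to 1 is an assumption; neither is taken from the paper or its repository.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_index(k):
    """Map each of the 4**k possible k-mers to a row index (lexicographic order)."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_profile(sequence, k):
    """Return the k-mer frequency vector of one DNA sequence.

    Assumption: counts are normalized to sum to 1. Windows containing
    characters outside ACGT (for example N) are skipped.
    """
    index = kmer_index(k)
    counts = [0] * len(index)
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:
            counts[index[kmer]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def kmer_matrix(sequences, k):
    """Build A with 4**k rows and one column per reference sequence."""
    profiles = [kmer_profile(s, k) for s in sequences]
    return [[p[row] for p in profiles] for row in range(4 ** k)]


# Two toy reference sequences (illustrative only, not real 16S data).
A = kmer_matrix(["ACGTACGTAC", "GGGGCCCCAA"], k=3)
print(len(A), len(A[0]))  # number of rows, number of columns
```

With 192 reference sequences and k = 3, the same construction would yield a 64 × 192 matrix, matching the dimensions quoted above.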