Mode Estimation for High Dimensional Discrete Tree Graphical Models
Authors: Chao Chen, Han Liu, Dimitris Metaxas, Tianqi Zhao
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the accuracy and efficiency of our algorithm. More theoretical guarantee of our algorithm can be found in [7]. To validate our method, we first show the scalability and accuracy of our algorithm in synthetic data. Furthermore, we demonstrate using biological data how modes can be used as a novel analysis tool. |
| Researcher Affiliation | Academia | Chao Chen, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019; Han Liu, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544; Dimitris N. Metaxas, Department of Computer Science, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019; Tianqi Zhao, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544 |
| Pseudocode | Yes | Procedure 1 Compute-M-Modes. Input: a tree G, a potential function f, and a scale δ. Output: the M modes of the lowest potential. 1: Construct geodesic balls ℬ = {B_r(c) \| c ∈ V}, where r = ⌊δ/2⌋ + 1. 2: for all B ∈ ℬ do 3: M^δ_B = the set of local modes of B. 4: Construct a junction tree (Figure 2). The label set of each supernode is its local modes. 5: Compute the M lowest-potential labelings of the junction tree, using Nilsson's algorithm. |
| Open Source Code | No | The paper neither links to source code nor states that code for the described methodology is publicly available. |
| Open Datasets | Yes | Biological data analysis. We compute modes of the microarray data of Arabidopsis thaliana plant (108 samples, 39 dimensions) [24]. |
| Dataset Splits | No | The paper mentions generating synthetic data and using different sample sizes (10K, 40K, 80K) for evaluation, and also uses a biological dataset, but it does not specify explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper discusses running time and scalability, but does not provide any specific hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper describes algorithms and methods but does not list any specific software dependencies with version numbers. |
| Experiment Setup | Yes | In all experiments, we choose M to be 500. We randomly generate tree-structured graphical model (tree size D = 200 ... 2000, label size L = 3) and test the speed. We randomly generate tree-structured distributions (D = 20, L = 2). To evaluate the sensitivity of our method to noise, we randomly flip 0%, 5%, 10%, 15% and 20% labels of these samples. |
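
To make the output of Procedure 1 in the Pseudocode row concrete, the sketch below enumerates modes by brute force on a tiny tree. It is not the paper's algorithm: it skips the geodesic balls, the junction tree, and Nilsson's algorithm, and its cost is exponential in the tree size. It also assumes a definition that the table above does not state, namely that a labeling is a mode at scale δ when its potential is strictly lower than that of every other labeling within Hamming distance δ. All names (`potential`, `hamming`, `brute_force_modes`) and the potential values in the example are hypothetical.

```python
from itertools import product


def potential(labeling, unary, pairwise, edges):
    """Sum of unary and pairwise terms of a tree model (lower is better)."""
    total = sum(unary[v][labeling[v]] for v in range(len(labeling)))
    total += sum(pairwise[(u, v)][labeling[u]][labeling[v]] for u, v in edges)
    return total


def hamming(a, b):
    """Number of nodes on which two labelings disagree."""
    return sum(x != y for x, y in zip(a, b))


def brute_force_modes(n_nodes, n_labels, unary, pairwise, edges, delta, m):
    """Return up to m modes, ordered by increasing potential.

    Assumed definition: a labeling is a mode at scale delta if its potential
    is strictly lower than that of every other labeling within Hamming
    distance delta. Enumerates all n_labels ** n_nodes labelings.
    """
    labelings = list(product(range(n_labels), repeat=n_nodes))
    f = {y: potential(y, unary, pairwise, edges) for y in labelings}
    modes = [
        y for y in labelings
        if all(f[y] < f[z] for z in labelings if 0 < hamming(y, z) <= delta)
    ]
    return sorted(modes, key=f.get)[:m]


# Hypothetical example: a 3-node chain with binary labels (L = 2).
edges = [(0, 1), (1, 2)]
unary = [[0.0, 0.5], [0.0, 0.0], [0.0, 0.0]]   # node 0 slightly prefers label 0
agree = [[0.0, 1.0], [1.0, 0.0]]               # penalty 1 when neighbors differ
pairwise = {edge: agree for edge in edges}

modes_small_scale = brute_force_modes(3, 2, unary, pairwise, edges, delta=1, m=500)
modes_large_scale = brute_force_modes(3, 2, unary, pairwise, edges, delta=3, m=500)
```

In this example, both all-zeros and all-ones labelings are modes at δ = 1, while only the all-zeros labeling survives at δ = 3, which illustrates how a larger scale prunes weaker modes.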