Monotone Missing Data: A Blessing and a Curse
Authors: Santtu Tikka, Juha Karvanen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we consider missing data models represented by directed acyclic graphs (DAGs) and scenarios where the assumption of a monotonic relationship between response indicators enables us to identify distributions of interest that would otherwise be nonidentifiable, and the converse, where the same assumption renders otherwise identifiable distributions nonidentifiable. To the best of our knowledge, there are no previous graphical criteria or algorithms for determining identifiability or nonidentifiability of the missing data distribution under monotone missing data in nonparametric MNAR settings for missing data DAGs. The rest of the paper is organized as follows. Section 2 introduces the notation and the relevant definitions. Section 3 considers missing data models and monotone missingness. Section 4 discusses identifiability in missing data models and the applicability of previous identifiability results for nonmonotone missingness under monotone missingness. Sections 5 and 6 present new results for identifiability and nonidentifiability under monotone missingness, respectively. Section 7 concludes the paper with a discussion. |
| Researcher Affiliation | Academia | Santtu Tikka (EMAIL), Department of Mathematics and Statistics, University of Jyvaskyla, Finland; Juha Karvanen (EMAIL), Department of Mathematics and Statistics, University of Jyvaskyla, Finland |
| Pseudocode | No | The paper presents theoretical concepts, definitions, theorems, and proofs related to identifiability in missing data models, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements or links indicating the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper discusses theoretical concepts using example scenarios like 'an intervention program to improve the physical condition of the participants' or 'an epidemiological study on hobbies and cognitive abilities'. These are illustrative and do not refer to specific, publicly available datasets used for empirical analysis. |
| Dataset Splits | No | The paper is theoretical and does not present experiments based on specific datasets; it therefore does not mention dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or computational results that would require specific hardware. Therefore, no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical identifiability results in missing data models. It does not mention any specific software, libraries, or versions used for implementation or analysis. |
| Experiment Setup | No | The paper is theoretical and presents mathematical results and proofs regarding identifiability. It does not include any experimental setup details, hyperparameters, or training configurations. |