Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
Authors: Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves the minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying MLE is all you need. Our result holds for a very rich class of parametric models, and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it to three concrete examples: linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice, whereas Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios. |
| Researcher Affiliation | Academia | Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin; equal contribution; Department of Operations Research and Financial Engineering, Princeton University, EMAIL; Department of Statistics, University of Chicago, EMAIL; Department of Electrical and Computer Engineering, Princeton University, EMAIL |
| Pseudocode | No | The paper is theoretical and does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not mention providing open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe empirical experiments involving datasets or their public availability. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical validation with dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe empirical experiments that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe empirical experiments that would require specific software dependencies. |
| Experiment Setup | No | The paper is theoretical and does not describe empirical experiments that would involve an experimental setup with hyperparameters or training configurations. |
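
The Research Type entry above contrasts MLE fitted on unweighted source data with the Maximum Weighted Likelihood Estimator (MWLE). As a rough illustration only, the sketch below simulates a well-specified one-dimensional linear regression under covariate shift and fits both estimators. Nothing here comes from the paper, which reports no experiments or code: the Gaussian source and target covariate distributions, the sample size, the true coefficients, the known density ratio, and all function names are assumptions chosen for this toy example.

```python
import math
import random


def weighted_least_squares(xs, ys, ws):
    """Fit y = a + b*x by minimizing sum_i w_i * (y_i - a - b*x_i)^2."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    b = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a = (swy - b * swx) / sw
    return a, b


def simulate(n=5000, a_true=1.0, b_true=2.0, shift=1.0, seed=0):
    """Draw source data and return (mle, mwle) coefficient estimates.

    Assumed toy setup: source x ~ N(0, 1), target x ~ N(shift, 1),
    y = a_true + b_true * x + N(0, 1) noise under both distributions.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [a_true + b_true * x + rng.gauss(0.0, 1.0) for x in xs]

    # With Gaussian noise, MLE on source data is ordinary least squares.
    mle = weighted_least_squares(xs, ys, [1.0] * n)

    # MWLE reweights each sample by the target/source density ratio,
    # which for N(shift, 1) over N(0, 1) is exp(shift*x - shift^2 / 2).
    ratio = [math.exp(shift * x - 0.5 * shift ** 2) for x in xs]
    mwle = weighted_least_squares(xs, ys, ratio)
    return mle, mwle


def target_excess_risk(est, a_true=1.0, b_true=2.0, shift=1.0):
    """Excess squared prediction error under the target x ~ N(shift, 1)."""
    da = est[0] - a_true
    db = est[1] - b_true
    # E[(da + db*x)^2] with E[x] = shift and E[x^2] = shift^2 + 1.
    return da ** 2 + 2.0 * da * db * shift + db ** 2 * (shift ** 2 + 1.0)


if __name__ == "__main__":
    mle, mwle = simulate()
    print("MLE  estimate:", mle, "target excess risk:", target_excess_risk(mle))
    print("MWLE estimate:", mwle, "target excess risk:", target_excess_risk(mwle))
```

A single run of this kind cannot confirm a minimax statement. It only shows how the two estimators are computed and how their excess risk on the target distribution could be compared in a well-specified model.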