How to Upscale Neural Networks with Scaling Law?
Authors: Ayan Sengupta, Yash Goel, Tanmoy Chakraborty
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this survey, we synthesize insights from current studies, examining the theoretical foundations, empirical findings, and practical implications of scaling laws. We also explore key challenges, including data efficiency, inference scaling, and architecture-specific constraints, advocating for adaptive scaling strategies tailored to real-world applications. We suggest that while scaling laws provide a useful guide, they do not always generalize across all architectures and training strategies. |
| Researcher Affiliation | Academia | Ayan Sengupta (EMAIL), Indian Institute of Technology Delhi; Yash Goel (EMAIL), Indian Institute of Technology Delhi; Tanmoy Chakraborty (EMAIL), Indian Institute of Technology Delhi |
| Pseudocode | No | The paper describes various mathematical formulations for scaling laws (e.g., Equation 1, 2, 3), but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a direct link to a source-code repository for the methodology described in this survey paper. Table 15 lists reproducibility status for *other* papers cited, not for this survey itself. |
| Open Datasets | No | This paper is a survey and does not conduct its own experiments on any dataset. It discusses datasets used in other research, such as "WebText2", "C4", and "ImageNet", but does not use a dataset for its own empirical evaluation. |
| Dataset Splits | No | This paper is a survey and does not conduct its own experiments; therefore, no dataset splits are provided. |
| Hardware Specification | No | The Acknowledgments section mentions: "T. Chakraborty acknowledges the support of Google GCP Grant for providing the necessary computational resources." However, this statement is too general and does not provide specific hardware details (e.g., GPU models, CPU types, or memory amounts). |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library names with version numbers) used for the survey or analysis presented. |
| Experiment Setup | No | This paper is a survey and does not conduct its own experiments. Therefore, no experimental setup details or hyperparameters are provided. |