Competitive Fair Scheduling with Predictions
Authors: Tianming Zhao, Chunqiu Xia, Xiaomin Chang, Chunhao Li, Wei Li, Albert Y. Zomaya
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct simulations on synthetic and real-world datasets to evaluate the practical performance of these algorithms. The experimental results validate our theoretical analysis and demonstrate that RG+ consistently outperforms other algorithms in practical scenarios, showing its robustness and effectiveness. |
| Researcher Affiliation | Academia | Tianming Zhao, Chunqiu Xia, Xiaomin Chang, Chunhao Li, Wei Li, and Albert Y. Zomaya are all with the School of Computer Science, The University of Sydney. |
| Pseudocode | Yes | Algorithm 1: Relaxed-Greedy; Algorithm 2: Offline Scheduling; Algorithm 3: Adaptive Relaxed-Greedy |
| Open Source Code | Yes | Code is available on GitHub (Anonymous, 2024). |
| Open Datasets | Yes | We conduct experiments on synthetic and real-world datasets ((Google, 2019), (Alibaba, 2023), and Azure (Cortez et al., 2017))... Google. Google Cluster Workload Traces 2019, 2019. URL https://research.google/resources/datasets/google-cluster-workload-traces-2019/. Accessed: 2023-08-20. Alibaba. Alibaba cluster data, 2023. URL https://github.com/alibaba/clusterdata. GitHub repository. Cortez, Eli, Anand Bonde, Alexandre Muzio, Mark Russinovich, Marcus Fontoura, and Ricardo Bianchini. Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms. In Proceedings of the 26th Symposium on Operating Systems Principles, pp. 153-167, 2017. |
| Dataset Splits | No | The paper describes how synthetic datasets are generated (e.g., job sizes follow exponential distributions, prediction errors are added) and mentions generating "50 independent instances for every problem set defined by n, P, η, and the range of release times." For real-world datasets, it uses "trace-log data." However, it does not specify any training/test/validation splits for these datasets for the purpose of algorithm evaluation or prediction model training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions that "Code is available on GitHub (Anonymous, 2024)" but does not explicitly list any specific software dependencies or their version numbers required for replication. |
| Experiment Setup | Yes | We set RG+ to be (1 + 0.3)-speed RR-augmented RG, i.e., it runs RG at unit speed and RR at speed 0.3. Job sizes are generated following exponential distributions: with maximum job size ratio P, we set p_j = max{1, -A log X}, where X is a random variable with X ~ U(0, 1) and A is a scaling factor ensuring p_j <= P. Given η and p_j, we set the prediction p̂_j = p_j · exp(Y) with Y ~ U(-log η, log η). We generate 50 independent instances for every problem set defined by n, P, η, and the range of release times. |
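The synthetic-instance generation described in the Experiment Setup row can be sketched in Python. This is a minimal reconstruction from the excerpt, not the authors' code: the choice of scaling factor `A` and the release-time distribution are assumptions, since the excerpt only states that `A` ensures p_j <= P and that release times span a given range.

```python
import math
import random

def generate_instance(n, P, eta, release_range, seed=None):
    """Generate one synthetic instance per the paper's description.

    Job sizes follow an exponential distribution: p_j = max(1, -A * log X)
    with X ~ U(0, 1), clipped so that p_j <= P. The prediction multiplies
    the true size by exp(Y) with Y ~ U(-log eta, log eta), i.e. a
    multiplicative error factor in [1/eta, eta].
    """
    rng = random.Random(seed)
    # Heuristic scaling so typical sizes stay below P (assumption; the
    # paper only says A is chosen to ensure p_j <= P).
    A = P / math.log(n) if n > 1 else 1.0
    jobs = []
    for _ in range(n):
        x = rng.random()
        # -log(X) ~ Exp(1); guard against x == 0 and clip at P.
        size = max(1.0, min(P, -A * math.log(x) if x > 0 else P))
        y = rng.uniform(-math.log(eta), math.log(eta))
        pred = size * math.exp(y)          # prediction with error <= eta
        release = rng.uniform(0, release_range)
        jobs.append({"size": size, "prediction": pred, "release": release})
    return jobs

# 50 independent instances for one problem set (n, P, eta, release range).
instances = [generate_instance(n=100, P=64, eta=2.0, release_range=10.0, seed=s)
             for s in range(50)]
```

By construction, every generated prediction stays within the factor-η band around the true size, which is the prediction-error model the evaluation row above relies on.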