Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline and validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-Task Learning for Straggler Avoiding Predictive Job Scheduling

Authors: Neeraja J. Yadwadkar, Bharath Hariharan, Joseph E. Gonzalez, Randy Katz

JMLR 2016 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental For our experimental evaluation, we use a collection of real-world workloads from Facebook and Cloudera's customers. We evaluate not only the prediction accuracy but also the improvement in job completion times, which is the metric that directly impacts the end user. We show that our formulation to predict stragglers allows us to reduce job completion times by up to 59% over the previous state-of-the-art learning-based system, Wrangler (Yadwadkar et al. (2014)). This large reduction arises from a 7-point increase in prediction accuracy. Further, we can get equal or better accuracy than Wrangler (Yadwadkar et al. (2014)) using a sixth of the training data, thus bringing the training time down from 4 hours to about 40 minutes.
Researcher Affiliation Academia Neeraja J. Yadwadkar (EMAIL), Division of Computer Science, University of California, Berkeley, CA 94720-1776, USA; Bharath Hariharan (EMAIL), Division of Computer Science, University of California, Berkeley, CA 94720-1776, USA; Joseph E. Gonzalez (EMAIL), Division of Computer Science, University of California, Berkeley, CA 94720-1776, USA; Randy Katz (EMAIL), Division of Computer Science, University of California, Berkeley, CA 94720-1776, USA
Pseudocode No The paper describes mathematical formulations and discusses algorithms but does not present any structured pseudocode blocks or sections explicitly labeled 'Algorithm' or 'Pseudocode'.
Open Source Code No The paper mentions Wrangler (Yadwadkar et al., 2014) as a system they built and improved upon, but it does not explicitly state that the source code for the methodology described in *this* paper is open-source, nor does it provide any links to a code repository or supplementary materials containing the code.
Open Datasets No The set of real-world workloads considered in this paper are collected from the production compute clusters at Facebook and Cloudera's customers, which we denote as FB2009, FB2010, CC b and CC e. Table 2 provides details about these workloads... Chen, et al., explain the data in further details in (Chen et al., 2012). While a citation is provided for the data-description methodology, the datasets themselves (FB2009, FB2010, CC b, CC e) are described as originating from 'production compute clusters at Facebook and Cloudera's customers' and no direct access information (URL, DOI, repository) for these specific datasets is provided.
Dataset Splits Yes In particular, for every task i launched on node n, Wrangler records both the resource usage counters xi and the label yi which indicates if the task straggles or not. Since there is a separate predictor for each node and workload, Wrangler produces separate datasets for each node and workload. Let Sn,l be the set of tasks of jobs corresponding to workload l, executed on node n. Thus, we record the dataset for node n, workload l as: Dn,l = {(xi, yi) : i Sn,l}. Then Wrangler divides each dataset into a training set and test set temporally, i.e, the first few jobs constitute the train set and the rest form the validation. ... Table 3 shows the sizes of the datasets. (Table 3 provides explicit counts for 'No of tasks (Training+Validation)' and 'Test' for each workload).
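The temporal split quoted above (the first few jobs form the training set, the rest the validation set, separately for each node-workload dataset D_{n,l}) can be sketched as follows. This is an illustrative sketch only; the function name and the 80/20 split fraction are assumptions, not taken from the paper.

```python
def temporal_split(tasks, train_frac=0.8):
    """Split a chronologically ordered list of (features, label) task
    records: the earliest `train_frac` of tasks become the training set,
    the remainder the validation set. `train_frac` is a hypothetical
    choice; the paper does not report the exact fraction used."""
    cut = int(len(tasks) * train_frac)
    return tasks[:cut], tasks[cut:]

# Example: 10 task records (x_i, y_i) ordered by launch time, where
# y_i indicates whether the task straggled.
tasks = [((float(i),), i % 2) for i in range(10)]
train, val = temporal_split(tasks)
assert len(train) == 8 and len(val) == 2
```

In practice one such split would be produced per (node, workload) pair, since Wrangler maintains a separate dataset D_{n,l} for each.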
Hardware Specification No For faithfully replaying these real-world production traces on our 20-node EC2 cluster, we used a statistical workload replay tool, SWIM (Chen et al., 2011). The mention of 'EC2 cluster' is a generic cloud platform description and does not provide specific hardware details such as CPU models, GPU types, or exact instance specifications.
Software Dependencies No We optimize all our formulations using Liblinear (Fan et al., 2008) for the l2 regularized variants and using the algorithm proposed by Aflalo et al. (2011) (modified to work in the primal) for the mixed norm variants. While Liblinear is mentioned as a software package, no specific version number is provided.
Experiment Setup Yes Thus, if we use all four partitions, the weight vector w_t for a given workload l_t and a given node n_t is: w_t = w_0 + w_{n_t} + w_{l_t} + v_t (Eq. 22)... where λ_0, ν, ω, τ are hyperparameters. We ran an initial grid search on a validation set to fix these hyperparameters and found that prediction accuracy was not very sensitive to these settings: several settings gave very close to optimal results. We used λ_0 = ν = ω = τ = 1, which was one of the highest performing settings, for all our experiments. Appendix A provides the details of this grid search experiment along with the sensitivity of the results to a range of hyperparameter values.
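The additive weight composition in Eq. (22) — a shared component plus node-specific, workload-specific, and node-workload-specific components — can be illustrated with a minimal sketch. All variable names, the feature dimension, and the random initialization below are assumptions for illustration; the paper learns these components jointly via the regularized objectives it describes.

```python
import numpy as np

d = 4  # feature dimension (arbitrary, for illustration)
rng = np.random.default_rng(0)

w0 = rng.normal(size=d)                       # shared across all tasks
w_node = {"n1": rng.normal(size=d)}           # one component per node
w_workload = {"FB2009": rng.normal(size=d)}   # one component per workload
v = {("n1", "FB2009"): rng.normal(size=d)}    # node-workload specific

def effective_weights(node, workload):
    # Eq. (22): w_t = w_0 + w_{n_t} + w_{l_t} + v_t
    return w0 + w_node[node] + w_workload[workload] + v[(node, workload)]

wt = effective_weights("n1", "FB2009")
# A linear straggler prediction for resource-usage features x would then
# threshold the score w_t · x (sign convention assumed here).
x = rng.normal(size=d)
is_straggler = bool(wt @ x > 0)
```

Sharing w_0 (and the per-node / per-workload components) across tasks is what lets the multi-task formulation reach Wrangler-level accuracy with a fraction of the per-node training data, as the quoted abstract reports.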