CodaLab Competitions: An Open Source Platform to Organize Scientific Challenges

Authors: Adrien Pavao, Isabelle Guyon, Anne-Catherine Letournel, Dinh-Tuan Tran, Xavier Baro, Hugo Jair Escalante, Sergio Escalera, Tyler Thomas, Zhen Xu

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | CodaLab Competitions is an open-source web platform designed to help data scientists and research teams crowd-source the resolution of machine learning problems through the organization of competitions, also called challenges or contests. CodaLab Competitions provides useful features such as multiple phases, results and code submissions, multi-score leaderboards, and jobs running inside Docker containers. The platform is very flexible and can handle large-scale experiments by allowing organizers to upload large datasets and provide their own CPU or GPU compute workers. This contribution features CodaLab Competitions, an open-source platform for running data science competitions that has been used in hundreds of challenges in physics, machine learning, computer vision, and natural language processing, among many other fields. The paper primarily describes the design, features, and technical architecture of the CodaLab Competitions platform, rather than conducting new empirical studies with data analysis or hypothesis validation.
Researcher Affiliation | Collaboration | Adrien Pavão1,2 EMAIL, Isabelle Guyon1,2 EMAIL, Anne-Catherine Letournel1,2 EMAIL, Dinh-Tuan Tran1 EMAIL, Xavier Baró3 EMAIL, Hugo Jair Escalante2,4 EMAIL, Sergio Escalera2,5 EMAIL, Tyler Thomas6 EMAIL, Zhen Xu7 EMAIL. Affiliations: 1 LISN, CNRS, Université Paris-Saclay, France; 2 ChaLearn; 3 Universitat Oberta de Catalunya, Spain; 4 Instituto Nacional de Astrofísica, Óptica y Electrónica, Mexico; 5 Universitat de Barcelona and Computer Vision Center, Spain; 6 Tier0 Software Development LLC, United States of America; 7 4Paradigm, China
Pseudocode | No | The paper describes the CodaLab Competitions platform, its features, and technical architecture (Figures 1 and 2 illustrate the workflow and worker connections). It notes that the 'logic of scoring of a competition (Figure 1) is coded by the organizers in any programming language (typically Python)'. However, it does not include any explicit pseudocode or algorithm blocks for a specific method described within the paper.
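To illustrate what such organizer-written scoring logic typically looks like, here is a minimal sketch of a CodaLab-style scoring program. The file names (`predictions.txt`, `solution.txt`) and the accuracy metric are hypothetical examples, since the paper only states that scoring is coded by the organizers, typically in Python:

```python
# Minimal sketch of a CodaLab-style scoring program (hypothetical example).
# CodaLab runs the scoring program with an input directory (containing the
# participant's submission and the reference data) and an output directory
# where a scores.txt file must be written for the leaderboard.
import os
import sys


def score(prediction_file, reference_file):
    """Compute accuracy between two whitespace-separated label files."""
    with open(prediction_file) as f:
        predictions = f.read().split()
    with open(reference_file) as f:
        references = f.read().split()
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def main(input_dir, output_dir):
    # Assumed layout: res/ holds the submission, ref/ holds the solution.
    pred = os.path.join(input_dir, "res", "predictions.txt")
    ref = os.path.join(input_dir, "ref", "solution.txt")
    acc = score(pred, ref)
    # Leaderboard keys are declared by the organizer in the competition
    # bundle; "accuracy" here is an illustrative key name.
    with open(os.path.join(output_dir, "scores.txt"), "w") as f:
        f.write(f"accuracy: {acc:.4f}\n")


if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv[1], sys.argv[2])
```

Because the program is an ordinary script run inside a Docker container, organizers are free to swap in any metric or language; only the input/output directory contract matters.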
Open Source Code | Yes | The reader is referred to the project website, where the code and complete documentation are found. The code is released under an Apache 2.0 license. The project has a sister project, CodaLab Worksheets, which features dynamic workflows, particularly useful for natural language processing. Both projects are accessible from https://codalab.org/, pointing to public platform instances, freely available. The openness of CodaLab Competitions is total: the Apache 2.0 license is used, the source code is on GitHub, and the development framework and all components used are open source.
Open Datasets | No | The paper discusses how the CodaLab Competitions platform allows organizers to upload large datasets for their competitions and mentions historical challenges such as the ImageNet challenge (Russakovsky et al., 2015) as examples. However, the paper itself does not provide concrete access information (e.g., link, DOI, specific citation) for a dataset used in its own analysis or experiments, as its primary focus is describing the platform.
Dataset Splits | No | Challenges organized on CodaLab Competitions may involve dataset splits, but the paper itself does not detail any specific dataset split information (percentages, sample counts, predefined splits, or splitting methodology) for its own analysis, as it primarily describes the platform's features and architecture rather than conducting new experiments.
Hardware Specification | No | The paper mentions that competition organizers can 'create custom queues and attach their own CPU or GPU compute workers (physical or virtual machines on any cloud service)' and refers to 'providing potentially powerful machines to the candidates'. However, it does not provide specific hardware details (e.g., exact CPU/GPU models, memory amounts, or detailed machine specifications) used for running the platform itself or for any experiments conducted within the paper.
Software Dependencies | No | The paper states that 'CodaLab Competitions is implemented in Python's Django framework' and mentions other technologies: the Django REST Framework, a PostgreSQL database, MinIO for file storage, RabbitMQ as a queue manager, a Celery client, and Docker. However, it does not provide specific version numbers for any of these software components, which are necessary for reproducible software dependency information.
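To make concrete what information is missing, a reproducible deployment of this stack would pin each Python component to an exact version, e.g. in a `requirements.txt`. The package names below are the usual PyPI names for the components listed; the versions are deliberately left as placeholders because the paper does not report them:

```text
# Hypothetical requirements.txt excerpt; versions are placeholders,
# NOT reported in the paper.
Django==<exact version>
djangorestframework==<exact version>
celery==<exact version>
psycopg2==<exact version>   # PostgreSQL driver
minio==<exact version>      # MinIO client
```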
Experiment Setup | No | The paper describes the CodaLab Competitions platform and its features for organizing competitions, allowing organizers to configure 'settings, logo, HTML pages, computer language, data organization, participant resource limitations, dates, and phases duration'. However, the paper itself does not present any specific experimental setup details, hyperparameters, or training configurations for an experiment conducted by the authors in the context of machine learning model training.