KerasCV and KerasNLP: Multi-framework Models

Authors: Matthew Watson, Divyashree Shivakumar Sreepathihalli, François Chollet, Martin Görner, Kiranbir Sodhia, Ramesh Sampath, Tirth Patel, Haifeng Jin, Neel Kovelamudi, Gabriel Rasskin, Samaneh Saadat, Luke Wood, Chen Qian, Jonathan Bischof, Ian Stenbit, Abheesht Sharma, Anshuman Mishra

JMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental "We present the Keras domain packages KerasCV and KerasNLP... These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. ... To enable efficient training, we support XLA compilation for all models... The libraries are fully open-source (Apache 2.0 license) and available on GitHub. Keywords: KerasCV, KerasNLP, Keras multi-backend, Deep learning, Generative AI" and "Table 1: Average time taken (in ms/step) per training or inference step across different models, namely Segment Anything (Kirillov et al., 2023), Gemma (Team et al., 2024), BERT (Devlin et al., 2019) and Mistral (Jiang et al., 2023)."
Researcher Affiliation Industry "Keras Team, Google, USA" (author email addresses redacted in the source text)
Pseudocode No The paper provides code examples in the appendices, such as 'augmenter = keras_nlp.layers.RandomSwap(...)' and 'model = keras_nlp.models.RetinaNet.from_preset(...)', but these are concrete code snippets, not structured pseudocode or algorithm blocks describing a method.
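The snippets the paper provides are concrete API calls rather than pseudocode. To illustrate what a word-swap augmentation of the RandomSwap kind does, here is a minimal pure-Python sketch; `random_swap` is a hypothetical helper for this note, not the `keras_nlp.layers.RandomSwap` implementation, which operates on batched string tensors.

```python
import random

def random_swap(tokens, num_swaps=1, seed=None):
    """Return a copy of `tokens` with `num_swaps` random position swaps.

    Illustrative stand-in for a word-swap text augmentation: the token
    multiset is preserved, only positions change.
    """
    rng = random.Random(seed)
    tokens = list(tokens)
    for _ in range(num_swaps):
        i = rng.randrange(len(tokens))
        j = rng.randrange(len(tokens))
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

augmented = random_swap(["keras", "is", "multi", "backend"], num_swaps=2, seed=0)
```

Fixing the seed makes the augmentation reproducible, which is the property the reproducibility criteria above are probing for.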
Open Source Code Yes The libraries are fully open-source (Apache 2.0 license) and available on GitHub. All pretrained models of KerasCV and KerasNLP are published on Kaggle Models https://www.kaggle.com/organizations/keras/models. The benchmarks will continue to be updated here https://keras.io/getting_started/benchmarks/. (Section 5) KerasHub. https://github.com/keras-team/keras-hub, 2024. (References)
Open Datasets Yes We provide pretrained task models for popular architectures such as Stable Diffusion, YOLOv8, GPT2, BERT, Mistral, CLIP, Gemma, T5, etc. For example, KerasCV provides a number of presets for image classification models that have been trained on different datasets, such as ImageNet, COCO, and Pascal VOC. (Appendix D) In Appendix A, the example code uses 'keras.datasets.cifar10.load_data()'.
Dataset Splits No The paper describes the KerasCV and KerasNLP libraries and their capabilities, including performance benchmarks for various models. While it mentions models are trained on datasets like ImageNet, COCO, and Pascal VOC which have standard splits, it does not explicitly provide or specify new training/test/validation splits for any experiments conducted within this paper. The BERT classifier example uses 'x=features, y=labels' but does not define splits for these inputs.
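Since the paper defines no splits of its own, the following sketch only shows how a fixed-seed train/validation split could be reported so others can reproduce it; `train_val_split` is a hypothetical helper, not an API from the paper.

```python
import random

def train_val_split(n_examples, val_fraction=0.1, seed=42):
    """Return (train_indices, val_indices) as a reproducible, disjoint split.

    A fixed seed means the same split can be regenerated from the
    reported (n_examples, val_fraction, seed) triple.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = int(n_examples * val_fraction)
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = train_val_split(100, val_fraction=0.2)
```

Reporting the seed alongside the fraction is what turns "we held out 20%" into a reproducible statement.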
Hardware Specification Yes All benchmarks are done with a single NVIDIA A100 GPU with 40GB of GPU memory on a Google Cloud Compute Engine of machine type a2-highgpu-1g with 12 vCPUs and 85GB host memory.
Software Dependencies No The paper discusses Keras 3, JAX, TensorFlow, PyTorch, XLA, tf.data, tf.text, and various Keras CV/NLP components. However, it does not provide specific version numbers for these libraries or frameworks that would be required for precise reproducibility of the experiments.
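The missing version pins could be recovered by whoever reruns the benchmarks with a small stdlib-only environment report such as the sketch below (`report_versions` is a hypothetical helper written for this note).

```python
import importlib.metadata

def report_versions(packages):
    """Map each package name to its installed version string, or None.

    Uses only the standard library, so it can be run in any benchmark
    environment to record exact dependency versions alongside results.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Example: record the environment before running benchmarks.
print(report_versions(["keras", "tensorflow", "torch", "jax"]))
```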
Experiment Setup Yes For fair comparison, we use the same batch size across frameworks if it is the same model and task (fit or predict). However, for different models and tasks, due to their different sizes and architectures, we use different batch sizes... We also used the same batch size for Gemma and Mistral since they are the same model type with similar number of parameters. (Section 5) In Appendix D, the example code for the BERT classifier uses 'batch_size=2' for both fitting and predicting.
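The ms/step metric from the paper's Table 1 can be sketched as a simple timing loop; `ms_per_step` below is a hypothetical helper, and in the paper's setting `step_fn` would be one training or inference step at the fixed batch size discussed above.

```python
import time

def ms_per_step(step_fn, num_steps=10, warmup=2):
    """Average wall-clock milliseconds per call to `step_fn`.

    Warm-up calls are excluded so one-time costs (e.g. XLA compilation
    on the first step) do not skew the per-step average.
    """
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    return (time.perf_counter() - start) * 1000 / num_steps

# Example with a trivial stand-in step:
print(ms_per_step(lambda: sum(range(1000))))
```

Excluding warm-up steps matters here because the paper compiles all models with XLA, whose first-step compilation cost is not representative of steady-state throughput.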