reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fully General Online Imitation Learning

Authors: Michael K. Cohen, Marcus Hutter, Neel Nanda

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We now walk through a toy example, in which our imitation learner has about a half-million demonstrator models in its model class Π... Running it with 20 different random seeds, the number of queries required is 486.75 ± 52.63 (out of 2^15 timesteps), and no client ever quit. Returning to run depicted in Figure 1, Table 1 works through an example of the posterior and the imitator's behavior.
Researcher Affiliation	Collaboration	Michael K. Cohen EMAIL Department of Engineering Science, University of Oxford; Future of Humanity Institute, Oxford, UK OX1 3PJ. Marcus Hutter EMAIL DeepMind; Department of Computer Science, Australian National University, Acton, ACT, Australia 2601. Neel Nanda EMAIL Independent
Pseudocode	No	The paper describes the imitator's policy in Section 4, 'Imitation', using mathematical equations (3) and (4) but does not present the steps in a structured pseudocode or algorithm block format.
Open Source Code	Yes	The code for this toy example can be found at https://tinyurl.com/imitation-toy-example.
Open Datasets	No	The action space A of the demonstrator is null {0, 1}^4. The observation space O is { , 1, 2, 3}. ... Each observation is randomly sampled; it is 1 with probability 1/4, 2 with probability 1/16, 3 with probability 1/64, and otherwise .
Dataset Splits	No	The paper describes a synthetic environment for a 'toy example' where observations are randomly sampled during online learning. It mentions running with '20 different random seeds' but does not specify train/test/validation splits for a fixed dataset.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments or simulations.
Software Dependencies	No	The paper mentions a URL for the code, which might imply a programming language like Python, but it does not specify any particular software libraries, frameworks, or their version numbers.
Experiment Setup	Yes	For an imitator with α = 1e-14, Figure 1 shows how often it has to query the demonstrator to pick the restaurant features. Recommendations are random, and this is only one run. Running it with 20 different random seeds, the number of queries required is 486.75 ± 52.63 (out of 2^15 timesteps), and no client ever quit.
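
The observation distribution quoted in the Open Datasets row can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code (which is linked in the Open Source Code row). The names `sample_observation`, `empirical_frequencies`, and `EMPTY` are hypothetical, and `None` is only an assumed stand-in for the "otherwise" observation, whose symbol did not survive in the text above.

```python
import random
from collections import Counter

# Probabilities as quoted in the Open Datasets row. The remaining probability
# mass (1 - 1/4 - 1/16 - 1/64) goes to the "otherwise" observation.
OBSERVATION_PROBS = {1: 1 / 4, 2: 1 / 16, 3: 1 / 64}

# Assumption: None is a placeholder for the "otherwise" observation; the
# original symbol is missing from the extracted text.
EMPTY = None


def sample_observation(rng):
    """Draw one observation by walking the cumulative distribution."""
    u = rng.random()
    cumulative = 0.0
    for obs, prob in OBSERVATION_PROBS.items():
        cumulative += prob
        if u < cumulative:
            return obs
    return EMPTY


def empirical_frequencies(n, seed=0):
    """Sample n observations and return the observed frequency of each value."""
    rng = random.Random(seed)
    counts = Counter(sample_observation(rng) for _ in range(n))
    return {obs: counts[obs] / n for obs in (EMPTY, 1, 2, 3)}


if __name__ == "__main__":
    for obs, freq in empirical_frequencies(200_000).items():
        print(obs, round(freq, 4))
```

With a large sample, the frequencies should land close to 1/4, 1/16, and 1/64 for observations 1, 2, and 3, with the remainder (about 0.67) on the placeholder observation.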