Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Rationally Inattentive Inverse Reinforcement Learning Explains YouTube Commenting Behavior
Authors: William Hoiles, Vikram Krishnamurthy, Kunal Pattanayak
JMLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | After a careful analysis of a massive YouTube dataset, our surprising result is that in most YouTube user groups, the commenting behavior is consistent with optimizing a Bayesian utility with rationally inattentive constraints. The paper also highlights how the rational inattention model can accurately predict commenting behavior. [...] This section provides empirical evidence that the commenting behavior of YouTube users is consistent with Bayesian utility maximization with rational inattention. |
| Researcher Affiliation | Collaboration | William Hoiles (KATERRA, Menlo Park, CA 94025, USA); Vikram Krishnamurthy (Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA); Kunal Pattanayak (Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA) |
| Pseudocode | Yes | Algorithm 1 Deep Embedded Clustering for Framing Association [...] Algorithm 2 Deep Embedded Clustering for Framing Association |
| Open Source Code | Yes | The massive YouTube dataset and analysis used in this paper are available on GitHub and completely reproducible. [...] The results presented in Sec. 6.2 and Sec. 6.3 of this paper can be reproduced using the code and datasets that we have uploaded to a public GitHub repository: https://github.com/KunalP117/YouTube_project. |
| Open Datasets | Yes | The massive YouTube dataset and analysis used in this paper are available on GitHub and completely reproducible. [...] The results presented in Sec. 6.2 and Sec. 6.3 of this paper can be reproduced using the code and datasets that we have uploaded to a public GitHub repository: https://github.com/KunalP117/YouTube_project. |
| Dataset Splits | Yes | We divided the YouTube dataset D into two parts: training data (80%) and testing data (20%). |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types, or detailed computing infrastructure) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components and models like "GloVe (Pennington et al. (2014))", "Deep Embedded Clustering", "stacked long short term memory (LSTM)", "convolutional neural network (CNN)", and "WordNet lemmatizer", but does not provide specific version numbers for any of these tools or libraries as used in their implementation. |
| Experiment Setup | Yes | The deep embedded clustering is based on Xie et al. (2016); Guo et al. (2017), however we design the input, encoder, and decoder to account for the visual perception of the frame of the decision problem which includes image, text, and numeric information. [...] The denoising autoencoder encodes the input into the latent space representation as a 200-dimensional vector as z_t = r(w(f_t) + ε), and attempts to remove the effect of this corruption process stochastically applied to the input of the autoencoder. [...] The possible dimension of the word embedding space is 25, 50, 100, or 200. Here we use a word embedding dimension of 25. [...] Choosing N = 4 ensures each video is sufficiently isolated to a particular frame; less than 3% of videos are classified ambiguously in terms of frames. |
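The 80%/20% train/test partition reported under Dataset Splits can be sketched as a generic shuffled split. This is an illustration only, not the authors' released code; the function name, seed, and shuffling strategy are assumptions:

```python
import random

def train_test_split(items, train_frac=0.8, seed=0):
    """Randomly partition `items` into train/test sets.

    Mirrors the paper's reported 80/20 split in spirit; the seed and
    shuffle-based strategy are illustrative assumptions.
    """
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)          # deterministic shuffle
    cut = int(train_frac * len(items))        # 80% boundary by default
    train = [items[i] for i in idx[:cut]]
    test = [items[i] for i in idx[cut:]]
    return train, test
```

A fixed seed makes the partition reproducible across runs, which matters when the same split must back both the Sec. 6.2 and Sec. 6.3 results.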
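The Experiment Setup row describes a denoising autoencoder that stochastically corrupts the input frame vector before encoding it, z_t = r(w(f_t) + ε). A minimal sketch of that corrupt-then-encode step follows; the linear weights, Gaussian noise scale, ReLU choice for r(·), and tiny latent dimension are all illustrative assumptions, not the paper's trained 200-dimensional network:

```python
import random

def corrupt(frame_vec, noise_std=0.1, rng=None):
    """Add Gaussian noise to the input, the eps in z_t = r(w(f_t) + eps).

    In a denoising autoencoder the decoder must reconstruct the clean
    f_t from this corrupted input. Noise scale is an assumption.
    """
    rng = rng or random.Random(0)
    return [x + rng.gauss(0.0, noise_std) for x in frame_vec]

def encode(frame_vec, weights):
    """Toy linear map w(.) plus ReLU nonlinearity r(.) applied to the
    corrupted input. `weights` is a list of rows projecting the input
    to the latent space (the paper uses a 200-dim latent; here the
    latent size is just len(weights) for brevity).
    """
    corrupted = corrupt(frame_vec)
    z = []
    for row in weights:
        s = sum(w * x for w, x in zip(row, corrupted))
        z.append(max(0.0, s))  # ReLU keeps latent coordinates nonnegative
    return z
```

Training would then minimize reconstruction error between the decoder's output and the uncorrupted f_t, which is what lets the network "remove the effect of this corruption process."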