Refinery: An Open Source Topic Modeling Web Platform
Authors: Daeil Kim, Benjamin F. Swanson, Michael C. Hughes, Erik B. Sudderth
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Users can interactively organize articles by topic and also refine this organization with phrase-level analysis. The results of an analysis on 500 New York Times articles that contained the keyword obama during the year 2013. |
| Researcher Affiliation | Academia | Daeil Kim, Benjamin F. Swanson, Michael C. Hughes, Erik B. Sudderth; Department of Computer Science, Brown University, Providence, RI 02192, USA |
| Pseudocode | No | The paper describes the functionality and architecture of the Refinery platform in descriptive text, without presenting formal pseudocode or algorithm blocks for its underlying methods. |
| Open Source Code | Yes | The project website http://daeilkim.github.io/refinery/ contains Python code and further documentation. `> git clone https://github.com/daeilkim/refinery.git` |
| Open Datasets | No | The paper refers to an 'analysis on 500 New York Times articles' but does not provide any specific link, DOI, repository, or formal citation with author/year to access this particular dataset or any other dataset used. |
| Dataset Splits | No | The paper mentions an 'analysis on 500 New York Times articles' but does not provide any details on how this or any other dataset was split into training, validation, or test sets. |
| Hardware Specification | No | The paper mentions running Refinery in a 'Unix-like command line' environment with Virtualbox and Vagrant, but it does not specify any particular CPU or GPU models, memory, or other detailed hardware specifications used for running experiments. |
| Software Dependencies | No | The paper states: 'To make installation simple, it has only three dependencies: the Git version-control system, Virtualbox (Oracle, 2013), and Vagrant (Hashimoto, 2013).' This text provides software names but not specific version numbers. Other mentioned tools like BNPy and Splitta also lack version numbers. |
| Experiment Setup | No | The paper describes the Refinery platform's features and the general approach to topic modeling (HDP), mentioning that users specify 'an upper bound on the number of inferred topics.' However, it does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs) or system-level training settings. |
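
The Software Dependencies row names three tools (Git, Virtualbox, Vagrant) without version numbers. A minimal sketch of how a reader might check that these tools are present before attempting the install is shown below. The executable names (`git`, `VBoxManage`, `vagrant`) and the helper `missing_dependencies` are assumptions made for illustration; they do not come from the paper or from the Refinery repository.

```python
import shutil

# Assumed executable names: the paper names the tools, not their binaries.
DEPENDENCIES = {
    "Git": "git",
    "VirtualBox": "VBoxManage",
    "Vagrant": "vagrant",
}


def missing_dependencies(which=shutil.which):
    """Return the names of dependencies whose executable is not found.

    `which` is injectable so the lookup can be replaced in tests;
    by default it searches the current PATH.
    """
    return [name for name, exe in DEPENDENCIES.items() if which(exe) is None]


# Example use: an empty list means all three tools were found on PATH.
# missing = missing_dependencies()
```

Because the paper gives no version numbers, this check only confirms presence; recording the installed versions alongside any results would still be up to the reader.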