Learning Concept Graphs from Online Educational Data

Authors: Hanxiao Liu, Wanli Ma, Yiming Yang, Jaime Carbonell

JAIR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on our newly collected datasets of courses from MIT, Caltech, Princeton and CMU show promising results."
Researcher Affiliation | Academia | Hanxiao Liu, Wanli Ma, Yiming Yang, Jaime Carbonell (email addresses redacted); School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Pseudocode | Yes | Algorithm 1: CGL.Rank with Nesterov's Accelerated Gradient Descent; Algorithm 2: Sparse CGL.Rank with Accelerated PGD; Algorithm 3: trans-CGL.Rank with Accelerated GD
Open Source Code | No | The paper does not provide concrete access to source code. It discusses algorithms and their efficiency but does not state that the code is open source or link to a repository for the described methodology.
Open Datasets | Yes | "We collected course listings, including course descriptions and available prerequisite structure from MIT OpenCourseWare, Caltech, CMU and Princeton. The datasets are available at http://nyc.lti.cs.cmu.edu/teacher/dataset/"
Dataset Splits | Yes | "We used one third of the data for testing, and the remaining two thirds for training and validation. We conducted 5-fold cross validation on the training two-thirds, i.e., trained the model on 80% of the training/validation dataset, and tuned extra parameters on the remaining 20%."
Hardware Specification | Yes | "We tested the efficiency of our proposed algorithms (based on the optimization formulation after variable reduction) on a single machine with an Intel i7 8-core processor and 32 GB RAM."
Software Dependencies | No | The paper mentions various algorithms and methods (e.g., SVM algorithms, accelerated gradient descent, coordinate descent) but does not give version numbers for any software libraries, frameworks, or solvers used in the implementation.
Experiment Setup | Yes | "We set k = 100 in our experiments based on cross validation. Via cross validation, we have found k = 1 (1NN) works best for this problem on the current datasets. CGL.Rank with gradient descent took 37.3 minutes and 1490 iterations to reach the convergence rate of 10^-3. To achieve the same objective value, the accelerated gradient descent took 3.08 minutes with 401 MB memory at 103 iterations, and the inexact Newton method took only 43.4 seconds with 587 MB memory. For sparse CGL, it took the accelerated proximal gradient method 2.07 minutes to reach the convergence rate of 10^-3 on MIT with 3.9 GB peak memory consumption."
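The splitting protocol quoted under Dataset Splits (one third held out for testing, 5-fold cross validation on the remaining two thirds, so each fold trains on 80% and validates on 20% of the train/validation pool) can be sketched as plain Python. This is an illustrative reconstruction of the described procedure, not the authors' code; the function name and seed handling are assumptions.

```python
import random

def split_indices(n, n_folds=5, seed=0):
    """Sketch of the paper's split: 1/3 test, then k-fold CV on the rest.

    Returns (test_indices, folds), where each fold is a (train, val) pair
    drawn from the two-thirds train/validation pool.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)

    n_test = n // 3                      # one third held out for testing
    test, pool = idx[:n_test], idx[n_test:]

    fold_size = len(pool) // n_folds
    folds = []
    for k in range(n_folds):
        val = pool[k * fold_size:(k + 1) * fold_size]   # 20% validation
        val_set = set(val)
        train = [i for i in pool if i not in val_set]   # 80% training
        folds.append((train, val))
    return test, folds
```

With n = 30 documents, this yields 10 test indices and five folds of 16 training / 4 validation indices each.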
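Algorithms 1 and 3 in the paper use Nesterov's accelerated gradient descent, whose speedup over plain gradient descent the Experiment Setup row quantifies (3.08 minutes versus 37.3 minutes on the same objective). The generic momentum scheme can be sketched on a toy least-squares problem; the paper's actual CGL.Rank objective, step sizes, and stopping rule are not reproduced here, so function names and the learning rate below are illustrative assumptions.

```python
import numpy as np

def nesterov_agd(grad, x0, lr, iters=500):
    """Generic Nesterov accelerated gradient descent sketch.

    grad: gradient oracle for a smooth objective
    lr:   step size (should be at most 1/L for L-smooth objectives)
    """
    x = np.asarray(x0, dtype=float)
    y = x.copy()     # lookahead (extrapolated) point
    t = 1.0          # momentum scalar, updated each iteration
    for _ in range(iters):
        x_new = y - lr * grad(y)                     # gradient step at y
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

# Toy usage: minimize (1/2)||A x - b||^2, whose gradient is A^T (A x - b).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(2), lr=0.05)
```

The same oracle-based structure accommodates a proximal step after the gradient update, which is the shape of the accelerated PGD used for sparse CGL.Rank (Algorithm 2).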