Convex Regression with Interpretable Sharp Partitions
Authors: Ashley Petersen, Noah Simon, Daniela Witten
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set. |
| Researcher Affiliation | Academia | Department of Biostatistics University of Washington Seattle, WA 98195 |
| Pseudocode | Yes | Algorithm 1 Alternating Directions Method of Multipliers for Equation (4) ... Algorithm 2 Block Coordinate Descent for CRISP with p > 2 (Equation (13)) |
| Open Source Code | No | The paper mentions 'Our Python implementation of CRISP' and 'FLAM (implemented with the R package flam (Petersen, 2014)); CART (implemented with the R package rpart (Therneau et al., 2014)); TPS (implemented with the R package fields (Nychka et al., 2014))', but it does not provide an explicit link or statement about the open-sourcing of the code for the CRISP methodology itself. The R packages mentioned are for third-party or related methods, not the direct implementation of CRISP. |
| Open Datasets | Yes | The data set was originally considered in Pace and Barry (1997) and is publicly available from the Carnegie Mellon Stat Lib data repository (lib.stat.cmu.edu). |
| Dataset Splits | Yes | We consider five different training set sizes: 100, 500, 1000, 5000, and 11,198 (which corresponds to 60% of the observations). We use the observations not selected for the training set as the test set. |
| Hardware Specification | Yes | On a Macbook Pro with a 2.0 GHz Intel Sandy Bridge Core i7 processor, our Python implementation of CRISP with n = q = 50 takes 20.1 seconds for a sequence of 20 λ values. |
| Software Dependencies | Yes | FLAM (implemented with the R package flam (Petersen, 2014)); CART (implemented with the R package rpart (Therneau et al., 2014)); TPS (implemented with the R package fields (Nychka et al., 2014))... R package version 1.0 for flam, R package version 4.1-8 for rpart, R package version 7.1 for fields. |
| Experiment Setup | Yes | We generate data with either n = 100 or n = 10,000, and p = 2. We independently sample each element of x1 and x2 from a Unif[−2.5, 2.5] distribution, and then take y = f(x1, x2) + ϵ, where ϵ ∼ MVN(0, σ²Iₙ) with σ = 1 for n = 100 and σ = 10 for n = 10,000. ... For each scenario, we generate 200 data sets and estimate M using CRISP (with q = 100) and several competitors. |
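
The simulation setup quoted in the Experiment Setup row can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the mean function `placeholder_f` is a hypothetical stand-in (the excerpt does not specify f), and the names `simulate` and `seed` are assumptions introduced here. Because the covariance is σ²Iₙ, the multivariate normal noise is equivalent to n independent N(0, σ²) draws.

```python
import numpy as np


def simulate(n, sigma, f, seed=0):
    """Draw one data set: x1, x2 ~ Unif[-2.5, 2.5], y = f(x1, x2) + eps."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(-2.5, 2.5, size=n)
    x2 = rng.uniform(-2.5, 2.5, size=n)
    # MVN(0, sigma^2 * I_n) noise is the same as n independent N(0, sigma^2) draws.
    eps = rng.normal(0.0, sigma, size=n)
    y = f(x1, x2) + eps
    return x1, x2, y


def placeholder_f(x1, x2):
    # Hypothetical mean function, used only so the sketch runs;
    # the paper's actual scenarios are not given in this excerpt.
    return np.where(x1 * x2 > 0, 2.0, -2.0)


# The two settings quoted above: (n = 100, sigma = 1) and (n = 10,000, sigma = 10).
x1, x2, y = simulate(100, 1.0, placeholder_f)
x1_big, x2_big, y_big = simulate(10_000, 10.0, placeholder_f)
```

Repeating the call with 200 different seeds per scenario would mirror the "200 data sets" described in the row; fitting CRISP and the competitors is outside the scope of this sketch.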