Expected Pinball Loss For Quantile Regression And Inverse CDF Estimation
Authors: Taman Narayan, Serena Lutong Wang, Kevin Robert Canini, Maya Gupta
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in Section 5 on simulations and real-world data show that the proposed non-crossing DLNs provide competitive, trustworthy estimates. ... Section 5.3 Model Architecture Experiments: We start by demonstrating the efficacy of using monotonic DLNs to predict the inverse CDF on simulations, and then on the real data. ... Table 3: Simulations: Quantile MSE and percent crossing violations for τ ∈ {0.01, 0.02, . . . , 0.99}. ... Table 5: Real data experiments: Pinball loss on the test set, averaged over τ ∈ {0.01, 0.02, . . . , 0.99}. |
| Researcher Affiliation | Collaboration | Taman Narayan (Google Research); Serena Wang (Google Research; University of California, Berkeley); Kevin Canini (Google Research); Maya R. Gupta (University of Washington) |
| Pseudocode | No | The paper describes mathematical formulations and discusses algorithms like Dykstra's projection algorithm, but it does not contain a dedicated section or figure presenting pseudocode or an algorithm block for the proposed methodology. |
| Open Source Code | Yes | Code is available at github.com/google-research/google-research/tree/master/quantile_regression. |
| Open Datasets | Yes | Air Quality: The Beijing Multi-Site Air-Quality dataset from UCI (Zhang et al., 2017) ... Puzzles: ... The anonymized dataset is publicly available at www.mayagupta.org/data/PuzzleClub_HoldTimes.csv. ... Wine: We used the Wine Reviews dataset from Kaggle (Bahri, 2018). |
| Dataset Splits | Yes | Air Quality: ...earlier examples forming a training set of size 252,481, later examples a validation set of size 84,145, and most recent examples a test set of size 84,145. ... Puzzles: The 984 train and 247 validation examples are IID from past data, while the 211 test samples are the most recent samples ... Traffic: We used 1,000 examples each for training, validation, and testing ... Wine: The data was split IID with 84,641 examples for training, 12,091 for validation, and 24,184 for testing. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments, such as GPU models, CPU models, or cloud computing instance types. It mentions software frameworks like TensorFlow but not the underlying hardware. |
| Software Dependencies | Yes | We used Keras models in TensorFlow 2.2 for the unrestricted DNN comparisons... For DLNs, we used the TensorFlow Lattice library... For all DNN and DLN experiments, we use the Adam optimizer (Kingma & Ba, 2015) with its default learning rate of 0.001. |
| Experiment Setup | Yes | All hyperparameters were optimized on validation sets. ... For DNN models, we validated the number of hidden layers and the hidden dimension. For the SQF-DNN, we also validated the number of distribution keypoints. For the smaller DLN models, we used the common two-layer calibrated lattice architecture ... and validated over its number of calibration keypoints and lattice vertices. ... For both DLNs and DNNs, we additionally validated over the number of training epochs. ... The number of calibration keypoints for the piecewise-linear calibration function over τ was tuned over {10, 20, 50, 100}. The number of lattice keypoints for τ was tuned over {2, 3, 5, 7, 10}. Other feature calibration keypoints were tuned over {5, 10, 15, 20}. Step sizes were tuned over {0.001, 0.005, 0.01, 0.05, 0.1}; minibatch sizes were tuned over {1000, 10000}. The number of steps was tuned over {100, 1000, 10000}. |
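For reference, the pinball loss that the paper's title and Table 5 center on is the standard quantile-regression loss. The sketch below is a minimal NumPy implementation of that standard loss, averaged over the same quantile grid τ ∈ {0.01, 0.02, . . . , 0.99} used in the paper's tables; it is not the authors' code (their implementation is in the linked `quantile_regression` repository), and the function names here are our own.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Standard pinball (quantile) loss for a quantile level tau in (0, 1).

    Penalizes under-prediction by tau and over-prediction by (1 - tau),
    so minimizing it yields the tau-th conditional quantile.
    """
    diff = y_true - y_pred
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def mean_pinball_over_grid(y_true, y_pred_per_tau, taus):
    """Average pinball loss over a grid of quantile levels.

    y_pred_per_tau: dict mapping each tau to its predictions, mirroring
    how Table 5 averages test-set pinball loss over tau in {0.01, ..., 0.99}.
    """
    return float(np.mean([pinball_loss(y_true, y_pred_per_tau[t], t)
                          for t in taus]))

# The quantile grid used throughout the paper's experiments:
taus = np.round(np.arange(0.01, 1.00, 0.01), 2)
```

Note the asymmetry: at τ = 0.5 the loss is half the absolute error, while at τ = 0.9 an under-prediction of 1 costs 0.9 but an over-prediction of 1 costs only 0.1.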