Fully Heteroscedastic Count Regression with Deep Double Poisson Networks
Authors: Spencer Young, Porter Jenkins, Longchao Da, Jeff Dotson, Hua Wei
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on diverse datasets demonstrate that DDPN outperforms current baselines in accuracy, calibration, and out-of-distribution detection, establishing a new state-of-the-art in deep count regression. |
| Researcher Affiliation | Collaboration | ¹Delicious AI; ²Brigham Young University, Department of Computer Science; ³Arizona State University, School of Computing and Augmented Intelligence; ⁴Ohio State University, Department of Marketing. |
| Pseudocode | No | The paper describes the DDPN model and its objective function in detail (e.g., Equation 2), as well as a modified loss function, but does not present these in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Source code is available online: https://github.com/delicious-ai/ddpn |
| Open Datasets | Yes | Length of Stay (Microsoft, 2016) is a tabular dataset... Licensing information for Length of Stay can be found at https://github.com/microsoft/r-server-hospital-length-of-stay/blob/master/Website/package.json. The full dataset can be downloaded from https://raw.githubusercontent.com/microsoft/r-server-hospital-length-of-stay/refs/heads/master/Data/LengthOfStay.csv. ... COCO-People (Lin et al., 2014) is an adaptation of the MS-COCO dataset... The COCO dataset from which we form the COCO-People subset is distributed via the CC BY 4.0 license. It can be accessed at https://cocodataset.org/#home. ... Amazon Reviews is publicly available at https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/. The Patio, Lawn, and Garden subset we employ in this work is hosted at https://datarepo.eng.ucsd.edu/mcauley_group/data/amazon_v2/categoryFilesSmall/Patio_Lawn_and_Garden.csv. |
| Dataset Splits | Yes | For each dataset, we generate train/val/test splits with a fixed random seed. ... The Length of Stay dataset ... we divide into an 80/10/10 train/val/test split using a fixed random seed. ... COCO-People ... we divide into a 70/10/20 train/val/test split. ... Amazon Reviews ... we divide into a 70/10/20 train/val/test split. |
| Hardware Specification | Yes | We train each network for 200 epochs on the CPU of a 2021 MacBook Pro... Each model benchmarked for Length of Stay is a small MLP... on a 2021 MacBook Pro CPU... We train with an effective batch size of 256 in a distributed fashion, using an on-prem machine with 2 NVIDIA GeForce RTX 4090 GPUs. ... Training is performed on an internal cluster of 4 NVIDIA GeForce RTX 2080 Ti GPUs. ... All networks are trained for 10 epochs across 2 on-prem NVIDIA GeForce RTX 4090 GPUs. |
| Software Dependencies | No | All experiments are implemented in PyTorch (Paszke et al., 2017). ... We employ the AdamW optimizer (Loshchilov & Hutter, 2017) for all training procedures... The paper mentions specific software frameworks and optimizers but does not provide version numbers for them. |
| Experiment Setup | Yes | We train each network for 200 epochs on the CPU of a 2021 MacBook Pro with a batch size of 32 and an initial learning rate of 10⁻³. We set weight decay to 10⁻⁵. ... Each model benchmarked for Length of Stay is a small MLP with layer widths [128, 128, 128, 64]. Models are trained for 15 epochs... with a batch size of 128, an initial learning rate of 10⁻⁴, and a weight decay value of 10⁻⁴. ... We set the initial learning rate to 10⁻⁴ and use a weight decay of 10⁻³. We train with an effective batch size of 256... For the Inventory task, we use a modified version of CountNet3D... We train models for 50 epochs with an effective batch size of 16, an initial learning rate of 10⁻³, and weight decay of 10⁻⁵. ... All networks are trained for 10 epochs... with an effective batch size of 2048. We use an initial learning rate of 10⁻⁴ and a weight decay of 10⁻⁵. |
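The pseudocode row notes that the paper states the DDPN objective (its Equation 2) without an algorithm block. For reference, a minimal PyTorch sketch of a negative log-likelihood for Efron's (1986) double Poisson approximation, the distribution family DDPN parameterizes, might look as follows. The function name and the ε-clamping are our own choices, and this is a sketch rather than the authors' released implementation:

```python
import torch

def double_poisson_nll(y, mu, phi, eps=1e-8):
    """NLL under Efron's (1986) approximate double Poisson density DP(mu, phi).

    y   : observed counts (float tensor of non-negative integers)
    mu  : predicted mean (> 0)
    phi : predicted inverse-dispersion (> 0); phi = 1 recovers the Poisson
    """
    mu = mu.clamp_min(eps)
    phi = phi.clamp_min(eps)
    log_p = (
        0.5 * torch.log(phi)
        - phi * mu
        - y
        + torch.special.xlogy(y, y)   # y * log(y), defined as 0 when y == 0
        - torch.lgamma(y + 1.0)       # log(y!)
        + phi * (y + y * torch.log(mu) - torch.special.xlogy(y, y))
    )
    return -log_p
```

A quick sanity check: at φ = 1 the expression collapses exactly to the Poisson NLL, −μ + y·log μ − log y!, so the sketch can be verified against `torch.distributions.Poisson`.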
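The splits row reports fixed-seed train/val/test partitions (80/10/10 for Length of Stay, 70/10/20 for the others). A generic sketch of such a seeded split, assuming a NumPy-style shuffle (the helper below is illustrative and not taken from the released code):

```python
import numpy as np

def split_indices(n, fracs=(0.8, 0.1, 0.1), seed=42):
    """Shuffle indices 0..n-1 with a fixed seed, then cut into train/val/test."""
    assert abs(sum(fracs) - 1.0) < 1e-9, "fractions must sum to 1"
    rng = np.random.default_rng(seed)       # fixed seed -> reproducible split
    idx = rng.permutation(n)
    n_train = int(fracs[0] * n)
    n_val = int(fracs[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Because the generator is seeded, repeated calls with the same arguments return identical partitions, which is what makes the reported splits reproducible.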