Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Authors: Spencer Young, Porter Jenkins, Longchao Da, Jeff Dotson, Hua Wei

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on diverse datasets demonstrate that DDPN outperforms current baselines in accuracy, calibration, and out-of-distribution detection, establishing a new state-of-the-art in deep count regression.
Researcher Affiliation | Collaboration | ¹Delicious AI; ²Brigham Young University, Department of Computer Science; ³Arizona State University, School of Computing and Augmented Intelligence; ⁴Ohio State University, Department of Marketing.
Pseudocode | No | The paper describes the DDPN model and its objective function in detail (e.g., Equation 2), as well as a modified loss function, but does not present these in a structured pseudocode or algorithm block.
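Since the paper itself gives no pseudocode for its objective, a minimal reference sketch may help. This assumes the standard form of Efron's (1986) unnormalized Double Poisson density, with a predicted mean `mu` and dispersion `phi`; the function name and parameterization here are our own, not quoted from the paper:

```python
import math

def double_poisson_nll(y: int, mu: float, phi: float) -> float:
    """Negative log-likelihood of Efron's (1986) unnormalized Double Poisson
    density, the usual objective in Double Poisson count regression:

        log f(y | mu, phi) = 0.5*log(phi) - phi*mu - y + y*log(y) - log(y!)
                             + phi*y*(1 + log(mu) - log(y))

    with y*log(y) terms taken as 0 when y == 0.
    """
    log_y_fact = math.lgamma(y + 1)                 # log(y!)
    y_log_y = y * math.log(y) if y > 0 else 0.0     # y*log(y), 0 at y == 0
    log_f = (0.5 * math.log(phi) - phi * mu - y + y_log_y - log_y_fact
             + phi * (y * (1.0 + math.log(mu)) - y_log_y))
    return -log_f
```

A useful sanity check on this parameterization: at `phi = 1` the expression collapses exactly to the ordinary Poisson negative log-likelihood.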
Open Source Code | Yes | Source code is available online². ²https://github.com/delicious-ai/ddpn
Open Datasets | Yes | Length of Stay (Microsoft, 2016) is a tabular dataset... Licensing information for Length of Stay can be found at https://github.com/microsoft/r-server-hospital-length-of-stay/blob/master/Website/package.json. The full dataset can be downloaded from https://raw.githubusercontent.com/microsoft/r-server-hospital-length-of-stay/refs/heads/master/Data/LengthOfStay.csv. ... COCO-People (Lin et al., 2014) is an adaptation of the MS-COCO dataset... The COCO dataset from which we form the COCO-People subset is distributed via the CC BY 4.0 license. It can be accessed at https://cocodataset.org/#home. ... Amazon Reviews is publicly available at https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/. The Patio, Lawn, and Garden subset we employ in this work is hosted at https://datarepo.eng.ucsd.edu/mcauley_group/data/amazon_v2/categoryFilesSmall/Patio_Lawn_and_Garden.csv.
Dataset Splits | Yes | For each dataset, we generate train/val/test splits with a fixed random seed. ... The Length of Stay dataset ... we divide into an 80/10/10 train/val/test split using a fixed random seed. ... COCO-People ... we divide into a 70/10/20 train/val/test split. ... Amazon Reviews ... we divide into a 70/10/20 train/val/test split.
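The quoted text describes seeded fractional splits but gives no split code. A seeded 80/10/10 split of the kind described can be sketched with the standard library alone; the function name and signature are our own, not the authors':

```python
import random

def train_val_test_split(indices, fracs=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices with a fixed seed and cut them into train/val/test.

    A fixed seed makes the split reproducible across runs, which is what
    the reported setup relies on.
    """
    assert abs(sum(fracs) - 1.0) < 1e-9, "fractions must sum to 1"
    idx = list(indices)
    random.Random(seed).shuffle(idx)          # dedicated RNG: no global state
    n = len(idx)
    n_train = int(fracs[0] * n)
    n_val = int(fracs[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Passing `fracs=(0.7, 0.1, 0.2)` gives the 70/10/20 split used for COCO-People and Amazon Reviews.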
Hardware Specification | Yes | We train each network for 200 epochs on the CPU of a 2021 MacBook Pro... Each model benchmarked for Length of Stay is a small MLP... on a 2021 MacBook Pro CPU... We train with an effective batch size of 256 in a distributed fashion, using an on-prem machine with 2 NVIDIA GeForce RTX 4090 GPUs. ... Training is performed on an internal cluster of 4 NVIDIA GeForce RTX 2080 Ti GPUs. ... All networks are trained for 10 epochs across 2 on-prem NVIDIA GeForce RTX 4090 GPUs.
Software Dependencies | No | All experiments are implemented in PyTorch (Paszke et al., 2017). ... We employ the AdamW optimizer (Loshchilov & Hutter, 2017) for all training procedures... The paper mentions specific software frameworks and optimizers but does not provide version numbers for them.
Experiment Setup | Yes | We train each network for 200 epochs on the CPU of a 2021 MacBook Pro with a batch size of 32 and an initial learning rate of 10⁻³. We set weight decay to 10⁻⁵. ... Each model benchmarked for Length of Stay is a small MLP with layer widths [128, 128, 128, 64]. Models are trained for 15 epochs... with a batch size of 128, an initial learning rate of 10⁻⁴, and a weight decay value of 10⁻⁴. ... We set the initial learning rate to 10⁻⁴ and use a weight decay of 10⁻³. We train with an effective batch size of 256... For the Inventory task, we use a modified version of CountNet3D... We train models for 50 epochs with an effective batch size of 16, an initial learning rate of 10⁻³, and weight decay of 10⁻⁵. ... All networks are trained for 10 epochs... with an effective batch size of 2048. We use an initial learning rate of 10⁻⁴ and a weight decay of 10⁻⁵.
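For side-by-side comparison, the hyperparameters quoted above can be collected into one structure. The `task` labels are our own shorthand (two setups are not tied to a named dataset in the excerpt, so they are labeled generically), and only values stated in the quote appear:

```python
# Per-task hyperparameters as quoted in the paper's experiment setup.
# "unnamed" entries correspond to quotes whose dataset is not identified
# in the excerpt.
CONFIGS = [
    dict(task="unnamed (CPU)",       epochs=200, batch_size=32,   lr=1e-3, weight_decay=1e-5),
    dict(task="Length of Stay",      epochs=15,  batch_size=128,  lr=1e-4, weight_decay=1e-4),
    dict(task="unnamed (batch 256)",             batch_size=256,  lr=1e-4, weight_decay=1e-3),
    dict(task="Inventory",           epochs=50,  batch_size=16,   lr=1e-3, weight_decay=1e-5),
    dict(task="unnamed (10 epochs)", epochs=10,  batch_size=2048, lr=1e-4, weight_decay=1e-5),
]
```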