Learning Bayesian Nash Equilibrium in Auction Games via Approximate Best Response
Authors: Kexin Huang, Ziqian Chen, Xue Wang, Chongming Gao, Jinyang Gao, Bolin Ding, Xiang Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across various auction formats demonstrate that our approach accelerates convergence and enhances learning efficiency in complex auction settings. |
| Researcher Affiliation | Academia | 1University of Science and Technology of China, Hefei, China 2Independent Researcher. Correspondence to: Chongming Gao <EMAIL>, Xiang Wang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Stochastic Approximate BR Gradient Method |
| Open Source Code | Yes | Our code is open-sourced at: https://github.com/Hesse73/Approx-BR-Grad. |
| Open Datasets | No | We benchmark the L2 distance in the symmetric auction games with a uniform prior under first-price and second-price mechanisms... We also explore the asymmetric auction setting, where bidders have different value priors. Following previous work on learning asymmetric auction games (Bichler et al., 2023a), we consider the FP auction with 2 bidders whose value priors are U[0, 0.5] (the weak bidder) and U[0, 1] (the strong bidder). |
| Dataset Splits | No | For all experiments, we sample 256 values for v_i and 10,000 value samples for v_{-i}. |
| Hardware Specification | No | This research was also supported by the advanced computing resources provided by the Supercomputing Center of the USTC. |
| Software Dependencies | Yes | In the original SM paper, the ex-post utility is modified as: u^SM_i(v_i, b_i, b_{-i}) = (v_i − p^SM_i) · exp(b_i/λ) / Σ_{j=1}^{n} exp(b_j/λ), where p^SM_i = Σ_j^n p_j. In our implementation, we use p^SM_i = b_i for FP and p^SM_i = max_{j≠i} b_j for simplicity. Note that this modification doesn't change the smoothness of u^SM_i or its ability to approximate the original utility function. Our results align closely with those reported in the SM paper: the L2 error of our results is around 3e-3, while the SM paper's L2 ranges from 4e-3 to 5e-3 when evaluated in the FP/SP symmetric uniform setting with 2 players. While improving the L2 result, the training time per iteration when n = 2 in our implementation is 0.012 to 0.013 seconds, comparable to the 0.009 to 0.011 seconds reported in the SM paper. |
| Experiment Setup | Yes | We conduct the experiments in a more general setting where the bidding function is a 3-layer MLP instead of the linear model in Assumption 3.1 and evaluate performance across different mechanisms, asymmetric auctions, risk-averse utilities, and alternative gradient estimation approaches. ... we use a shared MLP network for bidders with the same value prior and run each method for 2,000 steps. ... We set the learning rate to 0.05 for all experiments except for the asymmetric SP auction, where we increased the learning rate to 0.2 for better results. ... For all experiments, we sample 256 values for v_i and 10,000 value samples for v_{-i}. ... For the estimation of the pdf in the ex-interim utility's gradient (Eq. (11)), we utilize the kernel density estimation (KDE) approach with a Gaussian kernel... The bandwidth h in KDE is a hyperparameter... The parameter γ in the first-order approximate best response is set to 1. ... 1. the 2nd-order approximation is not applied during the first 200 steps of training; 2. it is only enabled if ∂²u_i/∂b_i² ≤ −1e-8 and |∂u_i/∂b_i| ≤ 0.01 by default. |
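The softmax-smoothed ex-post utility quoted in the Software Dependencies row can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `smoothed_utility`, the test bids, and the temperature value are our own choices; the payment substitution (p_i = b_i for FP, the highest opposing bid for SP) follows the simplification quoted in the report.

```python
import numpy as np

def smoothed_utility(values, bids, lam=0.01, mechanism="FP"):
    """Softmax-smoothed ex-post utility for one auction with n bidders.

    values, bids: arrays of shape (n,).
    lam: smoothing temperature λ; as λ → 0 the softmax allocation
    approaches the hard argmax of the original auction.
    """
    # Softmax allocation: bidder i wins with probability ∝ exp(b_i/λ).
    logits = bids / lam
    probs = np.exp(logits - logits.max())  # shift for numerical stability
    probs /= probs.sum()
    if mechanism == "FP":
        payments = bids  # first-price: each bidder pays their own bid
    else:
        # second-price: pay the highest opposing bid
        n = len(bids)
        payments = np.array([np.max(np.delete(bids, i)) for i in range(n)])
    return (values - payments) * probs

# Two bidders; bidder 0 bids higher, so the softmax mass concentrates there.
u = smoothed_utility(np.array([1.0, 0.5]), np.array([0.5, 0.25]), lam=0.01)
```

With a small λ the smoothed utility is close to the original discontinuous one (here u ≈ [0.5, 0]), while remaining differentiable in the bids, which is what makes gradient-based equilibrium learning possible.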
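The KDE step quoted in the Experiment Setup row (a Gaussian kernel with bandwidth h, used to estimate the pdf appearing in the ex-interim utility's gradient) can be sketched as follows. This is an assumed illustration, not the paper's code: the function name, the uniform opposing-bid samples, and the bandwidth 0.05 are placeholders we chose.

```python
import numpy as np

def gaussian_kde_pdf(x, samples, bandwidth):
    """Estimate a pdf at query points x from 1-D samples via a
    Gaussian-kernel KDE with bandwidth h:
        f̂(x) = (1 / (N h)) Σ_k φ((x − s_k) / h),
    where φ is the standard normal density."""
    z = (x[:, None] - samples[None, :]) / bandwidth  # shape (|x|, N)
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
# Hypothetical opposing-bid samples from U[0, 1]; the true pdf is 1 on [0, 1].
opp_bids = rng.uniform(0.0, 1.0, size=10_000)
pdf = gaussian_kde_pdf(np.array([0.5]), opp_bids, bandwidth=0.05)
```

At an interior point of U[0, 1] the estimate should be close to the true density of 1; in practice the bandwidth h trades bias against variance, which is why the report notes it as a tuned hyperparameter.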