Bandit Learning in Matching Markets with Indifference
Authors: Fang Kong, Jingqi Tang, Mingzhu Li, Pinyan Lu, John C.S. Lui, Shuai Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the algorithm's effectiveness in handling such complex situations and its consistent superiority over baselines. Extensive experiments are conducted to show our algorithm's effectiveness and consistent advantage compared with available baselines. We report the stable regret of each player in Figure 1 (a)(b)(c)(d)(e) and the cumulative market unstability (the cumulative number of unstable matchings) in Figure 1 (f). |
| Researcher Affiliation | Academia | Southern University of Science and Technology; Shanghai Jiao Tong University; Shanghai University of Finance and Economics; Key Laboratory of Interdisciplinary Research of Computation and Economics (SUFE), Ministry of Education; Chinese University of Hong Kong. |
| Pseudocode | Yes | Algorithm 1 adaptive exploration with arm-guided GS (AE-AGS, centralized version, from the view of the central platform) ... Algorithm 2 Subroutine-of-AE-AGS ... Algorithm 3 AE-AGS (centralized version, from the view of player pi) ... Algorithm 4 AE-AGS (decentralized version, from the view of player pi) ... Algorithm 5 Communication |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology, nor does it include a link to a code repository or mention code in supplementary materials. |
| Open Datasets | No | To present the stable regret of each player, we first test the algorithms' performances in a small market with 5 players and 5 arms. The position of each arm in a player's preference ranking is a random number in {1, 2, . . . , K}, similar to how the arms rank the players. Arms sharing the same position in a ranking have the same preference values, and the preference gap between two arms ranked in adjacent positions is set to ∆ = 0.1. The feedback Xi,j(t) for player pi on arm aj at time t is drawn independently from the Gaussian distribution with mean µi,j and variance 1. |
| Dataset Splits | No | The paper describes a simulation setup where data is generated for each run. It specifies parameters for this generation (e.g., 'random number in {1, 2, ..., K}', 'Gaussian distribution') but does not refer to traditional dataset splits like training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only generally states that 'Extensive experiments are conducted'. |
| Software Dependencies | No | The paper describes the algorithms and experimental methodology but does not mention any specific software, libraries, or their version numbers used for implementation or simulation. |
| Experiment Setup | Yes | In each experiment, we run all algorithms for T = 100k rounds and report the averaged results over 20 independent runs. The position of each arm in a player's preference ranking is a random number in {1, 2, . . . , K}, similar to how the arms rank the players. The preference gap between two arms ranked in adjacent positions is set to ∆ = 0.1. The feedback Xi,j(t) for player pi on arm aj at time t is drawn independently from the Gaussian distribution with mean µi,j and variance 1. We also vary the gap ∆ ∈ {0.1, 0.15, 0.2, 0.25} and market size N = K ∈ {3, 6, 9, 12} to show the performances of algorithms. |
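The pseudocode listings quoted above (AE-AGS and its variants) build on Gale–Shapley (GS) deferred acceptance. The paper's arm-guided, indifference-aware version is not reproduced here; the sketch below is only textbook player-proposing GS with strict preferences, with my own hypothetical naming (`gale_shapley`, `player_prefs`, `arm_ranks`), to illustrate the subroutine the algorithms extend:

```python
def gale_shapley(player_prefs, arm_ranks):
    """Textbook player-proposing deferred acceptance (strict preferences).

    player_prefs[i] : list of arm indices in player i's preference order.
    arm_ranks[j][i] : rank arm j assigns to player i (lower = preferred).
    Returns a dict mapping each player to its matched arm.
    """
    n = len(player_prefs)
    next_choice = [0] * n      # index of the next arm each player proposes to
    holder = {}                # arm -> player it currently holds
    free = list(range(n))      # players without a tentative match
    while free:
        i = free.pop()
        j = player_prefs[i][next_choice[i]]
        next_choice[i] += 1
        if j not in holder:
            holder[j] = i                      # arm was free: accept
        elif arm_ranks[j][i] < arm_ranks[j][holder[j]]:
            free.append(holder[j])             # arm trades up, rejects old holder
            holder[j] = i
        else:
            free.append(i)                     # proposal rejected
    return {player: arm for arm, player in holder.items()}
```

For example, if both players rank arm 0 first and arm 0 prefers player 0, then `gale_shapley([[0, 1], [0, 1]], [[0, 1], [0, 1]])` matches player 0 with arm 0 and player 1 with arm 1.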
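The simulated market in the quoted setup can be sketched as follows. The rank-to-mean mapping `mu = 1 - gap * (rank - 1)` is an assumption consistent with equal-valued tied positions and a 0.1 gap between adjacent positions; it is not stated explicitly in the excerpt, and all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 5, 5   # players and arms (the paper's small-market setting)
gap = 0.1     # preference gap between adjacent ranking positions

# Each player's rank for each arm is drawn uniformly from {1, ..., K};
# repeated ranks model indifference (ties). Arms rank players the same way.
player_ranks = rng.integers(1, K + 1, size=(N, K))
arm_ranks = rng.integers(1, N + 1, size=(K, N))

# Assumed mapping: mean reward drops by `gap` per rank position (rank 1 best),
# so arms sharing a position share a preference value.
mu = 1.0 - gap * (player_ranks - 1)

def feedback(i, j):
    """Noisy reward X_{i,j}(t): Gaussian with mean mu[i, j] and variance 1."""
    return rng.normal(mu[i, j], 1.0)

x = feedback(0, 0)
```

Varying `gap` over {0.1, 0.15, 0.2, 0.25} and `N = K` over {3, 6, 9, 12} then reproduces the parameter sweep described in the setup.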