Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-Fisher Sampling

Authors: Walid Bendada, Guillaume Salha-Galvan, Romain Hennequin, Théo Bontempelli, Thomas Bouabça, Tristan Cazenave

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on simulated data, real-world public data, and the successful large-scale deployment of vMF-exp on the recommender system of a global music streaming service empirically validate the key properties of the proposed method.
Researcher Affiliation | Collaboration | (1) Deezer Research, Paris, France. (2) LAMSADE, Université Paris Dauphine, PSL, Paris, France. (3) SPEIT, Shanghai Jiao Tong University, Shanghai, China.
Pseudocode | Yes | Algorithm 1: Sample VO. 1. Sample vector U uniformly from S^(d-1); 2. Compute projection of U on V: W = <U, V> V; 3. Subtract projection and normalize: VO = (U - W) / ‖U - W‖; 4. return VO
Open Source Code | Yes | We publicly release a Python implementation of vMF-exp on GitHub to enable reproducibility of our experiments and to encourage future use of the method: https://github.com/deezer/vMF-exploration.
Open Datasets | Yes | Therefore, in Appendix H, we empirically validate the main properties of vMF-exp using a large-scale, publicly available dataset of one million GloVe word embedding vectors (Pennington et al., 2014). The GloVe-25 dataset is available for download at: https://nlp.stanford.edu/projects/glove/.
Dataset Splits | No | The paper describes using simulated data and real-world datasets (GloVe-25, Deezer's music catalog) for empirical validation and Monte Carlo simulations. It specifies parameters for these simulations and experiments (e.g., number of actions, inner product values, kappa), but does not detail traditional machine learning dataset splits (e.g., train/test/validation percentages or specific sample counts) for model training or evaluation in the conventional sense.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cluster configurations used for running the experiments or simulations. It mentions an 'industrial deployment' which implies hardware usage but lacks specifics.
Software Dependencies | No | The paper mentions using a 'Python implementation' and specific libraries like 'Python vMF sampler from Pinzón & Jung (2023)' and the 'Faiss library (Johnson et al., 2019)' but does not provide version numbers for Python itself or these libraries.
Experiment Setup | Yes | Figure 2 reports, for κ = 1.0, <V,A> = 0.5 and growing values of d, the P_vMF-exp(a) sampling probability depending on the number of actions n, as well as P_B-exp(a) with similar parameters and our approximations P_0(a) and P_1(a). In this section, we compare the behaviors of B-exp and vMF-exp on the GloVe-25 dataset of 1 million GloVe word embedding vectors with dimension d = 25... for varying action numbers n and inner products <V,A>...Sampling is repeated 30 million times and averaged to obtain precise estimates. To generate playlists, Deezer leverages a collaborative filtering model... This model learns unit norm song embedding representations of dimension d = 128... tuning κ (see Equation (4) of Sra (2012))... We first retrieve the m = 500 nearest neighbors of the initial song in the embedding space...
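
The four steps of Algorithm 1 quoted in the Pseudocode row can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' released implementation: the function names are hypothetical, V is assumed to be a unit-norm list of floats with d >= 2, and drawing U by normalizing a standard Gaussian vector is an assumed way of sampling uniformly from the sphere (the quoted pseudocode does not say how U is drawn).

```python
import math
import random


def sample_unit_vector(d, rng):
    """Draw a vector uniformly from the unit sphere S^(d-1).

    Assumption: uniformity is obtained by normalizing a standard Gaussian vector.
    """
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            return [x / norm for x in g]


def sample_orthogonal_unit_vector(v, rng=None):
    """Sketch of Algorithm 1: return a unit vector VO orthogonal to the unit vector v."""
    rng = rng or random.Random()
    d = len(v)
    if d < 2:
        raise ValueError("need d >= 2 for a non-trivial orthogonal direction")
    while True:
        # Step 1: sample U uniformly from S^(d-1).
        u = sample_unit_vector(d, rng)
        # Step 2: projection of U on V, W = <U, V> V.
        dot = sum(ui * vi for ui, vi in zip(u, v))
        w = [dot * vi for vi in v]
        # Step 3: subtract the projection and normalize.
        r = [ui - wi for ui, wi in zip(u, w)]
        norm = math.sqrt(sum(x * x for x in r))
        # Resample in the (measure-zero) case where U is numerically parallel to V.
        if norm > 1e-12:
            # Step 4: return VO.
            return [x / norm for x in r]
```

Calling `sample_orthogonal_unit_vector(v, random.Random(0))` with a unit-norm `v` of dimension 25 (the GloVe-25 setting) returns a list of 25 floats with unit norm and an inner product with `v` that is zero up to floating-point error. Passing a seeded `random.Random` keeps the draw reproducible.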