Collaborative likelihood-ratio estimation over graphs

Authors: Alejandro de la Concha, Nicolas Vayatis, Argyris Kalogeratos

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The empirical evaluation of the GRULSIF framework is conducted with the objective of estimating the likelihood-ratio r^α_v for each node of a given fixed graph. In Sec. 6.1, we present synthetic experiments where the true likelihood-ratios are known by design. The evaluation on real problems is challenging since the true likelihood-ratios are generally not known; however, in Sec. 6.2 we design a particular setting for seismic data where those true quantities can be safely assumed. In all experiments, both GRULSIF and POOL follow the numerical implementation guidelines described in Sec. 5, which include the CBCGD optimization technique and the Nyström approximation of the RKHS.
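For context on the quantity being estimated: in the RULSIF line of work that GRULSIF builds on, the α-relative likelihood-ratio is r_α(x) = p(x) / (α p(x) + (1−α) q(x)), which is bounded above by 1/α for α > 0. A minimal sketch of this definition on two toy Gaussian densities (the function names and the Gaussian choice are ours, not the paper's):

```python
import numpy as np

def relative_likelihood_ratio(p_x, q_x, alpha):
    """alpha-relative ratio r_alpha = p / (alpha*p + (1-alpha)*q).

    For alpha > 0 the ratio is bounded above by 1/alpha, which is
    what makes larger alpha values easier to estimate.
    """
    return p_x / (alpha * p_x + (1.0 - alpha) * q_x)

def gauss_pdf(x, mu, sigma):
    # standard Gaussian density, used here only as a toy example
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-4, 4, 401)
p = gauss_pdf(x, 0.0, 1.0)   # stand-in for a pre-change density p_v
q = gauss_pdf(x, 1.0, 1.0)   # stand-in for a post-change density q_v

for alpha in (0.01, 0.1, 0.5):
    r = relative_likelihood_ratio(p, q, alpha)
    print(f"alpha={alpha}: max ratio {r.max():.2f} (bound {1.0 / alpha:.0f})")
```

When p = q the ratio is identically 1 for every α, which is the synthetic baseline used in the paper's null-scenario nodes.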
Researcher Affiliation | Academia | Alejandro de la Concha (EMAIL), Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, 91190 Gif-sur-Yvette, France; Department of Mathematics, University of Luxembourg, 4364 Esch-sur-Alzette, Luxembourg. Nicolas Vayatis (EMAIL), Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, 91190 Gif-sur-Yvette, France. Argyris Kalogeratos (EMAIL), Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, 91190 Gif-sur-Yvette, France.
Pseudocode | Yes | Algorithm 1: Model selection for GRULSIF hyperparameters tuning. Algorithm 2: GRULSIF: Collaborative and distributed LRE over a graph. Algorithm 3: Dictionary creation.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | Yes | The data is made publicly available by the GeoNet project, which operates a geological hazard monitoring system in that territory. 1. Public data available from the GeoNet project, GNS Science (1970): https://www.geonet.org.nz/earthquake/2021p405872 2. https://www.geonet.org.nz/earthquake/2023p741652 3. https://www.geonet.org.nz/earthquake/2024p817566
Dataset Splits | Yes | To compute this measure, we need both the approximated relative likelihood-ratio f_v and the true r^α_v. The former is given by each LRE estimator trained on 80% of the observations, while by design we have (r^α_v(x))_{v∈V} = (1, ..., 1) ∈ R^N for any α and x ∈ X. This choice is consistent with the sampling from p_v and q_v, which is done such that p_v ≡ q_v. The expected value E_{p_α(y)}[(f_v − r^α_v)²(y)] is then calculated by averaging the estimation result on the remaining 20% of the observations that were not used during the training phase. Algorithm 1 (Model selection for GRULSIF hyperparameters tuning): Randomly split X and X' into R disjoint subsets {X_r}_{r=1}^R and {X'_r}_{r=1}^R.
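The splitting protocol described above combines an 80/20 holdout with a random partition into R disjoint subsets for model selection. A minimal sketch of both steps, with a placeholder held-out error against the known target r^α_v(x) = 1 (function names, the seed, and the constant stand-in estimator are our illustrative choices):

```python
import numpy as np

def holdout_split(n, train_frac=0.8, seed=0):
    """Random 80/20 split: indices used to fit the LRE estimator
    and indices held out to evaluate it."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    cut = int(train_frac * n)
    return perm[:cut], perm[cut:]

def disjoint_subsets(indices, R, seed=0):
    """Randomly partition a set of indices into R disjoint subsets,
    as in the model-selection step of Algorithm 1."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(indices), R)

train, test = holdout_split(600)
subsets = disjoint_subsets(train, R=5)

# Held-out squared error against the known target r^alpha_v(x) = 1
# (the synthetic design where p_v = q_v at every node); a perfect
# estimator would score exactly 0 here.
f_hat = np.ones(len(test))  # stand-in for a fitted estimator's outputs
err = np.mean((f_hat - 1.0) ** 2)
```

The partition step guarantees each observation lands in exactly one of the R subsets, which is what "disjoint" buys over independent resampling.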
Hardware Specification | Yes | On a single machine with a 12th Gen Intel(R) Core(TM) i7-12700H processor and 16GB of RAM.
Software Dependencies | No | We apply a series of standard preprocessing steps used in seismology: the instrument response is deconvolved, the linear trend is removed, the observations are demeaned, a 2–20 band-pass filter is applied, and the filtered data are downsampled by a factor of 5. To reduce the temporal dependency, we fit an autoregressive model of order 1 and keep the residuals for further analysis. The output is then standardized so that it has zero mean and unit variance. After completing these steps, we obtain 1200 observations at each location. We then assign the first 600 observations to p_v and the remaining 600 to q_v. To account for spatial similarity, we generate an unweighted spatial graph G_S = (V, E, W) where the nodes represent the seismic stations and the edges are computed so as to form a spatial 3-nearest-neighbors graph, as visualized in Fig. 17.
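The preprocessing chain quoted above can be sketched with SciPy primitives. This is our reconstruction under stated assumptions, not the authors' code: we assume the 2–20 band edges are in Hz, use a 4th-order Butterworth filter, and omit the instrument-response deconvolution (which needs station metadata, e.g. via ObsPy's `remove_response`):

```python
import numpy as np
from scipy.signal import butter, filtfilt, decimate, detrend

def preprocess_trace(x, fs, f_lo=2.0, f_hi=20.0, factor=5):
    """Sketch of the described chain: detrend, demean, band-pass,
    then downsample by `factor` (instrument response step omitted)."""
    x = detrend(x, type="linear")     # remove linear trend
    x = x - x.mean()                  # demean
    b, a = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs)
    x = filtfilt(b, a, x)             # zero-phase 2-20 band-pass (Hz assumed)
    return decimate(x, factor)        # downsample by a factor of 5

def ar1_residuals(x):
    """Fit an AR(1) model by least squares, keep the residuals,
    and standardize them to zero mean and unit variance."""
    phi = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
    res = x[1:] - phi * x[:-1]
    return (res - res.mean()) / res.std()

# e.g. 6000 raw samples at an assumed 100 Hz -> 1200 observations per station
raw = np.random.default_rng(0).normal(size=6000)
obs = ar1_residuals(preprocess_trace(raw, fs=100.0))
```

The 3-nearest-neighbors graph over station coordinates would then be built on top of these per-station series, e.g. with `sklearn.neighbors.kneighbors_graph`.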
Experiment Setup | Yes | To investigate the sensitivity of GRULSIF and POOL to the regularization parameter α, we complement the previous experiments by reporting in Fig. 7-11 results for α ∈ {0.01, 0.1, 0.5}. We can see that tuning α affects convergence, as suggested by Theorem 3. Low α values make the LRE task harder, and hence lead to estimates with higher bias and variance, and slower convergence to the true target quantities (this is more evident in the box-plots). Moreover, graph regularization leads to more robust node-level estimates, i.e. lower variance within sets of connected nodes and faster convergence, especially for nodes where p_v = q_v. Finally, when α is closer to 1, GRULSIF and POOL get closer, since the target relative likelihood-ratios become easier to estimate even without collaboration. The findings show that the collaborative LRE is more robust as α gets closer to 0, i.e. when targeting a less regularized likelihood-ratio (see also Sec. 5.4).
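The reported bias/variance behaviour has a simple empirical analogue: samples of the α-relative ratio spread out more as α shrinks. A toy Monte Carlo illustration with two Gaussian densities (the densities, sample size, and seed are our choices, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, 100_000)   # draws from a toy q density

def pdf(x, mu):
    # unit-variance Gaussian density (toy choice)
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

p, q = pdf(y, 0.0), pdf(y, 1.0)
variances = {}
for alpha in (0.01, 0.1, 0.5):
    r = p / (alpha * p + (1.0 - alpha) * q)   # alpha-relative ratio
    variances[alpha] = float(r.var())
# the spread of the target shrinks monotonically as alpha grows,
# mirroring why larger alpha makes the estimation task easier
```

This is only a statement about the target quantity, not about any particular estimator; the paper's Theorem 3 is what ties α to the convergence of the estimates themselves.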