Diffusion-Based Active Learning for Distributed Client Manifolds
Authors: Kwang In Kim
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments with five datasets, our approach demonstrates substantial advancements when compared to adaptations of existing active learning algorithms. We evaluated the effectiveness of our algorithm on five benchmark datasets: CIFAR-10 and CIFAR-100 (Krizhevsky 2009), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), EMNIST letters (Cohen et al. 2017), and CINIC-10 (Darlow et al. 2018). Results. Figure 3 and Tab. 1 show the results. |
| Researcher Affiliation | Academia | Kwang In Kim, POSTECH |
| Pseudocode | Yes | Algorithm 1: Distributed AL. In each round r, a client batch C_r of size V is constructed, and for each client C_z in C_r, a local data batch B_z^r of size W is selected. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is available, nor does it provide any links to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | We evaluated the effectiveness of our algorithm on five benchmark datasets: CIFAR-10 and CIFAR-100 (Krizhevsky 2009), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), EMNIST letters (Cohen et al. 2017), and CINIC-10 (Darlow et al. 2018). |
| Dataset Splits | Yes | For all datasets except CINIC-10, initially, 1,000 labels were prepared by selecting five clients and labeling 200 local data points per client. Afterward, 1,000 labels were added in each AL round by selecting five clients and 200 local points therein (i.e., V = 5, W = 200). For CINIC-10, we used V = 10 and W = 100. The number of total AL rounds R was set at 11, yielding a total of 12,000 labeled points in the final round. We considered two types of non-independent and identically distributed (non-IID) data allocation: In the first setting (Dirichlet), local data in each client was allocated based on the probability distributions of output classes generated by sampling from a Dirichlet distribution with α ∈ [0.1, 0.2] (Wang et al. 2020; Hsu, Qi, and Brown 2020). In the second setting (Shard), each local dataset was sampled from only 20% of total classes (McMahan et al. 2017). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using ResNet101 as the learner network but does not specify any software libraries, frameworks, or their version numbers that would be required to reproduce the experiments. |
| Experiment Setup | Yes | The learning rate η (Eq. 3) was initially set to 0.001, and it decayed by 0.1 at every 10-th epoch. For all datasets except CINIC-10, initially, 1,000 labels were prepared by selecting five clients and labeling 200 local data points per client. Afterward, 1,000 labels were added in each AL round by selecting five clients and 200 local points therein (i.e., V = 5, W = 200). For CINIC-10, we used V = 10 and W = 100. The number of total AL rounds R was set at 11, yielding a total of 12,000 labeled points in the final round. The hyperparameters of our algorithm include the number of diffusion steps H and its step size δ, the number of nearest neighbors K (Eq. 13), and the scale parameters σ²_w (Eq. 9) and σ²_γ (Eq. 10) of the Laplacian. σ²_w was decided as the mean of the squared distances q following the standard practice in graph Laplacian applications (Hein and Maier 2007). σ²_γ was similarly determined as the mean squared distance of δg, but it was scaled by a factor 1/ρ. ρ balances the contributions of w and γ, and it was set to 0.2. ... K was fixed at a small value of 7. ... δ was fixed at a small value of 0.1 ... while H was determined at 5. The parameters for local data batch selection were determined similarly. |
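The round structure quoted from Algorithm 1 (a client batch of size V per round, then a local data batch of size W per selected client) can be sketched as a skeleton. This is a minimal illustration, not the paper's implementation: the `select_clients` and `select_batch` callables are hypothetical placeholders standing in for the paper's diffusion-based acquisition criteria, and model retraining is omitted.

```python
def run_active_learning(clients, V, W, R, select_clients, select_batch):
    """Skeleton of the round structure in Algorithm 1: each round r picks
    a client batch of size V, then a local data batch of size W for each
    selected client. Selection strategies are passed in as callables."""
    labeled = {c: set() for c in clients}  # labeled-index pool per client
    for r in range(R):
        chosen = select_clients(clients, labeled, V)    # client batch C_r
        for c in chosen:
            batch = select_batch(c, labeled[c], W)      # local batch B_z^r
            labeled[c].update(batch)
        # (retraining the learner on the enlarged labeled pool would go here)
    return labeled
```

With V = 5, W = 200, and R = 11 as in the paper's setup, this loop accumulates 11,000 labels on top of any initial pool.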
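The Dirichlet non-IID allocation described in the splits row can be sketched as follows. This is a generic sketch under stated assumptions (NumPy, integer class labels, and the common per-class splitting mechanics); the function name and details are illustrative, not taken from the paper.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, rng=None):
    """Allocate dataset indices to clients so that each client's class
    proportions are drawn from a Dirichlet(alpha) distribution, yielding
    non-IID local datasets (smaller alpha = more skewed clients)."""
    rng = np.random.default_rng(rng)
    num_classes = int(labels.max()) + 1
    # shuffled indices of each class
    class_idx = [rng.permutation(np.where(labels == c)[0])
                 for c in range(num_classes)]
    # per-client class proportions: shape (num_clients, num_classes)
    proportions = rng.dirichlet([alpha] * num_classes, size=num_clients)
    client_idx = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        # split this class's samples among clients according to column c
        p = proportions[:, c] / proportions[:, c].sum()
        splits = (np.cumsum(p) * len(class_idx[c])).astype(int)[:-1]
        for k, part in enumerate(np.split(class_idx[c], splits)):
            client_idx[k].extend(part.tolist())
    return client_idx
```

The paper's Shard setting would instead restrict each client to indices drawn from a fixed 20% subset of the classes.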
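The bandwidth heuristic cited in the setup row (setting the Laplacian scale to the mean squared pairwise distance, per Hein and Maier 2007) can be sketched generically. Note this is one affinity matrix with a single scale; the paper's Eqs. 9-10 use two separate scales, σ²_w and σ²_γ, which are not reproduced here.

```python
import numpy as np

def gaussian_affinity(X):
    """Gaussian (heat-kernel) affinity matrix whose bandwidth sigma^2 is
    set to the mean squared pairwise distance, a standard heuristic in
    graph Laplacian applications."""
    # pairwise squared Euclidean distances, shape (n, n)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # mean over off-diagonal pairs = mean squared distance
    sigma2 = sq[np.triu_indices_from(sq, k=1)].mean()
    return np.exp(-sq / sigma2)
```

In practice such an affinity matrix would be sparsified to the K nearest neighbors (K = 7 in the paper) before forming the Laplacian.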