Deep Active Learning in the Open World
Authors: Tian Xie, Jifan Zhang, Haoyue Bai, Robert D Nowak
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluations on three long-tailed image classification benchmarks demonstrate that ALOE outperforms traditional active learning baselines, effectively expanding known categories while balancing annotation cost. Our findings reveal a crucial tradeoff between enhancing known-class performance and discovering new classes, setting the stage for future advancements in open-world machine learning. |
| Researcher Affiliation | Academia | Tian Xie EMAIL Jifan Zhang EMAIL Haoyue Bai EMAIL Robert Nowak EMAIL University of Wisconsin-Madison |
| Pseudocode | Yes | Algorithm 1 ALOE: Active Learning in Open-world Environments |
| Open Source Code | No | Code of the work will be uploaded at https://github.com/EfficientTraining/LabelBench. |
| Open Datasets | Yes | Specifically, we utilize three image classification benchmark datasets, CIFAR100-LT (Alex, 2009), ImageNet-LT (Deng et al., 2009) and Places365-LT (Zhou et al., 2017). |
| Dataset Splits | Yes | CIFAR100 contains 60,000 images across 100 classes, with 500 training images and 100 testing images per class. To create a long-tailed version of CIFAR100, we use an exponential distribution where the number of examples per class is given by N_i = N_0 · α^(i/n), where n is the total number of classes, N_0 is the number of examples in the most frequent class, and α is the imbalance factor. In our experiments, we set α = 0.01, creating a highly imbalanced version of the dataset. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA TITAN RTX for CIFAR100-LT and Places365-LT, and NVIDIA A100 for ImageNet-LT. |
| Software Dependencies | Yes | Our method is implemented with PyTorch 2.2.0. |
| Experiment Setup | Yes | For evaluation, we follow the latest LabelBench framework (Zhang et al., 2024a), while introducing the new open-world setting with a dynamic number of classes at each iteration. Specifically, we fine-tune the pretrained CLIP ViT-B/32 image encoder (Radford et al., 2021) with a linear classification head attached. For every iteration of the active learning algorithm, the model is reinitialized to the pretraining checkpoint and fine-tuned end-to-end on all labeled examples thus far. The clustering method, k-means, is then applied to these embedded features. The number of clusters, 2 · max(B, \|K_t\|), is set so that we obtain a surplus of clusters to effectively filter out the in-distribution examples, where B is the batch size and \|K_t\| is the number of annotated classes in step t. Empirically throughout our experiments, we find the multiplier value 2 to be a good and robust choice. To identify OOD examples, we establish a threshold at the 95%-TPR cutoff based on the in-distribution labeled examples. This threshold is commonly used in the OOD detection literature and ensures at least 95% of the labeled examples are classified as ID. The clusters are then ranked by their OOD cluster ratio. The top B clusters are selected for further processing, and from each of these selected clusters, the example with the highest OOD score is chosen to form the final batch X(t) of examples for annotation. |
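The exponential class-count schedule quoted in the Dataset Splits row (N_i = N_0 · α^(i/n)) can be sketched as follows. This is our own minimal reading of the formula, not the authors' code; the helper name `long_tail_counts` is ours, and the values (n = 100 classes, N_0 = 500, α = 0.01) mirror the CIFAR100-LT setup described above.

```python
def long_tail_counts(n_classes: int, n0: int, alpha: float) -> list[int]:
    """Per-class example counts under the exponential long-tail schedule
    N_i = N_0 * alpha^(i / n), truncated to integers."""
    return [int(n0 * alpha ** (i / n_classes)) for i in range(n_classes)]

counts = long_tail_counts(n_classes=100, n0=500, alpha=0.01)
print(counts[0], counts[-1])  # head class: 500 examples, tail class: 5
```

Note that the tail-to-head ratio, 5/500 = 0.01, matches the imbalance factor α, which is exactly the role α plays in the formula.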
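The batch-selection step in the Experiment Setup row (95%-TPR threshold on labeled ID examples, clusters ranked by OOD ratio, top-B clusters, highest-OOD-score example per cluster) can be sketched as below. This is a hedged reconstruction from the quoted description only, not the authors' implementation: `select_batch` and its arguments are our names, and we assume the OOD scores, k-means cluster labels (with 2·max(B, |K_t|) clusters), and labeled-ID scores are precomputed.

```python
import numpy as np

def select_batch(ood_scores, cluster_labels, id_scores, batch_size):
    """Pick one example from each of the top-`batch_size` clusters,
    where clusters are ranked by their fraction of OOD examples."""
    # 95%-TPR cutoff: at least 95% of labeled ID examples score below it.
    threshold = np.quantile(id_scores, 0.95)
    is_ood = ood_scores > threshold
    # Rank clusters by OOD ratio, highest first.
    clusters = np.unique(cluster_labels)
    ratios = sorted(
        ((is_ood[cluster_labels == c].mean(), c) for c in clusters),
        reverse=True,
    )
    # From each selected cluster, take the example with the highest OOD score.
    picked = []
    for _, c in ratios[:batch_size]:
        members = np.flatnonzero(cluster_labels == c)
        picked.append(int(members[np.argmax(ood_scores[members])]))
    return picked
```

A small usage example: with cluster labels `[0, 0, 0, 1, 1, 2]`, OOD scores `[0.1, 0.2, 0.9, 0.95, 0.9, 0.1]`, labeled-ID scores `[0.0, 0.1, 0.2, 0.3]`, and B = 2, cluster 1 (all OOD) and cluster 0 (one OOD member) are selected, yielding indices 3 and 2.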