Dual-Agent Reinforcement Learning for Automated Feature Generation
Authors: Wanfu Gao, Zengyao Man, Hanlin Pan, Kunpeng Liu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on multiple datasets demonstrate that the proposed method is effective. We conduct experiments on 21 datasets from UCI [Public, 2024b], Kaggle [Howard, 2024], OpenML [Public, 2024a], and LibSVM [Lin, 2024], comprising 12 classification tasks and 9 regression tasks. |
| Researcher Affiliation | Academia | (1) College of Computer Science and Technology, Jilin University, China; (2) Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China; (3) Department of Computer Science, Portland State University, Portland, OR 97201, USA |
| Pseudocode | Yes | Pseudocode of DARL, experimental settings, a comparison of different downstream tasks, and a convergence analysis are presented in the Appendix. |
| Open Source Code | Yes | The code is available at https://github.com/extess0/DARL. |
| Open Datasets | Yes | We conduct experiments on 21 datasets from UCI [Public, 2024b], Kaggle [Howard, 2024], OpenML [Public, 2024a], and LibSVM [Lin, 2024], comprising 12 classification tasks and 9 regression tasks. |
| Dataset Splits | Yes | We adopt random forest as the downstream machine learning model and perform 5-fold stratified cross-validation in all experiments, instead of a simple 70%-30% split. |
| Hardware Specification | Yes | All experiments are conducted on the Ubuntu operating system, an Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz, and a V100 GPU, with Python 3.10.12 and PyTorch 1.13.1. |
| Software Dependencies | Yes | All experiments are conducted on the Ubuntu operating system, an Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz, and a V100 GPU, with Python 3.10.12 and PyTorch 1.13.1. |
| Experiment Setup | Yes | The number of epochs is limited to 200. By using 6 exploration steps per epoch, we further control the number of features generated. We adopt random forest as the downstream machine learning model and perform 5-fold stratified cross-validation in all experiments, instead of a simple 70%-30% split. We use the Adam [Kingma and Ba, 2015] optimizer with a learning rate of 0.0001 to optimize the DQN, set the memory limit of experience replay to 24, and the DQN batch size to 8. The model incorporates 8 attention heads, with a word embedding vector dimension of 8 and a model hidden layer dimension of 128. The discrimination agent's reward weights α, β, γ, and δ are set to 0.1, 0.1, 1, and 0.01. |
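The evaluation protocol in the Dataset Splits row (random forest as the downstream model, scored with 5-fold stratified cross-validation) can be sketched as follows. The model hyperparameters, scoring metric, and dataset are assumptions for illustration; the paper fixes only the model family and the CV scheme.

```python
# Sketch of the downstream evaluation: random forest + 5-fold
# stratified CV. load_wine is a stand-in for one of the 21 datasets;
# n_estimators, random seeds, and the F1 metric are assumptions.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_wine(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # stratified folds
scores = cross_val_score(model, X, y, cv=cv, scoring="f1_weighted")
print(f"5-fold weighted F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Stratified folds preserve the class distribution in each split, which matters on the smaller classification datasets; for the 9 regression tasks a plain `KFold` would be used instead.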
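The DQN training configuration quoted in the Experiment Setup row (Adam at learning rate 0.0001, an experience-replay memory limited to 24, batch size 8) could be wired up roughly as below. The Q-network architecture, transition format, and loss target are assumptions; only the optimizer, buffer size, and batch size come from the paper.

```python
# Sketch of the quoted DQN settings: Adam lr=1e-4, replay memory
# capped at 24, mini-batches of 8. The tiny MLP (input 8, hidden 128)
# and the dummy regression targets are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)
replay = deque(maxlen=24)  # memory limit of experience replay
BATCH_SIZE = 8

# Fill the buffer with dummy (state, target-Q) pairs, then do one update.
for _ in range(24):
    replay.append((torch.randn(8), torch.randn(4)))
batch = random.sample(replay, BATCH_SIZE)
states = torch.stack([s for s, _ in batch])
targets = torch.stack([t for _, t in batch])
loss = nn.functional.mse_loss(q_net(states), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss after one update: {loss.item():.4f}")
```

A 24-transition buffer is unusually small for DQN; here it simply means the agent learns from only the most recent couple of dozen feature-generation steps.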