Iterative Vectors: In-Context Gradient Steering without Backpropagation

Authors: Yiting Liu, Zhi-Hong Deng

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate IVs across various tasks using four popular models and observe significant improvements. Our findings suggest that in-context activation steering is a promising direction, opening new avenues for future research.
Researcher Affiliation | Academia | State Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University. Correspondence to: Zhi-Hong Deng <EMAIL>.
Pseudocode | Yes | The pseudocode for the extraction and evaluation process is available in Appendix B. To facilitate understanding, Appendix C includes an example of the processes described. Algorithm 1: Extraction of Iterative Vectors; Algorithm 2: Evaluation; Algorithm 3: Episodic Functions.
Open Source Code | Yes | Our code is available on GitHub.
Open Datasets | Yes | Details of all the datasets used in this paper can be found in Appendix E, while additional results with the other two metrics are provided in Appendix F. A full list of all datasets utilized in this research, along with their corresponding access labels, is detailed in Table 5. The datasets are obtained from Hugging Face (Lhoest et al., 2021).
Dataset Splits | Yes | For a given split of an n-way k-shot classification task T = {T_train, T_val, T_test}, which comprises textual query-answer pairs (x, y), an ICL episode is sampled as: [...] We evaluate over 200 episodes for both extraction (T_train) and hyperparameter search (T_val).
Hardware Specification | Yes | All experiments can be performed on a single Nvidia RTX A6000 GPU unless stated otherwise. Conducted on 3 Nvidia RTX A6000 GPUs.
Software Dependencies | No | The paper mentions various language models (GPT-J-6B, Llama 2, Llama 3.1) and the Hugging Face platform for datasets but does not provide specific version numbers for software libraries or dependencies used in its implementation.
Experiment Setup | Yes | For the hyperparameters of IVs, we use a fixed iterative batch size of b = 10 and explore the extraction strength and inference strength α1, α2 ∈ {0.1, 0.3, 0.5, 0.7, 0.9} across all tasks. Regarding the extraction shot k, we test k ∈ {1, 2, 3, 4} for both TVs and IVs. All experiments were conducted using a predetermined random seed (42) to mitigate selection bias. To ensure a robust representation of result distributions, the tests are averaged over a substantial number of episodes, namely 10,000. We reuse hyperparameters obtained from prior searches in the main experiment (k = 4, b = 10 fixed, α1 = 0.3, α2 = 0.5).
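The paper's core mechanism, as summarized above, is in-context activation steering: a vector is accumulated from batches of ICL activations with extraction strength α1 and then added to hidden states at inference with strength α2. A minimal sketch of that two-phase shape follows; the running-sum update rule and both function names are illustrative assumptions, not the paper's exact Algorithm 1.

```python
import numpy as np

def extract_iterative_vector(activation_batches, alpha1=0.3):
    """Accumulate a steering vector from per-batch mean activation deltas.

    `activation_batches` is assumed to hold (b, d) arrays, one per
    iterative batch of size b = 10, each a delta between ICL and
    zero-shot hidden states.  The simple scaled running sum here is a
    stand-in for the paper's iterative update.
    """
    v = np.zeros_like(activation_batches[0][0])
    for batch in activation_batches:
        v = v + alpha1 * batch.mean(axis=0)  # blend in this batch's mean delta
    return v

def apply_steering(hidden, v, alpha2=0.5):
    """Add the extracted vector to a hidden state at inference time."""
    return hidden + alpha2 * v
```

With two batches of all-ones deltas of shape (10, 4) and α1 = 0.3, the extracted vector is 0.6 in every dimension, and steering a zero hidden state with α2 = 0.5 shifts it by 0.3. No backpropagation is involved at either stage, which matches the paper's title claim.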
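The hyperparameter search reported in the Experiment Setup row is an exhaustive grid over k ∈ {1, 2, 3, 4} and α1, α2 ∈ {0.1, 0.3, 0.5, 0.7, 0.9}, scored on validation episodes. That selection loop can be sketched as follows; `val_accuracy` is a hypothetical stand-in (in the paper each configuration would be scored over 200 T_val episodes), chosen here only so the toy search lands on the reported optimum (k = 4, α1 = 0.3, α2 = 0.5).

```python
import itertools

def val_accuracy(k, a1, a2):
    # Hypothetical validation objective: peaks at the configuration the
    # report lists as the selected one.  A real run would evaluate the
    # model on 200 validation episodes per (k, a1, a2) setting.
    return 1.0 - abs(a1 - 0.3) - abs(a2 - 0.5) - 0.01 * (4 - k)

grid = itertools.product(
    [1, 2, 3, 4],                  # extraction shots k
    [0.1, 0.3, 0.5, 0.7, 0.9],     # extraction strength alpha1
    [0.1, 0.3, 0.5, 0.7, 0.9],     # inference strength alpha2
)
best_k, best_a1, best_a2 = max(grid, key=lambda cfg: val_accuracy(*cfg))
```

The grid has 4 × 5 × 5 = 100 configurations, small enough that exhaustive search over 200 validation episodes per point is tractable on the single-GPU setup the report describes.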