C3oT: Generating Shorter Chain-of-Thought Without Compromising Effectiveness
Authors: Yu Kang, Xianghui Sun, Liangyu Chen, Wei Zou
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments over four datasets from arithmetic and commonsense scenarios, showing that the proposed method is capable of compressing the length of generated CoT by up to more than 50% without compromising its effectiveness. Additionally, we design extensive experiments and discussions to analyze the contribution of different components in our approach, as well as to explore future research directions of CoT compression based on our method. |
| Researcher Affiliation | Industry | Yu Kang, Xianghui Sun, Liangyu Chen *, Wei Zou — Beike Inc., Beijing, China |
| Pseudocode | No | The paper describes the C3oT framework and its components (Compressor, Conditioned Training, Conditioned Inference) in narrative form, supplemented by a diagram in Figure 1, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | For math reasoning, we use GSM8K (Cobbe et al. 2021) and MathQA (Amini et al. 2019). As for commonsense reasoning, we use ECQA (Aggarwal et al. 2021) and StrategyQA (Geva et al. 2021). |
| Dataset Splits | Yes | We followed the training and testing set division as outlined in the original paper of the dataset used, trained C3oT on the training set, and evaluated its performance on the test set, excluding StrategyQA. Due to the inaccessibility of ground truths for the StrategyQA test set, we proceeded to further split the original StrategyQA training set into training and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and LLaMA-2-Chat models, but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | In this paper, we train C3oT based on LLaMA-2-Chat-7B and -13B (Touvron et al. 2023). We fine-tune the model for 2 epochs on each dataset using the AdamW optimizer with a sequence length of 2,048 tokens and a batch size of 128. The AdamW optimizer's hyperparameters are set as follows: β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁶, and weight decay of 0.001. We employ a cosine learning rate schedule with a maximum learning rate of 1 × 10⁻⁵. |
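The reported optimizer and scheduler settings can be sketched as follows. This is a minimal illustration, not code from the paper: the `cosine_lr` helper and the assumption of no warmup are ours; only the numeric hyperparameters (β1, β2, ϵ, weight decay, max learning rate) come from the quoted setup.

```python
import math

# Hyperparameters as reported in the paper's experiment setup
MAX_LR = 1e-5  # maximum learning rate of the cosine schedule
ADAMW_KWARGS = dict(betas=(0.9, 0.999), eps=1e-6, weight_decay=0.001)

def cosine_lr(step: int, total_steps: int, max_lr: float = MAX_LR) -> float:
    """Cosine-annealed learning rate (assumes no warmup phase):
    decays from max_lr at step 0 down to 0 at the final step."""
    return max_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
```

For example, `cosine_lr(0, 1000)` returns the full 1e-5, the midpoint `cosine_lr(500, 1000)` returns 5e-6, and the schedule reaches 0 at the last step.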