Graphmax for Text Generation
Authors: Bin Liu, Guosheng Yin
JAIR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that the new GTV-based regularization can improve performances in various natural language processing (NLP) tasks in comparison with existing methods. |
| Researcher Affiliation | Academia | Bin Liu, The Center of Statistical Research, School of Statistics, Southwestern University of Finance and Economics, Chengdu, China; Guosheng Yin (corresponding author), Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China |
| Pseudocode | Yes | Algorithm 1: Projected Gradient Descent for graphmax. Input: learning rate α, maximum number of iteration steps T. While t < T: (1) a^{t+1} = x^t − α∇f(x^t); (2) sort a^{t+1} = (a^{t+1}_1, …, a^{t+1}_N) in descending order as (b^{t+1}_1, …, b^{t+1}_N); (3) find k = max{1 ≤ j ≤ N : b^{t+1}_j + (1 − Σ_{i=1}^{j} b^{t+1}_i)/j > 0}; (4) calculate γ = (1 − Σ_{i=1}^{k} b^{t+1}_i)/k; (5) calculate x^{t+1}_i = max{a^{t+1}_i + γ, 0}; (6) if ‖x^{t+1} − x^t‖₂ < 10⁻⁴ then break. Output: a near-optimal minimizer x* of the graphmax. |
| Open Source Code | No | The paper does not explicitly state that the authors are releasing their own code for the methodology described. It mentions third-party models like GPT-2, BART, LLAMA2, and references a HuggingFace link for OPT, but this is not a release of their own implementation. |
| Open Datasets | Yes | Datasets and Baseline Methods: We choose the Amazon (Zhang, Zhao, & LeCun, 2015) and Yelp (McAuley & Leskovec, 2013; Zhang et al., 2015) datasets for general text generation, and the WMT 16 (Bojar et al., 2016) and WMT 21 corpora (Tran et al., 2021) for machine translation. |
| Dataset Splits | Yes | The reported results are averaged over 100 independent runs. Table 2: The BLEU scores of GPT-2, CTRL, PPLM, OPT, LLAMA2, and the proposed methods on generating product reviews (± standard deviation), averaged over five cross-validation folds. |
| Hardware Specification | Yes | All models are implemented using PyTorch 1.7 on an Intel(R) Xeon(R) CPU E5-2680 v4 2.40GHz, Tesla K80 GPU, and 128 GB memory, based on the Ubuntu 16.04 platform. |
| Software Dependencies | Yes | All models are implemented using PyTorch 1.7 on an Intel(R) Xeon(R) CPU E5-2680 v4 2.40GHz, Tesla K80 GPU, and 128 GB memory, based on the Ubuntu 16.04 platform. |
| Experiment Setup | Yes | The maximum number of iteration steps for the graphmax algorithm is set as T = 20, the learning rate α is 10⁻⁴, and the convergence threshold is 10⁻⁴. Specifically, the iteration process stops when the ℓ₂ norm of the difference between consecutive iterates, ‖x^{t+1} − x^t‖₂, is less than 10⁻⁴. |
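The projected gradient descent in Algorithm 1 alternates a gradient step with a sort-and-threshold projection onto the probability simplex. A minimal NumPy sketch of that loop is below; the function names (`project_simplex`, `graphmax_pgd`) and the quadratic test objective are illustrative assumptions, not the authors' released code, but the projection formula and the stated hyperparameters (T = 20, α = 10⁻⁴, threshold 10⁻⁴) follow the table above.

```python
import numpy as np

def project_simplex(a):
    """Euclidean projection of a onto the probability simplex:
    the sort / find-k / shift-by-gamma / clip steps of Algorithm 1."""
    n = a.shape[0]
    b = np.sort(a)[::-1]                     # sort in descending order
    cssv = np.cumsum(b)                      # partial sums of sorted entries
    # largest k with b_k + (1 - sum_{i<=k} b_i)/k > 0
    k = np.nonzero(b + (1.0 - cssv) / np.arange(1, n + 1) > 0)[0][-1] + 1
    gamma = (1.0 - cssv[k - 1]) / k
    return np.maximum(a + gamma, 0.0)        # clip negatives to zero

def graphmax_pgd(grad_f, x0, lr=1e-4, max_iter=20, tol=1e-4):
    """Projected gradient descent with the paper's stated defaults."""
    x = project_simplex(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        a = x - lr * grad_f(x)               # gradient step
        x_new = project_simplex(a)           # project back onto the simplex
        if np.linalg.norm(x_new - x) < tol:  # stopping rule from Algorithm 1
            return x_new
        x = x_new
    return x
```

With a simple quadratic objective f(x) = ½‖x − c‖² (so ∇f(x) = x − c) and a larger step size, the iterates converge to c whenever c lies on the simplex, which is a quick sanity check of the projection.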