PaLD: Detection of Text Partially Written by Large Language Models

Authors: Eric Lei, Hsiang Hsu, Chun-Fu Chen

ICLR 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "Experimentally, we demonstrate the effectiveness of PaLD compared to baseline methods that build on existing LLM text detectors. In Section 4, we empirically illustrate that PaLD-PE and PaLD-TI outperform existing detection methods on two language datasets: Writing Prompts (Fan et al., 2018) and Yelp Reviews (Yelp, 2014)."
Researcher Affiliation | Collaboration | "Eric Lei1,2, Hsiang Hsu2, Chun-Fu (Richard) Chen2; 1University of Pennsylvania, 2JPMorgan Chase Global Technology Applied Research; EMAIL, EMAIL"
Pseudocode | Yes | Algorithm 1 (greedy algorithm, PaLD-TI):
    Initialize S = {arg max_{e in {1,...,n}} f_x({e})}
    Initialize A = {1,...,n} \ S
    while f_x(S) increases do
        e* = arg max_{e in A} f_x(S ∪ {e}) − f_x(S)
        S ← S ∪ {e*}
        A ← A \ {e*}
    end while
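The greedy loop above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `score_fn` is a stand-in for the paper's set function f_x over segment indices, and the toy objective in the usage example is invented for demonstration.

```python
def greedy_pald_ti(n, score_fn):
    """Greedily grow a set S of segment indices while score_fn(S) improves.

    n        -- number of text segments
    score_fn -- callable mapping a set of indices to a scalar score
                (a stand-in for the paper's f_x)
    """
    # Seed S with the single best element, as in "Initialize S = {arg max ...}".
    best = max(range(n), key=lambda e: score_fn({e}))
    S = {best}
    A = set(range(n)) - S
    current = score_fn(S)
    while A:
        # Candidate with the largest marginal gain over the current score.
        e = max(A, key=lambda c: score_fn(S | {c}))
        gain = score_fn(S | {e}) - current
        if gain <= 0:
            break  # f_x(S) no longer increases; stop as in the while condition
        S |= {e}
        A -= {e}
        current += gain
    return S

# Toy usage: reward overlap with a hidden "LLM-written" set, penalize size.
target = {1, 3, 4}
sel = greedy_pald_ti(6, lambda S: len(S & target) - 0.1 * len(S))
```

With this toy objective the marginal gain of every element outside `target` is negative, so the loop stops once `target` is recovered.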
Open Source Code | Yes | "Code to reproduce our experiments can be accessed at https://github.com/jpmorganchase/pald."
Open Datasets | Yes | "We evaluate our methods on the Writing Prompts (WP) (Fan et al., 2018) and Yelp Reviews (Yelp) (Yelp, 2014) datasets, which are typically used to benchmark LLM text detection."
Dataset Splits | Yes | "In total, for each dataset, we generate 3,600 and 300 mixed texts for the training and test splits, respectively. For the training split, the LLM target fractions range from 0.1 to 0.9 in steps of 0.1; for the test split, they are set to 0.25, 0.5, and 0.75. The amount of data at each fraction is similar, yielding a balanced dataset."
Hardware Specification | Yes | "On a single A10 GPU, the exact solver takes 30s for a 10-segment text on average, whereas greedy takes 2.1s."
Software Dependencies | No | The paper mentions software such as RoBERTa, GPT-4o, and Claude-3.5-Sonnet, but does not provide specific version numbers for these or any other libraries or frameworks used in the implementation.
Experiment Setup | Yes | "We use Logit Norm with temperature τ = 0.005 and train the RoBERTa model on the training split of the respective dataset. For the posterior, we choose the prior P(δ) to be the Beta(2, 2) distribution. During inference, we draw 5,000 samples, discarding the first 1,000 as burn-in, using Metropolis-Hastings (Gelman et al., 2004) with a truncated-normal proposal centered at the previous sample and truncated to [0, 1]."
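The posterior-sampling step can be sketched as below, assuming δ is the LLM fraction in [0, 1]. This is a hedged illustration, not the paper's code: `log_likelihood` is a stand-in for the detector-based likelihood, and the proposal width `sigma` is an assumed value. Because a normal proposal truncated to [0, 1] is asymmetric, the Hastings correction term is included.

```python
import math
import random

def beta22_log_prior(d):
    # Beta(2, 2) density is proportional to d * (1 - d) on (0, 1).
    return math.log(d) + math.log(1.0 - d)

def trunc_normal_sample(mu, sigma):
    # Rejection-sample N(mu, sigma^2) truncated to (0, 1).
    while True:
        x = random.gauss(mu, sigma)
        if 0.0 < x < 1.0:
            return x

def trunc_normal_logpdf(x, mu, sigma):
    # log N(x; mu, sigma) minus the log normalizing mass on [0, 1].
    z = (x - mu) / sigma
    log_phi = -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    cdf = lambda t: 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    return log_phi - math.log(cdf(1.0) - cdf(0.0))

def mh_sample(log_likelihood, n_samples=5000, burn_in=1000, sigma=0.1, seed=0):
    """Metropolis-Hastings over delta with Beta(2, 2) prior and a
    truncated-normal proposal centered at the previous sample."""
    random.seed(seed)
    d = 0.5  # assumed starting point
    chain = []
    for _ in range(n_samples):
        prop = trunc_normal_sample(d, sigma)
        # Log acceptance ratio, with the Hastings correction for the
        # asymmetric truncated proposal.
        log_alpha = (log_likelihood(prop) + beta22_log_prior(prop)
                     - log_likelihood(d) - beta22_log_prior(d)
                     + trunc_normal_logpdf(d, prop, sigma)
                     - trunc_normal_logpdf(prop, d, sigma))
        if log_alpha >= 0 or random.random() < math.exp(log_alpha):
            d = prop
        chain.append(d)
    return chain[burn_in:]  # discard the first `burn_in` samples

# Toy usage: a likelihood sharply peaked near delta = 0.75.
samples = mh_sample(lambda d: -200.0 * (d - 0.75) ** 2)
```

Drawing 5,000 samples and discarding the first 1,000 matches the setup described above; the resulting chain concentrates near the likelihood peak, slightly pulled toward 0.5 by the Beta(2, 2) prior.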