LogiCase: Effective Test Case Generation from Logical Description in Competitive Programming

Authors: Sicheol Sung, Aditi, Dogyu Kim, Yo-Sub Han, Sang-Ki Ko

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the Code Contests dataset demonstrate that CCFG-based test cases outperform baseline methods in identifying incorrect algorithms, achieving significant gains in validity and effectiveness. Our approach provides a scalable and reliable grammar-driven framework for enhancing automated competitive programming evaluations. We evaluate the practical usefulness of CCFGs through experimental validation.
Researcher Affiliation | Academia | Sicheol Sung (1), Aditi (2), Dogyu Kim (3), Yo-Sub Han (1), and Sang-Ki Ko (2); (1) Yonsei University, (2) University of Seoul, (3) Kangwon National University
Pseudocode | No | The paper provides formal grammar definitions (Example 2 and Example 3) but does not include structured pseudocode or algorithm blocks for a procedural method or process.
Open Source Code | Yes | All implementations and associated code and datasets used in these experiments are available in our GitHub repository: https://github.com/Aditi1612/Grammar-based-test-case-generation
Open Datasets | Yes | We use the Code Contests dataset, which consists of various programming problems sourced from different competitive platforms [Li et al., 2022].
Dataset Splits | Yes | After categorizing the grammars, we split them into a training dataset with 1,200 problems and an evaluation dataset with 300 problems.
Hardware Specification | No | The paper describes experiments and model training but does not provide specific details about the hardware used, such as CPU or GPU models, or memory specifications.
Software Dependencies | No | The paper mentions using a fine-tuned CodeT5 model and the Adam optimizer but does not specify versions for any programming languages, libraries, or frameworks used in the implementation.
Experiment Setup | Yes | We use the Adam optimizer with learning rate 10^-5 and a cross-entropy loss function to train each CodeT5 model. We generate candidate grammars and constraints with repetition penalty 2.5 and length penalty 1.0 from each model.
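The Dataset Splits row reports 1,200 training problems and 300 evaluation problems (1,500 categorized grammar problems in total). A minimal sketch of such a split; the helper name, seed, and the list of problem identifiers are hypothetical, since the paper does not describe the splitting procedure itself:

```python
import random

def split_problems(problems, n_train=1200, seed=0):
    """Shuffle and split a list of problems into train/eval subsets.

    Hypothetical helper: the paper states only the resulting split sizes,
    not how the split was performed.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = problems[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Stand-in identifiers for the 1,500 categorized grammar problems.
problems = [f"problem_{i}" for i in range(1500)]
train_set, eval_set = split_problems(problems)
```

A fixed seed is the usual way to make such a split reproducible across runs, which matters for a reproducibility report like this one.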
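The Experiment Setup row quotes several concrete hyperparameters. Collected here as plain config dicts for reference; the grouping into training versus generation settings follows the quoted text, but everything beyond the quoted values (dict names, a particular CodeT5 checkpoint, the training loop) is unspecified in the excerpt:

```python
# Training-time settings quoted from the Experiment Setup row.
TRAIN_CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 1e-5,      # quoted as 10^-5
    "loss": "cross-entropy",
}

# Decoding settings used when generating candidate grammars and constraints.
GENERATION_CONFIG = {
    "repetition_penalty": 2.5,  # discourages repeated tokens in candidates
    "length_penalty": 1.0,      # neutral preference for output length
}
```

These are the standard knobs exposed by common sequence-to-sequence toolkits, so the quoted values could be passed directly to an optimizer constructor and a beam-search decoder, respectively.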