FlexiClip: Locality-Preserving Free-Form Character Animation
Authors: Anant Khandelwal
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate the effectiveness of FlexiClip in generating animations that are not only smooth and natural but also structurally consistent across diverse clipart types, including humans and animals. By integrating spatial and temporal modeling with pretrained video diffusion models, FlexiClip sets a new standard for high-quality clipart animation, offering robust performance across a wide range of visual content. |
| Researcher Affiliation | Industry | 1Search Technology Center India, Microsoft, IDC, Bengaluru, India (Bharat). Correspondence to: Anant Khandelwal <EMAIL, EMAIL>. |
| Pseudocode | No | The paper describes methods and mathematical formulations but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | No | Project Page: https://creativegen.github.io/flexiclip.github.io/ - While a project page is provided, it is not explicitly a source code repository link, nor does the text contain a clear statement about releasing source code for the methodology. |
| Open Datasets | Yes | We used 30 clipart images from AniClipart (Wu et al., 2024) and additional ones from Freepik across various categories (humans, animals, and objects), resized to 256×256 pixels. |
| Dataset Splits | No | The paper mentions using 30 clipart images from AniClipart and additional ones from Freepik, but it does not specify any training, validation, or test splits for these images. |
| Hardware Specification | Yes | Standard 24-frame animations were rendered on an NVIDIA V100 in 40 minutes using 26 GB of GPU memory. |
| Software Dependencies | No | The paper mentions several tools and models such as DiffVG, ModelScope T2V, and the Adam optimizer, but does not provide specific version numbers for any software dependencies used in their implementation. |
| Experiment Setup | Yes | Motion trajectories with 8-11 control points were optimized for up to 700 steps using Adam (learning rate: 0.5). We applied ModelScope T2V (Wang et al., 2023) with a guidance parameter of 50 for the SDS loss. For spatial posing of cubic Bézier control points, we use a 4-layer MLP with LeakyReLU activation, with the final layer being linear. Temporal Jacobians from the pfODE were predicted with a 3-layer MLP, while two attention networks with 32-dimensional keys/values and two heads modeled motion and deformation effectively. |
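To make the reported spatial-posing setup concrete, the sketch below shows a 4-layer MLP with LeakyReLU activations and a linear final layer, mapping 2-D Bézier control points to pose offsets. This is a minimal NumPy illustration only: the hidden widths, input/output dimensionality, and weight initialization are assumptions not stated in the paper, which also reports optimizing with Adam (learning rate 0.5) for up to 700 steps.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Standard LeakyReLU nonlinearity.
    return np.where(x > 0, x, alpha * x)

def init_mlp(sizes, rng):
    # sizes like [2, 64, 64, 64, 2] gives 4 weight layers.
    # He-style initialization; widths are hypothetical, not from the paper.
    return [(rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in),
             np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    # LeakyReLU on all layers except the last, which stays linear,
    # matching the paper's description of the spatial-posing MLP.
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = leaky_relu(x)
    return x

rng = np.random.default_rng(0)
params = init_mlp([2, 64, 64, 64, 2], rng)   # 4 layers total
points = rng.standard_normal((10, 2))        # 8-11 control points; 10 here
offsets = mlp_forward(params, points)
print(offsets.shape)                         # (10, 2)
```

In a full pipeline these predicted offsets would be applied to the cubic Bézier control points and the whole network optimized with Adam against the SDS loss; that training loop is omitted here.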