AniSora

Revolutionary Anime Video Generation Powered by AI

Create stunning animated videos with one-click generation across diverse anime styles using Bilibili's cutting-edge open-source model.

Latest: AniSora V2.0

Interactive AniSora Demo

Experience AniSora in action with this interactive demo, powered by Wan2.1-Fast.

Overview of AniSora

Index-AniSora is an open-source animated video generation project developed by Bilibili and tailored specifically to anime styles. It lets users create videos from inputs such as images across diverse anime formats.

The AniSora model is backed by published AI research: the paper "AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era" describes the work and appears in IJCAI'25.

AniSora is available in multiple versions, each with its own features and hardware support, so users can choose between cost-effective setups and high-performance deployments.

Key AniSora Highlights

  • Open-source model by Bilibili
  • One-click anime video generation
  • Multiple versions for different hardware
  • Research published in IJCAI'25
  • Supports diverse anime styles

Features and Capabilities of AniSora

Wide Range of Anime Styles

The AniSora model supports numerous anime formats including series episodes, Chinese original animations, manga adaptations, VTuber content, and promotional videos.

Hardware Flexibility

Different versions of AniSora are optimized for various hardware, from cost-effective options like RTX 4090 GPUs to high-performance Huawei Ascend 910B NPUs.

Reinforcement Learning

AniSora includes a reinforcement learning (RL) framework for evaluating and fine-tuning the model so that its output aligns with anime aesthetics and user preferences.

Open Source Accessibility

As an open-source project, AniSora provides full access to training and inference code, enabling researchers and developers to extend and improve upon the model.

Robust Data Pipeline

AniSora is trained on over 10 million high-quality data samples, prepared with a specialized data cleaning pipeline for anime video generation.

Benchmark System

The project includes an anime-optimized benchmark system with 948 animation video clips for evaluation, featuring specialized scoring algorithms aligned with ACG aesthetics.

Technical Specifications and AniSora Versions

AniSoraV1.0

Based on CogVideoX-5B

Coverage: 80% of use cases
  • Cost-effective on RTX 4090 GPUs
  • Localized region guidance
  • Temporal guidance
  • Full training/inference code

AniSoraV2.0

Based on Wan2.1-14B

Coverage: 90% of use cases
  • Distillation-accelerated inference
  • Native NPU support
  • Enhanced stability
  • Huawei Ascend 910B support

AniSoraV1.0_RL

Based on CogVideoX-5B with RLHF

Innovation: First of its kind
  • First RLHF framework for anime video
  • Human preference alignment
  • Improved anime aesthetics
  • Enhanced user experience

AniSora Hardware Performance

Different versions of AniSora are optimized for specific hardware configurations, providing a balance between performance, cost, and accessibility. This makes the technology available to a wide range of users, from hobbyists to professional studios.
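The version-to-hardware mapping described on this page can be written down as a small lookup table. The sketch below is only an illustration: the class `AniSoraVersion` and the helper `versions_for` are hypothetical names, not part of the AniSora codebase, and the values are transcribed from the version list on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AniSoraVersion:
    """Hypothetical record type; not part of the AniSora repository."""
    name: str
    base_model: str
    hardware: tuple    # hardware this page names for the version
    highlights: tuple  # feature bullets copied from this page


# Values transcribed from the version list on this page.
# No hardware is listed for AniSoraV1.0_RL, so its tuple is left empty.
VERSIONS = (
    AniSoraVersion(
        name="AniSoraV1.0",
        base_model="CogVideoX-5B",
        hardware=("RTX 4090",),
        highlights=("Localized region guidance", "Temporal guidance"),
    ),
    AniSoraVersion(
        name="AniSoraV2.0",
        base_model="Wan2.1-14B",
        hardware=("Huawei Ascend 910B",),
        highlights=("Distillation-accelerated inference", "Native NPU support"),
    ),
    AniSoraVersion(
        name="AniSoraV1.0_RL",
        base_model="CogVideoX-5B",
        hardware=(),
        highlights=("RLHF framework for anime video",),
    ),
)


def versions_for(hardware: str) -> list:
    """Return names of versions whose listed hardware contains `hardware`."""
    needle = hardware.lower()
    return [
        v.name
        for v in VERSIONS
        if any(needle in h.lower() for h in v.hardware)
    ]


print(versions_for("ascend"))  # ['AniSoraV2.0']
```

A lookup like this only reflects what the page states; actual hardware requirements should be checked against each version's own documentation.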

90% Use Cases Covered
10M+ Training Samples

AniSora Ecosystem and Supporting Tools

Comprehensive Development Tools

Beyond the core model, AniSora includes an ecosystem of tools designed to facilitate development and deployment, ensuring a robust foundation for anime video generation.

Data Pipeline

Located in the data_pipeline directory on GitHub, the data pipeline supports rapid expansion of training data and includes an animation data cleaning pipeline, which is crucial for maintaining high-quality inputs.

Benchmark System

Found in the reward directory, this anime-optimized system includes 948 animation video clips for evaluation, featuring scoring algorithms and reward models for RL.
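To make the idea of benchmark scoring concrete, here is a minimal sketch of how per-clip scores could be averaged into per-dimension results. It is not the scoring code from the reward directory: the function `aggregate_scores` and the dimension names `motion` and `style` are assumptions chosen for illustration.

```python
from statistics import mean


def aggregate_scores(clip_scores):
    """Average each scoring dimension over all clips that report it.

    clip_scores: list of dicts mapping a dimension name to a numeric score.
    The dimension names are hypothetical; the real benchmark defines its own.
    """
    dimensions = sorted({dim for scores in clip_scores for dim in scores})
    return {
        dim: mean(scores[dim] for scores in clip_scores if dim in scores)
        for dim in dimensions
    }


# Two illustrative clips; a full run would cover all 948 benchmark clips.
example = [
    {"motion": 0.5, "style": 1.0},
    {"motion": 1.0},
]
print(aggregate_scores(example))
```

Averaging per dimension, rather than collapsing everything into one number, keeps it visible which aspect of a model's output is weak.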

RLHF Framework

AniSora provides the first Reinforcement Learning from Human Feedback (RLHF) framework designed specifically for anime video generation. It aligns model outputs with human preferences and anime aesthetics.
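For readers unfamiliar with preference alignment, the sketch below shows the pairwise (Bradley-Terry style) loss that is commonly used to train reward models from human comparisons. This page does not state which objective AniSora uses, so treat this as a generic illustration; `preference_loss` is a hypothetical helper, not an AniSora API.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    video above the rejected one, and grows as the ordering is violated.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# A correctly ordered pair yields a lower loss than a mis-ordered one.
print(preference_loss(2.0, 0.0) < preference_loss(0.0, 2.0))  # True
```

When both rewards are equal the loss is log 2, which is the value a reward model with no preference between the two videos would incur.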

Training Data and Quality Assurance

AniSora is trained on over 10 million high-quality data samples, which supports robustness and diversity in the generated outputs.

Data Cleaning Pipeline

Specialized process to ensure training data quality

Standard Test Dataset

Aligned with ACG aesthetics for consistent evaluation

Human Preference Integration

Collaborative evaluation process with anime experts

Challenges and Considerations

Technical Challenges

While AniSora represents significant advances in anime video generation, several challenges remain in the field, particularly in maintaining consistency and quality across diverse styles.

Character Consistency

Maintaining consistent character appearance and identity throughout generated videos remains challenging, especially with complex character designs and expressions.

Motion Coherence

Creating smooth, natural motion that follows anime-specific animation principles requires significant training data and specialized approaches.

Hardware Requirements

High-quality anime video generation still requires substantial computational resources, though AniSora aims to make this more accessible.

Ethical Considerations

As with any AI generative technology, AniSora raises important ethical considerations that users and developers should be mindful of.

Training Data Usage

The model is trained on over 10 million high-quality data samples, which may include copyrighted works. Users should be aware of potential legal implications when generating content.

Content Guidelines

Users should adhere to responsible AI principles when generating content, avoiding harmful, misleading, or inappropriate material that could violate community standards.

Creator Attribution

When using AniSora for content creation, proper attribution to the model and acknowledgment of AI-generated content is recommended.

Future Directions and Roadmap

AniSora Todo List

NVIDIA GPU Training

In Progress

Adding support for NVIDIA GPU training for AniSoraV1.0 to increase accessibility.

14B Model Release

Coming Soon

Releasing the 14B version of AniSoraV2.0 before the end of May 2025.

Training Dataset

Planning

Opening applications for access to the high-quality training dataset for research purposes.

Benchmark Updates

Ongoing

Updating the benchmark with performance comparisons against the latest state-of-the-art models.

The AniSora Vision

The AniSora project represents just the beginning of what's possible in anime video generation. The team at Bilibili is committed to advancing this technology further, with ambitious goals for future development.

By continuing to refine the models, expanding dataset quality, and integrating cutting-edge AI research, AniSora aims to revolutionize content creation in the anime industry, empowering creators worldwide.

2025 Current Release
2026+ Future Vision

Future Research Directions

Enhanced Character Control

Developing more precise controls for character expressions, movements, and interactions in generated videos.

Lower Resource Requirements

Working towards more efficient models that can run on consumer-grade hardware without compromising quality.

Longer Video Generation

Extending capabilities to create longer, narrative-driven anime sequences with consistent style and characters.

Community-Driven Development

Expanding open-source collaboration to leverage the collective expertise of anime creators and AI researchers worldwide.