[Feature]Add video-to-video (V2V) pipeline #829

Gary-ChenJL · 2025-09-30T20:50:37Z

Adds support for Wan2.1 control model

gemini-code-assist · 2025-09-30T20:51:08Z

Summary of Changes

Hello @Gary-ChenJL, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends the fastvideo library by introducing a comprehensive video-to-video (V2V) generation pipeline. It specifically adds support for the Wan2.1 control model, allowing users to generate new videos conditioned on an existing input video. The changes encompass new configuration classes for the specialized CLIP encoder, a dedicated stage for encoding video inputs into latent space, and robust handling of video file loading and preprocessing, alongside an illustrative example script.

Highlights

- Video-to-Video Pipeline Introduction: A new WanVideoToVideoPipeline is added, enabling video-to-video generation capabilities within the fastvideo library.
- Wan2.1 Control Model Integration: Explicit support for the Wan2.1 control model is introduced, including specific CLIP vision encoder configurations (WAN2_1ControlCLIPVisionConfig) and pipeline configurations (WANV2VConfig).
- Enhanced Video Input Handling: The system now supports direct video inputs via a new video_path parameter, with load_video updated to extract FPS and handle video preprocessing (resampling, resizing) for V2V tasks.
- Dedicated Video VAE Encoding: A VideoVAEEncodingStage is implemented to encode input video frames into latent space, which the V2V generation process requires.
- Customizable CLIP Encoder Behavior: The CLIP attention mechanism is made more flexible, allowing conditional scaling and causal attention based on model configuration, which is utilized by the new control model.
- New Example Script: An example script (basic_wan2_2_Fun.py) is provided to demonstrate the usage of the new video-to-video pipeline with the Wan2.1-Fun-1.3B-Control-Diffusers model.
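To make the FPS-based resampling mentioned in the highlights concrete, here is a minimal sketch of choosing which source frames to keep when converting a clip from its original FPS to a target FPS. The helper name `resample_indices` and its nearest-earlier-frame strategy are assumptions for illustration; the PR's actual preprocessing code is not shown in this thread and may differ.

```python
from typing import List


def resample_indices(num_frames: int, original_fps: float,
                     target_fps: float) -> List[int]:
    """Pick source-frame indices so the clip plays at target_fps.

    Hypothetical helper: keeps the clip duration roughly constant by
    dropping frames (downsampling) or repeating them (upsampling).
    """
    if original_fps <= 0 or target_fps <= 0:
        raise ValueError("FPS values must be positive")
    if num_frames <= 0:
        return []
    # Number of output frames needed to cover the same duration.
    num_out = max(1, round(num_frames * target_fps / original_fps))
    # Source frames advanced per output frame.
    step = original_fps / target_fps
    # Clamp so rounding never indexes past the last source frame.
    return [min(num_frames - 1, int(i * step)) for i in range(num_out)]


print(resample_indices(8, 30.0, 15.0))  # [0, 2, 4, 6]
print(resample_indices(4, 10.0, 20.0))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

The returned indices can then be used to select frames from the decoded list before resizing.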

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a video-to-video (V2V) pipeline for the Wan2.1 control model. The changes include new pipeline configurations, stages for handling video inputs, and updates to support the V2V workflow. My review focuses on correctness, maintainability, and potential bugs. I've identified several critical issues, including a resource leak from un-deleted temporary files, typos that will lead to ImportError and AttributeError, and a confusing typo in a configuration parameter (is_casual vs is_causal). I've also suggested improvements for code clarity, style, and efficiency, such as simplifying a complex preprocessing function and improving type hints.

fastvideo/configs/sample/registry.py

fastvideo/pipelines/stages/input_validation.py

fastvideo/models/vision_utils.py

fastvideo/models/encoders/clip.py

examples/inference/basic/basic_wan2_2_Fun.py

fastvideo/configs/pipelines/wan.py

fastvideo/pipelines/basic/wan/wan_v2v_pipeline.py

fastvideo/pipelines/stages/input_validation.py

examples/inference/basic/basic_wan2_2_Fun.py

fastvideo/utils.py

fastvideo/configs/models/encoders/clip.py

fastvideo/configs/sample/wan.py

fastvideo/models/vision_utils.py

SolitaryThinker · 2025-10-10T07:15:58Z

fastvideo/models/vision_utils.py

```diff
+    pil_images = []
+    original_fps = None
+
+    try:
+        if video_path.endswith(".gif"):
+            gif = PIL.Image.open(video_path)
+            try:
+                # GIF FPS estimation
+                if hasattr(gif, 'info') and 'duration' in gif.info:
+                    duration_ms = gif.info['duration']
+                    if duration_ms > 0:
+                        original_fps = 1000.0 / duration_ms
+
+                while True:
+                    pil_images.append(gif.copy())
+                    gif.seek(gif.tell() + 1)
+            except EOFError:
+                pass
+        else:
+            try:
+                imageio.plugins.ffmpeg.get_exe()
+            except AttributeError:
+                raise AttributeError(
+                    "`Unable to find an ffmpeg installation on your machine. Please install via `pip install imageio-ffmpeg"
+                ) from None
+
+            with imageio.get_reader(video_path) as reader:
+                try:
+                    original_fps = reader.get_meta_data().get('fps', None)
+                except:
+                    # Fallback: try to get from format-specific metadata
+                    try:
+                        original_fps = reader.get_meta_data().get('source_size', {}).get('fps', None)
+                    except:
+                        pass
+
+                for frame in reader:
+                    pil_images.append(PIL.Image.fromarray(frame))
+    finally:
+        # Clean up temporary file if it was created
+        if was_tempfile_created and os.path.exists(video_path):
+            os.remove(video_path)
```


I know the original code was not the best, but could you clean this up and remove all of these try except blocks?
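One possible direction for the requested cleanup, as a minimal sketch and not the code that was eventually merged: the FPS lookups can be written as plain dictionary reads, so no `try`/`except` is needed around them. The helper names `gif_fps` and `reader_fps` are hypothetical, and the sketch assumes the metadata is an ordinary mapping (as `gif.info` and the result of `reader.get_meta_data()` are in the hunk above).

```python
from typing import Any, Mapping, Optional


def gif_fps(info: Mapping[str, Any]) -> Optional[float]:
    """Estimate FPS from a GIF's per-frame duration in milliseconds."""
    duration_ms = info.get("duration")
    # A missing, zero, or negative duration gives no usable estimate.
    if not duration_ms or duration_ms <= 0:
        return None
    return 1000.0 / duration_ms


def reader_fps(meta: Mapping[str, Any]) -> Optional[float]:
    """Read FPS from a metadata mapping; None when it is absent."""
    fps = meta.get("fps")
    return float(fps) if fps else None


print(gif_fps({"duration": 40}))  # 25.0
print(reader_fps({"fps": 30}))    # 30.0
print(reader_fps({}))             # None
```

For the GIF frame loop itself, Pillow's `PIL.ImageSequence.Iterator` iterates over frames directly, which would avoid the `seek` loop that relies on catching `EOFError`.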

SolitaryThinker · 2025-10-10T07:37:04Z

fastvideo/models/vision_utils.py

```diff
        pil_images = convert_method(pil_images)

-    return pil_images
+    return pil_images, original_fps
```


There are a few other places using this util method. We should either update those places or use a flag to return the FPS.

Gary-ChenJL · Oct 12, 2025

Refactored; now backward compatible.
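The flag-based option raised in this thread could look roughly like the sketch below. The parameter name `return_fps`, the function name `load_video_stub`, and the placeholder frame data are all assumptions for illustration; the refactored signature in the PR is not shown here.

```python
from typing import List, Optional, Tuple, Union

Frames = List[str]  # stand-in for a list of PIL images


def load_video_stub(
    video_path: str,
    return_fps: bool = False,
) -> Union[Frames, Tuple[Frames, Optional[float]]]:
    """Stub loader showing an opt-in FPS return value.

    Existing callers that expect only frames keep working because
    return_fps defaults to False.
    """
    # Placeholder data; a real loader would decode the file here.
    frames = [f"{video_path}#frame{i}" for i in range(3)]
    original_fps: Optional[float] = 24.0
    if return_fps:
        return frames, original_fps
    return frames


frames_only = load_video_stub("clip.mp4")
frames, fps = load_video_stub("clip.mp4", return_fps=True)
print(len(frames_only), len(frames), fps)  # 3 3 24.0
```

A return type that changes with a flag is harder to type-check than two separate functions, so this trades some static clarity for not touching existing call sites.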

fastvideo/models/vision_utils.py

fastvideo/pipelines/basic/wan/wan_v2v_pipeline.py

Add video-to-video (V2V) pipeline and control model support for Wan2.1

49e2631

gemini-code-assist bot reviewed Sep 30, 2025

minor fixes

69f998d

SolitaryThinker mentioned this pull request Oct 3, 2025

Video to video? #446

Open

finalizes testing

9d402df

SolitaryThinker added the "go" (Trigger Buildkite CI) label Oct 9, 2025

Gary-ChenJL added 2 commits October 10, 2025 02:54

formating

1746e68

fix clip config attribute error

6b492d6

SolitaryThinker requested changes Oct 10, 2025

Gary-ChenJL added 2 commits October 11, 2025 18:06

fix requested changes

ad90d71

cicd

177e546
