StreamForce: Streaming Video Generation with Streaming Force Control

1 Northeastern University    2 Impossible Research    3 University of California, Berkeley    4 University of Illinois Urbana-Champaign
*Equal contributions    Equal advising
TL;DR: StreamForce generates videos from a single image while users continuously apply and modify physical forces, enabling causal streaming control over local pushes and global effects such as wind.

Real-Time Interaction Demo

The demo shows how StreamForce consumes force inputs during generation, enabling users to steer evolving videos instead of specifying the entire motion sequence in advance.

Abstract

We introduce StreamForce, a streaming video generation framework that enables physically grounded control through continuous force inputs. Unlike prior approaches, which train separate models for different force types, assume fixed forces, or rely on non-causal processing, StreamForce is a single causal model that responds instantly and coherently to both local and global time-varying forces. To achieve this, we design a unified force representation as the control signal and develop a distillation pipeline for force-controllable video generation. Our model combines autoregressive efficiency with force responsiveness while sustaining photometric and dynamic realism. StreamForce runs at up to 16.6 FPS on a single GPU and achieves state-of-the-art performance in both force adherence and motion realism.

Physical Behaviors

StreamForce inherits physical priors from the pretrained video model that support falling and bouncing, as well as different responses under varying mass and friction.

Falling and Bouncing

Falling glass
Falling and bouncing object

A force pushes the object across a table. Once it passes the edge, it falls under gravity and rebounds from the ground with a plausible loss of energy. This behavior emerges from the spatiotemporal priors of the pretrained video model.
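For intuition about what a "plausible loss of energy" looks like, here is a minimal sketch of an idealized bounce. It is not part of StreamForce: the function name and the coefficient of restitution are illustrative assumptions, and the model itself is not described as simulating anything like this explicitly.

```python
def rebound_heights(drop_height_m, restitution, bounces):
    """Peak height after each bounce of an idealized point mass.

    `restitution` is the ratio of speed just after impact to speed just
    before impact (an assumed parameter, not something StreamForce exposes).
    """
    heights = []
    height = drop_height_m
    for _ in range(bounces):
        # Speed scales by e at each impact, and peak height is proportional
        # to speed squared, so the height scales by e squared.
        height *= restitution ** 2
        heights.append(height)
    return heights


# Each rebound is lower than the previous one when restitution < 1.
print(rebound_heights(1.0, 0.5, 2))
```

A physically plausible video should show the same qualitative pattern: every rebound peaks lower than the one before it.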

Mass-Aware Motion

Empty glass: faster motion
Glass with milk: slower motion

Under the same horizontal force, the glass containing milk moves more slowly than the empty glass, reflecting the expected relationship between object mass and acceleration. This behavior emerges from the model's physical priors, not from explicit mass conditioning.
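The expected relationship is Newton's second law, a = F / m. The sketch below works through it for a constant force applied from rest; the masses and force value are hypothetical numbers chosen for illustration, not measurements from the demo.

```python
def displacement_under_force(force_n, mass_kg, duration_s):
    """Distance covered from rest under a constant force, ignoring friction."""
    acceleration = force_n / mass_kg  # Newton's second law: a = F / m
    return 0.5 * acceleration * duration_s ** 2


# Hypothetical values: the filled glass is assumed to be twice as heavy.
empty_glass = displacement_under_force(force_n=2.0, mass_kg=0.25, duration_s=1.0)
glass_with_milk = displacement_under_force(force_n=2.0, mass_kg=0.5, duration_s=1.0)
print(empty_glass, glass_with_milk)
```

Under these assumptions, doubling the mass halves the acceleration and therefore the distance covered in the same time, which matches the ordering shown in the two clips.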

Friction-Aware Motion

Lower-friction surface: travels farther
Higher-friction surface: travels a shorter distance

The same horizontal force is applied to the same T-shaped object on two surfaces with different friction. The object travels a noticeably shorter distance on the higher-friction surface, consistent with friction opposing motion and dissipating kinetic energy.
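As a rough reference for this behavior, the sketch below computes how far an object slides before kinetic friction stops it, using the textbook relation d = v^2 / (2 * mu * g). The friction coefficients and launch speed are assumed values for illustration only; the model receives no explicit friction parameter.

```python
G = 9.81  # gravitational acceleration in m/s^2


def sliding_distance(initial_speed_mps, friction_coefficient):
    """Stopping distance on a flat surface under kinetic friction alone."""
    # Friction decelerates the object at mu * g, so v^2 = 2 * (mu * g) * d.
    return initial_speed_mps ** 2 / (2.0 * friction_coefficient * G)


# Hypothetical coefficients for a smoother and a rougher surface.
low_friction = sliding_distance(1.0, 0.2)
high_friction = sliding_distance(1.0, 0.6)
print(low_friction, high_friction)
```

With the same launch speed, tripling the friction coefficient cuts the sliding distance to one third in this idealized setting.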

Baseline Comparisons

We compare four methods: Wan2.2 5B TI2V (text-only), Force-Prompting (bidirectional), Kling Motion Brush, and StreamForce. The comparison covers both force preservation and force change settings, for both global and local forces.

Local Force Preservation

Wan2.2 TI2V
Force-Prompting
Kling Motion Brush
StreamForce (Ours)

Global Force Preservation

Wan2.2 TI2V
Force-Prompting
Kling Motion Brush
StreamForce (Ours)

Local Force Change

Wan2.2 TI2V
Force-Prompting
Kling Motion Brush
StreamForce (Ours)

Global Force Change

Wan2.2 TI2V
Force-Prompting
Kling Motion Brush
StreamForce (Ours)

Streaming Force Updates During Generation

Force-Prompting: bidirectional
Kling Motion Brush
StreamForce (Ours): causal streaming force control

This section highlights the core streaming setting: forces arrive while the video is being generated, and users can modify them at any time to steer the future rollout. StreamForce is causal, so it reacts online to changing force inputs; bidirectional baselines require the full force sequence upfront and cannot naturally support the same real-time interaction.
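The difference between the causal and bidirectional settings can be sketched as a generation loop. This is a toy illustration under stated assumptions, not the StreamForce implementation: `Force`, `stream_video`, and `toy_generator` are hypothetical names, and the stand-in generator simply records which force it saw.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List


@dataclass(frozen=True)
class Force:
    # Hypothetical force record; these fields are illustrative assumptions.
    angle_deg: float
    magnitude: float


def stream_video(
    generate_chunk: Callable[[List[object], Force], object],
    force_updates: Dict[int, Force],
    initial_force: Force,
    num_chunks: int,
) -> Iterator[object]:
    """Yield chunks one at a time, reading the latest force before each one."""
    history: List[object] = []
    force = initial_force
    for step in range(num_chunks):
        # A user update that arrives before this chunk replaces the active force.
        force = force_updates.get(step, force)
        # Causal: the chunk depends only on past chunks and the current force,
        # never on forces that have not been specified yet.
        chunk = generate_chunk(history, force)
        history.append(chunk)
        yield chunk


def toy_generator(history, force):
    # Stand-in for a real model: returns the chunk index and the force it saw.
    return (len(history), force.magnitude)


wind = Force(angle_deg=0.0, magnitude=1.0)
stronger_wind = Force(angle_deg=0.0, magnitude=2.0)
for chunk in stream_video(toy_generator, {2: stronger_wind}, wind, num_chunks=4):
    print(chunk)
```

Because each chunk is yielded before the next force is read, an update can be supplied at any step. A bidirectional model would instead need the whole `force_updates` schedule before producing its first frame.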

In the smoke-alarm example, a user-applied wind force directed toward the right gradually increases in magnitude. StreamForce updates the generated dynamics as the force changes.

Multi-Force and Part-Level Interaction

Applying two local forces simultaneously to different parts of a T-shaped object produces coordinated translation and rotation that drive the object toward a target position, demonstrating multi-force, part-level interaction.
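Why two local forces yield both translation and rotation follows from rigid-body mechanics: the forces sum to a net force, and each contributes a torque about the object's center. The sketch below shows that calculation in 2D; the application points, force vectors, and function name are assumptions for illustration and do not reflect how StreamForce represents forces internally.

```python
def net_force_and_torque(forces, center):
    """Combine 2D point forces into a net force and a torque about `center`.

    `forces` is a list of ((px, py), (fx, fy)) pairs: application point and
    force vector. The torque is the z-component of r x F, summed over forces.
    """
    cx, cy = center
    net_fx = sum(f[0] for _, f in forces)
    net_fy = sum(f[1] for _, f in forces)
    torque = sum((p[0] - cx) * f[1] - (p[1] - cy) * f[0] for p, f in forces)
    return (net_fx, net_fy), torque


# Two unequal pushes on opposite arms of an object centered at the origin.
pushes = [((-1.0, 0.0), (0.0, 1.0)), ((1.0, 0.0), (0.0, 2.0))]
net, torque = net_force_and_torque(pushes, center=(0.0, 0.0))
print(net, torque)
```

In this example the net force is nonzero, so the object translates, and the torque is nonzero, so it also rotates. Equal pushes on both arms would give pure translation.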

BibTeX


@misc{wang2026streamingvideogenerationstreaming,
  title={Streaming Video Generation with Streaming Force Control},
  author={Hanhui Wang and Yiming Xie and Haiwen Feng and Zhaoyang Lv and Shenlong Wang and Huaizu Jiang},
  year={2026},
  eprint={2606.07508},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2606.07508},
}