Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion


Original Author: Hau-Shiang Shiu et al.


December 29, 2025

Stream-DiffVSR introduces a causally conditioned diffusion framework for video super-resolution that enables low-latency online processing by relying solely on past frames. It features a four-step distilled denoiser and an Auto-regressive Temporal Guidance module, and it processes each 720p frame in 0.328 seconds on an RTX 4090 GPU. The reported latency reduction is over 130x compared with existing state-of-the-art methods, making it viable for low-latency applications. More details are available on its project page.

Stream-DiffVSR: A Breakthrough in Low-Latency Video Super-Resolution

Stream-DiffVSR is a new framework for video super-resolution (VSR) in latency-sensitive applications. Because it conditions only on past frames, it never has to wait for future frames before producing an output, which reduces processing delay while enhancing perceptual quality.

Technical Innovations and Performance Metrics

Stream-DiffVSR combines three components:
A four-step distilled denoiser that accelerates inference.
An Auto-regressive Temporal Guidance (ARTG) module that provides motion-aligned cues during latent denoising.
A lightweight temporal-aware decoder featuring a Temporal Processor Module (TPM) that improves detail and maintains temporal coherence.
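
To make the data flow concrete, the sketch below shows one way a causal, auto-regressive loop of this kind could be organized. It is an illustration under assumptions, not the authors' implementation: the function `stream_super_resolve` and the callables `init_latent`, `align_previous`, `denoise_step`, and `decode` are hypothetical placeholders standing in for latent initialization, ARTG-style alignment of the previous output, the distilled denoiser, and the temporal-aware decoder.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

Frame = TypeVar("Frame")


def stream_super_resolve(
    lr_frames: Iterable[Frame],
    init_latent: Callable[[Frame], Frame],
    align_previous: Callable[[Frame, Frame], Frame],
    denoise_step: Callable[[Frame, Frame, Optional[Frame], int], Frame],
    decode: Callable[[Frame, Optional[Frame]], Frame],
    num_steps: int = 4,
) -> Iterator[Frame]:
    """Yield one high-resolution frame per low-resolution input frame.

    All callables are hypothetical stand-ins (assumptions for this sketch).
    Only the previous output is kept as state, so no future frame is needed.
    """
    prev_hr: Optional[Frame] = None
    for lr in lr_frames:
        # Temporal guidance comes only from the past: the previous output,
        # aligned to the current frame. The first frame has no guidance.
        guidance = None if prev_hr is None else align_previous(prev_hr, lr)

        # Run a small, fixed number of denoising steps (four by default).
        latent = init_latent(lr)
        for step in reversed(range(num_steps)):
            latent = denoise_step(latent, lr, guidance, step)

        # Decode with the same guidance to keep the output temporally coherent.
        hr = decode(latent, guidance)
        yield hr
        prev_hr = hr
```

Because the loop is a generator that carries only the previous output as state, each high-resolution frame can be emitted as soon as its low-resolution input arrives, which is the property that makes a causal design suitable for streaming.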

On an RTX 4090 GPU, Stream-DiffVSR processes each 720p video frame in 0.328 seconds, a significant improvement over previous methods. Compared with TMP, described as the current state-of-the-art model, Stream-DiffVSR improves LPIPS by 0.095 (LPIPS is a perceptual distance, so lower values are better) while reducing latency by more than 130 times.
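
As a rough sanity check on what 0.328 seconds per frame means in practice, the snippet below converts the reported latency into a throughput figure and compares it with a 30 fps frame budget. The 30 fps target and the assumption that frames are processed strictly one after another are illustrative choices, not figures from the article.

```python
LATENCY_S = 0.328  # per-frame latency at 720p on an RTX 4090, as reported above


def frames_per_second(latency_s: float) -> float:
    """Throughput if frames are processed strictly one after another."""
    return 1.0 / latency_s


def budget_ratio(latency_s: float, target_fps: float) -> float:
    """How many frame intervals at target_fps one frame's latency spans."""
    return latency_s * target_fps


print(f"{frames_per_second(LATENCY_S):.2f} frames/s")           # prints 3.05 frames/s
print(f"{budget_ratio(LATENCY_S, 30.0):.1f}x a 30 fps budget")  # prints 9.8x a 30 fps budget
```

Under these assumptions the method delivers roughly three frames per second, so the headline result is best read as low per-frame latency for a diffusion model rather than full frame-rate playback.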

Implications for Online Deployment

These results position Stream-DiffVSR as the first diffusion-based VSR method viable for low-latency online deployment, which could benefit applications that depend on real-time video processing.
