Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Source: arXiv
Original Author: Hau-Shiang Shiu et al.

Stream-DiffVSR introduces a causally conditioned diffusion framework for video super-resolution that relies solely on past frames, which allows online, low-latency processing. It combines a four-step distilled denoiser with an Auto-regressive Temporal Guidance module and processes each 720p frame in 0.328 seconds on an RTX 4090 GPU. The reported latency is more than 130x lower than that of existing state-of-the-art methods, making the approach viable for low-latency applications. More details are available on the project page.

Stream-DiffVSR: A Breakthrough in Low-Latency Video Super-Resolution

A new framework, Stream-DiffVSR, targets video super-resolution (VSR) in latency-sensitive applications. Because it conditions only on past frames, it never has to wait for future frames to arrive, which significantly reduces processing delay while improving perceptual quality.

Technical Innovations and Performance Metrics

  • A four-step distilled denoiser that accelerates inference.
  • An Auto-regressive Temporal Guidance (ARTG) module that provides motion-aligned cues during latent denoising.
  • A lightweight temporal-aware decoder featuring a Temporal Processor Module (TPM) to improve detail and maintain temporal coherence.
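
The causal, auto-regressive structure described in the list above can be sketched as a streaming loop. This is a toy illustration only, not the authors' implementation: frames are plain lists of floats, and upsample, align_to_current, toy_denoise_step, and stream_vsr are hypothetical names invented for the sketch. It shows just the control flow, in which each output depends on the current low-resolution frame and the previous output, refined over four steps.

```python
from typing import Iterable, Iterator, List, Optional

Frame = List[float]  # toy 1-D stand-in for an image or latent tensor
NUM_STEPS = 4        # the article describes a four-step distilled denoiser


def upsample(lr: Frame, scale: int = 2) -> Frame:
    """Nearest-neighbour upsampling (placeholder for real LR conditioning)."""
    return [v for v in lr for _ in range(scale)]


def align_to_current(prev_hr: Frame, lr_up: Frame) -> Frame:
    """Placeholder for motion alignment: identity, i.e. assumes a static scene."""
    return prev_hr


def toy_denoise_step(latent: Frame, lr_up: Frame, guide: Optional[Frame]) -> Frame:
    """Hypothetical step: move the latent halfway toward a target built from
    the upsampled input and, when available, the aligned previous output."""
    if guide is None:
        target = lr_up
    else:
        target = [(a + b) / 2 for a, b in zip(lr_up, guide)]
    return [x + 0.5 * (t - x) for x, t in zip(latent, target)]


def stream_vsr(lr_frames: Iterable[Frame]) -> Iterator[Frame]:
    """Yield one output per input frame, using only past information."""
    prev_hr: Optional[Frame] = None  # the only state carried between frames
    for lr in lr_frames:
        lr_up = upsample(lr)
        guide = None if prev_hr is None else align_to_current(prev_hr, lr_up)
        latent = [0.0] * len(lr_up)  # deterministic start; real models start from noise
        for _ in range(NUM_STEPS):
            latent = toy_denoise_step(latent, lr_up, guide)
        prev_hr = latent
        yield latent
```

Because stream_vsr is a generator, each output becomes available as soon as its input frame has been processed, and adding later frames to the stream cannot change earlier outputs. That property is what "causally conditioned" refers to here.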

On an RTX 4090 GPU, Stream-DiffVSR processes each 720p video frame in 0.328 seconds. Compared with the state-of-the-art model TMP, the reported perceptual quality improves by 0.095 in LPIPS (a perceptual distance metric for which lower values are better), and latency is reduced by a factor of more than 130.
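
To put these figures in perspective, the short calculation below converts the reported per-frame time into a throughput and derives the baseline latency implied by the 130x factor. It is a back-of-the-envelope sketch that assumes frames are processed strictly one at a time with no pipelining; the constant names and the meets_budget helper are illustrative and do not come from the paper.

```python
SECONDS_PER_FRAME = 0.328   # reported per-frame time at 720p on an RTX 4090
LATENCY_REDUCTION = 130     # reported as "over 130x", so treat as a lower bound


def frames_per_second(seconds_per_frame: float) -> float:
    """Throughput if frames are processed sequentially."""
    return 1.0 / seconds_per_frame


def implied_baseline_latency(seconds_per_frame: float, factor: float) -> float:
    """Latency a baseline would need for the given reduction factor to hold."""
    return seconds_per_frame * factor


def meets_budget(seconds_per_frame: float, target_fps: float) -> bool:
    """True if one frame is finished within the per-frame budget of target_fps."""
    return seconds_per_frame <= 1.0 / target_fps


print(f"{frames_per_second(SECONDS_PER_FRAME):.2f} frames per second")  # 3.05
print(f"{implied_baseline_latency(SECONDS_PER_FRAME, LATENCY_REDUCTION):.2f} s implied baseline latency")  # 42.64
```

Under these assumptions the method sustains roughly 3 frames per second, so the headline result concerns the delay before output becomes available rather than playback-rate throughput.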

Implications for Online Deployment

The authors present Stream-DiffVSR as the first diffusion-based VSR method that is viable for low-latency online applications, which could benefit any setting that depends on real-time video processing.

📰 Original Source: https://arxiv.org/abs/2512.23709v1

All rights and credit belong to the original publisher.