Synced

Adobe Research: Unlocking Long-Term Memory in Video World Models with State-Space Models

AI-Generated Summary

Adobe Research has developed a new approach to video world models that integrates State-Space Models (SSMs) with dense local attention. The combination addresses a fundamental limitation in AI video generation: the inability to maintain coherent long-term memory and dependencies across extended video sequences. The researchers also employed training strategies, including diffusion forcing and frame-local attention, to improve model performance.
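The long-memory half of this hybrid can be pictured with a toy sketch (pure Python, no ML framework, all names and constants illustrative rather than taken from the paper): a diagonal linear state-space recurrence carries a compressed summary of everything seen so far at constant cost per step, which is the property that gives SSMs long-range memory.

```python
# Toy 1-D diagonal state-space recurrence:
#   h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
# A decay factor `a` close to 1 lets information from early inputs
# persist for many steps, while per-step cost stays O(1).

def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear SSM over a sequence; constant state per step."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # old state decays, new input mixes in
        ys.append(c * h)    # readout
    return ys

# An impulse at t=0 still influences the output nine steps later:
outputs = ssm_scan([1.0] + [0.0] * 9)
# outputs[9] == 0.9 ** 9 — decayed, but not forgotten
```

Contrast this with full self-attention, where every new frame must be compared against all previous frames, so the cost of each step grows with the sequence length.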

The technical innovation pairs the efficiency of SSMs at capturing long-range dependencies with local attention that preserves frame-to-frame coherence. This dual approach lets video world models generate longer, more consistent sequences without the quadratic computational cost that typically constrains transformer-based models, suggesting a more scalable path toward AI systems that can understand and generate video over extended timeframes.
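The local-coherence half can be sketched the same way. The toy below (plain Python; real models attend over learned query/key/value projections of frame tokens, not raw scalars) restricts each position to a short window of recent positions, so total cost grows linearly with sequence length instead of quadratically.

```python
import math

def local_attention(seq, window=3):
    """Sliding-window attention over a sequence of scalar 'frame features'.

    Each position attends only to itself and the previous `window - 1`
    positions, so total cost is O(T * window) rather than O(T^2).
    Toy version: queries, keys, and values are the scalars themselves.
    """
    out = []
    for t, q in enumerate(seq):
        lo = max(0, t - window + 1)
        keys = seq[lo:t + 1]
        scores = [q * k for k in keys]               # dot-product scores
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, keys)) / z)
    return out
```

In the hybrid design described above, a window like this handles short-range, frame-to-frame detail, while the SSM state carries the long-range context the window cannot see.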

The implications extend beyond video generation to broader AI applications requiring temporal consistency. Successfully modeling long-term dependencies in videos could advance autonomous systems, content creation tools, and video understanding AI. This research represents progress on a historically difficult challenge in machine learning and may influence how future video generation models are architected.

Key Takeaways

  • Adobe Research combines State-Space Models (SSMs) with dense local attention to give video world models long-term memory.
  • The hybrid addresses a core limitation of AI video generation: maintaining coherent dependencies across extended sequences.
  • Training strategies include diffusion forcing and frame-local attention.
  • SSMs capture long-range dependencies efficiently, while local attention preserves frame-to-frame coherence.
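Among the training strategies listed above, diffusion forcing is worth unpacking: as described in the literature, its core idea is to give each frame its own independently sampled noise level during training, rather than one shared level for the whole sequence. A minimal sketch, with illustrative names and scalar "frames" standing in for real video tensors:

```python
import random

def diffusion_forcing_noise(frames, num_levels=10, rng=random):
    """Assign each frame an independent noise level (the core idea of
    diffusion forcing), instead of one shared level per sequence as in
    standard video diffusion. Returns (noised_frames, levels).

    Toy version: frames are scalars; 'noising' blends toward Gaussian noise.
    """
    levels = [rng.randrange(num_levels) for _ in frames]
    noised = []
    for x, t in zip(frames, levels):
        alpha = 1.0 - t / (num_levels - 1)  # t=0: clean, t=num_levels-1: pure noise
        noised.append(alpha * x + (1 - alpha) * rng.gauss(0.0, 1.0))
    return noised, levels
```

Mixing noise levels within a sequence trains the model to denoise some frames while conditioning on cleaner neighbors, which is what enables flexible rollout at generation time.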

Read the full article on Synced
