Thinking Machines Lab Unveils Advancement in Real-Time Human-AI Collaboration

Thinking Machines argued that most real-world work depends on ongoing interaction between people and AI systems rather than fully autonomous operation.

Mira Murati’s AI startup, Thinking Machines Lab, has introduced a research preview of “interaction models”, a new class of AI systems that handle real-time collaboration with humans across audio, video, and text without relying on external software scaffolding.

The company announced TML-Interaction-Small, a 276B-parameter mixture-of-experts (MoE) model with 12B active parameters that treats conversation as a continuous live stream rather than a stop-start chat box.

In a blog post published May 11, the company said the models are built from scratch to support continuous interaction, allowing users and AI systems to communicate simultaneously rather than through traditional turn-based exchanges.

Thinking Machines says its interaction models break audio, video, and text into 200-millisecond micro-turns, allowing the AI to listen, observe, speak, draw, search, and use tools simultaneously during an ongoing interaction.

The company said the approach aims to make AI systems more responsive and collaborative during tasks that require ongoing human input. For example, the model can follow timing-based instructions, such as a request to wait four seconds before responding.

“We think interactivity should scale alongside intelligence; the way we work with AI should not be treated as an afterthought,” Thinking Machines said in the post.

The company said current AI systems are largely optimised for autonomous task completion, where users provide instructions upfront and wait for the model to finish. According to the company, this creates what it described as a “collaboration bottleneck” because users cannot continuously guide, clarify, or interrupt the system during work.

Thinking Machines argued that most real-world work depends on ongoing interaction between people and AI systems rather than fully autonomous operation. It said current models operate in “a single thread”, meaning they stop perceiving new information while generating responses and cannot process simultaneous streams of communication.

To address this, the company said it developed a “multi-stream, micro-turn design” that enables the model to continuously receive and respond to information in real time across different modalities.
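Thinking Machines has not published implementation details, but the idea of slicing several input streams into fixed-length micro-turns can be illustrated with a small sketch. Everything below is hypothetical: the names, the per-window read functions, and the merging scheme are assumptions for illustration only, not the company's actual design.

```python
# Hypothetical sketch of a "multi-stream, micro-turn" loop: inputs from
# several modalities are cut into fixed 200 ms windows, and the model
# sees every stream's latest slice before producing its own next slice.
# Names and structure are illustrative assumptions, not a real API.

MICRO_TURN_MS = 200  # slice length described in the announcement


def micro_turns(streams, total_ms):
    """Yield one merged frame per micro-turn.

    `streams` maps a modality name (e.g. "audio", "text") to a function
    that returns that stream's content for a given time window.
    """
    for start in range(0, total_ms, MICRO_TURN_MS):
        # Gather the current window from every modality at once, so no
        # stream is blocked while another is being processed.
        frame = {name: read(start, start + MICRO_TURN_MS)
                 for name, read in streams.items()}
        yield start, frame


# Toy streams: each simply labels its time window.
streams = {
    "audio": lambda a, b: f"audio[{a}-{b}ms]",
    "text":  lambda a, b: f"text[{a}-{b}ms]",
}

for t, frame in micro_turns(streams, 600):
    print(t, frame["audio"], frame["text"])
```

The point of the sketch is the scheduling shape: instead of one long turn per speaker, every 200 ms window delivers a frame containing all modalities simultaneously, which is what would let a model react mid-utterance rather than only after a turn ends.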

The startup also criticised the use of external systems, or “harnesses”, that many existing AI products rely on for features such as interruptions and multimodal interaction. It argued that interactivity should be built directly into the model architecture itself.

“For interactivity to scale with intelligence, it must be part of the model itself,” the company said.

Thinking Machines said the goal is to allow humans and AI systems to collaborate “the same way we do with other people: messaging, talking, listening, seeing, showing, and interjecting as needed.”

Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.
