Microsoft Unveils Fara-7B Agentic Model

Microsoft is also releasing WebTailBench, a new test set with 609 real-world tasks across 11 categories.

Share

 

Microsoft has launched Fara-7B, its first small language model built to operate a computer the way a person does. The company claims the 7-billion-parameter model matches or beats larger agentic systems on live web tasks while running locally with lower latency and stronger privacy.

Fara-7B reads a webpage visually and completes tasks by clicking, typing and scrolling on predicted coordinates. It does not rely on accessibility trees or separate parsing layers. 

Microsoft says the model finishes tasks in about 16 steps on average, which is far fewer than many comparable systems. The model is trained on 145,000 synthetic trajectories generated through the Magentic-One framework and is built on Qwen2.5-VL-7B with supervised fine-tuning.

The company positions Fara-7B as an everyday computer-use agent that can search, summarise, fill forms, manage accounts, book tickets, shop online, compare prices and find jobs or real estate listings. 

Microsoft is also releasing WebTailBench, a new test set with 609 real-world tasks across 11 categories. Fara-7B leads all computer-use models across every segment, including shopping, flights, hotels, restaurants and multi-step comparison tasks.

The company offers two ways to run the model. Azure Foundry hosting lets users deploy Fara-7B without downloading weights or using their own GPUs. Advanced users can self-host through VLLM on GPU hardware. 

The evaluation stack relies on Playwright and an abstract agent interface that can plug in any model. Microsoft warns that Fara-7B is an experimental release and should be run in sandboxed settings without sensitive data.

Earlier this year, Microsoft launched Phi-4-multimodal and Phi-4-mini, the latest additions to its Phi family of small language models (SLMs).

Last month, Google DeepMind released the Gemini 2.5 Computer Use model, a specialised version of its Gemini 2.5 Pro AI that can interact with user interfaces. The model is available in preview via the Gemini API through Google AI Studio and Vertex AI Studio.

ALSO READ: OpenAI’s GPT-5.1-Codex-Max Can Work for More Than 24 Hours

Staff Writer
Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.

Related

Unpack More