Microsoft Launches its First Image Generation Model MAI-Image-1

With MAI-Image-1, Microsoft said that the team has focused on avoiding repetitive or generically stylised outputs.

Microsoft has announced MAI-Image-1, its first image generation model developed ‘entirely in-house’. The model ranked #9 on LMArena, a platform where users submit a prompt to two anonymous models and vote for the better response, with the votes feeding a public leaderboard.

Microsoft said that the model will be available on Copilot and Bing Image Creator ‘very soon’ and is currently available for testing on LMArena.

With MAI-Image-1, Microsoft said that the team has focused on avoiding repetitive or generically stylised outputs. “For example, we prioritised rigorous data selection and nuanced evaluation focused on tasks that closely mirror real-world creative use cases,” said Microsoft, adding that it has taken feedback from professionals in creative industries. 

The model is said to ‘excel’ at generating landscapes and photorealistic imagery, which involves accurately capturing details of lighting, shadows, and reflections. “This is particularly so when compared to many larger, slower models,” said Microsoft. 

Microsoft’s model scored 1096 points on LMArena’s text-to-image leaderboard, while Gemini-2.5-Flash (Nano-Banana) scored 1154 points at rank 2, and OpenAI’s model scored 1123 points at rank 7. However, leading the race is Hunyuan-image-3.0, the AI model developed by the Chinese tech giant Tencent.
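To put those score gaps in perspective, here is a minimal sketch that converts a difference in leaderboard points into an approximate head-to-head preference rate. It assumes the scores sit on an Elo-style logistic scale (base 10, 400-point spread); that scale and the helper name `expected_win_rate` are assumptions for illustration, not something Microsoft or LMArena states in this announcement.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Chance that model A is preferred over model B, assuming an
    Elo-style logistic curve (an assumption, not LMArena's published method)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


# Scores as reported in the article.
scores = {
    "Gemini-2.5-Flash (Nano-Banana)": 1154,
    "OpenAI's model": 1123,
    "MAI-Image-1": 1096,
}

mai_score = scores["MAI-Image-1"]
for name, score in scores.items():
    if name == "MAI-Image-1":
        continue
    # Probability that a voter picks MAI-Image-1 over the other model.
    rate = expected_win_rate(mai_score, score)
    print(f"MAI-Image-1 vs {name}: {rate:.1%}")
```

Under that assumption, a 58-point deficit to Gemini-2.5-Flash would mean voters prefer MAI-Image-1 in roughly 42% of direct matchups, and a 27-point deficit to OpenAI’s model in roughly 46%, so the gaps are real but not large.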

Besides MAI-Image-1, Microsoft has also developed other in-house models, such as MAI-Voice-1, a natural speech generation model, and the Phi series of language models, which are small language models offering efficient performance in reasoning tasks. 

This is in addition to the company’s support for OpenAI’s efforts to develop its own models, providing both financial backing and infrastructure. 

Meanwhile, AI image generation is in a period of intense activity. OpenAI’s model went viral for its striking imitation of Studio Ghibli’s art style, soon followed by Google’s ‘Nano Banana’, which set a new benchmark with its powerful AI editing capabilities.

Using LMArena, AIM compared Microsoft’s MAI-Image-1, Google’s Gemini-2.5-Flash (Nano-Banana), and OpenAI’s GPT-image-1 on a prompt showing two people in a café by a window during late afternoon.

The test focused on how well each model handled mixed lighting, reflections, and shadow realism. Users can provide similar prompts to test all of these models on LMArena.


Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.
