Apple Releases FastVLM and MobileCLIP2 on Hugging Face



A week before Apple hosts its “Awe Dropping” event on September 9, the Cupertino-based tech giant released two new AI models – FastVLM and MobileCLIP2.

Available on the popular open-source platform Hugging Face, both models run locally on device and respond in near real time. FastVLM is a vision language model (VLM) that processes high-resolution images with very low latency, while MobileCLIP2 is a compact image-text model that connects what is in a picture with how it is described in words.

Both models are optimized for Apple silicon and are designed to run on MLX, the company's own open-source machine learning framework, which offers a lightweight way to run and train models on Apple hardware.
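
For readers unfamiliar with MLX, a minimal sketch of the framework's Python API is shown below; it assumes MLX is installed (pip install mlx) on an Apple silicon machine, and the tiny layer is purely illustrative rather than anything taken from Apple's released models.

```python
# Minimal MLX sketch: arrays and a small neural-network layer.
# MLX evaluates computations lazily and keeps data in unified memory,
# which is part of why it suits on-device models on Apple silicon.
import mlx.core as mx
import mlx.nn as nn

layer = nn.Linear(input_dims=8, output_dims=2)  # illustrative stand-in layer

x = mx.random.normal((4, 8))  # batch of 4 example feature vectors
y = layer(x)                  # computation is recorded lazily

mx.eval(y)                    # force evaluation on the default device
print(y.shape)                # (4, 2)
```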

As for the headline numbers, Apple claims FastVLM delivers its first token up to 85 times faster than comparable models, with a vision encoder that is 3.4 times smaller. MobileCLIP2, for its part, is a CLIP-style image-text model, meaning it learns a shared representation of images and text and can judge how well a picture matches a written description.
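
To illustrate the kind of image-text matching such a model performs, here is a minimal zero-shot classification sketch using the Hugging Face transformers CLIP API. The checkpoint named below is a generic public CLIP model used as a stand-in; the exact repository id and loading code for MobileCLIP2 should be taken from Apple's model cards on Hugging Face.

```python
# CLIP-style zero-shot image-text matching with Hugging Face transformers.
# The checkpoint is a generic public CLIP model used as a stand-in; swap in
# the MobileCLIP2 repository id from Apple's Hugging Face model card to try
# Apple's model (the exact id is not covered in this article).
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # photo of two cats
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Higher probability means the text matches the image better.
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.2%}")
```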

Because both models run entirely on device, Apple could use them for tasks such as describing what is in a picture, identifying objects, and generating image captions without sending your data off the device. Apple also released a lightweight version of FastVLM, called FastVLM-0.5B, which anyone can try directly in the browser through a demo on Hugging Face.
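
As a rough illustration of the image-description task, the sketch below uses the Hugging Face transformers captioning pipeline. The model it loads is a small public captioning model used purely as a stand-in; running FastVLM-0.5B itself requires the loading code from its own Hugging Face model card, which this article does not reproduce.

```python
# Image captioning with the Hugging Face transformers pipeline.
# "Salesforce/blip-image-captioning-base" is a small public model used as a
# stand-in; to try Apple's model, follow the loading instructions on the
# FastVLM model card on Hugging Face instead.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# The pipeline accepts a local file path or an image URL.
result = captioner("http://images.cocodataset.org/val2017/000000039769.jpg")
print(result[0]["generated_text"])  # e.g. "two cats laying on a couch"
```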

The FastVLM family also includes larger versions with 1.5 billion and 7 billion parameters for workloads that need higher accuracy.

Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.
