Meta Unveils Revolutionary AI Model SAM, Cutting Out Objects Within Images
SAM's Applications Range from Photo Tagging to Augmented Reality
The Gist
Meta (formerly Facebook) unveiled the Segment Anything Model (SAM), an AI model that can identify individual objects within images and videos.
SAM is an image segmentation model that reacts to text prompts or user clicks to separate specific objects within an image, simplifying image analysis or processing.
Potential applications include: understanding webpage content, augmented reality, image editing, and supporting scientific research by localizing animals or objects in video.
Alongside SAM, Meta released the SA-1B dataset, consisting of 11 million images and 1.1 billion segmentation masks.
Meta has published the code on GitHub and offers a free interactive demo that lets users experience the technology by uploading a photo and selecting objects.
Facebook owner Meta introduced an AI model named the Segment Anything Model (SAM) on Wednesday, which can recognize individual objects in images and videos, even those not seen during training.
In its blog post, Meta describes SAM as an image segmentation model that reacts to text prompts or user clicks to separate specific objects within an image. Image segmentation is a computer vision process that divides an image into multiple segments or regions, each representing a particular object or area of interest.
The goal of image segmentation is to simplify image analysis or processing. Meta believes the technology will be useful for understanding webpage content, augmented reality applications, image editing, and supporting scientific research by automatically localizing animals or objects to track in video.
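The "user click" style of prompting can be illustrated with a toy sketch: given a clicked point, grow a mask outward over similar pixels. This is only a conceptual stand-in for what a learned model like SAM does (SAM predicts masks with a neural network, not a flood fill), but it shows what a point prompt and a segmentation mask are.

```python
import numpy as np
from collections import deque

def segment_from_click(image, seed, tol=10):
    """Toy point-prompt segmentation: flood-fill the region of
    similar pixel values around a clicked point. A conceptual
    stand-in for a model like SAM, not Meta's actual algorithm."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    target = int(image[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(int(image[nr, nc]) - target) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# A 6x6 grayscale "image" with a bright square object on a dark background.
img = np.zeros((6, 6), dtype=np.uint8)
img[1:4, 1:4] = 200
mask = segment_from_click(img, (2, 2))  # "click" inside the bright square
print(mask.sum())  # 9 pixels: the 3x3 bright square
```

The output mask is a binary array the same shape as the image, which is exactly the form segmentation masks take in datasets like the one Meta released.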
Creating an accurate segmentation model typically "requires highly specialized work by technical experts with access to AI training infrastructure and large volumes of carefully annotated in-domain data." By developing SAM, Meta aims to "democratize" this process by decreasing the need for specialized training and expertise, which they hope will encourage further research into computer vision.
Alongside SAM, Meta has compiled a dataset called "SA-1B" consisting of 11 million images licensed from "a large photo company" and 1.1 billion segmentation masks generated by its segmentation model. Meta will offer SAM and its dataset for research purposes under an Apache 2.0 license.
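Storing 1.1 billion masks as raw pixel arrays would be enormous, so segmentation datasets typically ship masks run-length encoded (COCO-style RLE). The following is a simplified sketch of that idea, not the exact SA-1B file format:

```python
import numpy as np

def rle_encode(mask):
    """Run-length encode a flattened binary mask as alternating
    run lengths, starting with a run of zeros (COCO-style).
    Simplified illustration, not the exact SA-1B format."""
    flat = np.asarray(mask, dtype=np.uint8).ravel()
    runs = []
    current, count = 0, 0
    for v in flat:
        if v == current:
            count += 1
        else:
            runs.append(count)
            current, count = v, 1
    runs.append(count)
    return runs

def rle_decode(runs, shape):
    """Invert rle_encode back to a binary mask."""
    out = np.zeros(int(np.prod(shape)), dtype=np.uint8)
    pos, val = 0, 0
    for run in runs:
        out[pos:pos + run] = val
        pos += run
        val ^= 1  # runs alternate between 0s and 1s
    return out.reshape(shape)

mask = np.array([[0, 1, 1],
                 [0, 0, 1]], dtype=np.uint8)
runs = rle_encode(mask)
print(runs)  # [1, 2, 2, 1]
assert (rle_decode(runs, mask.shape) == mask).all()
```

Encoding runs instead of pixels compresses large, mostly uniform masks dramatically, which is what makes a billion-mask release practical to distribute.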
The code is currently accessible on GitHub, and Meta has developed a free interactive demo of its segmentation technology.
Though image segmentation technology is not new, SAM stands out for its ability to recognize objects not included in its training dataset and its partially open approach. Additionally, the release of the SA-1B dataset could trigger a new generation of computer vision applications, similar to how Meta's LLaMA language model is already inspiring derivative projects.
While Meta has not yet launched a commercial product using this type of AI, they have previously employed technology like SAM internally for photo tagging, content moderation, and determining recommended posts on Facebook and Instagram.
Meta's announcement comes amid intense competition among Big Tech companies to lead the AI field: OpenAI's ChatGPT language model has captured widespread attention, Microsoft has integrated OpenAI's technology into its Bing search engine, Google announced its competitor Bard earlier this year, and Amazon recently announced $300k in AWS credits for AI startups.
Given how complementary SAM is with Meta's AR/VR initiatives, we'll likely see more updates in this space.