Meta Launches Computer Vision Model to ID Objects

Meta has launched new computer vision tools that help identify objects within an image.

The tools include the new Segment Anything Model (SAM) and the Segment Anything 1-Billion mask dataset (SA-1B), Meta said in a Wednesday (April 5) blog post.

Their names refer to “segmentation,” which is the process of identifying which image pixels belong to an object, according to the post.
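
To make that definition concrete, here is a minimal sketch of what a segmentation mask is, written in plain Python. It is not Meta's code and does not use SAM. The tiny image, the mask values and the helper names `object_pixels` and `bounding_box` are all made up for illustration. A mask is a grid the same size as the image, where 1 marks a pixel that belongs to the object and 0 marks everything else.

```python
# Toy 4x5 grayscale image: each number is one pixel's brightness (made-up values).
image = [
    [10,  12,  11,  10, 13],
    [11, 200, 210,  12, 10],
    [10, 205, 215, 220, 11],
    [12,  11,  10,  13, 12],
]

# Hypothetical mask for a single object: 1 = pixel belongs to the object, 0 = background.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]


def object_pixels(image, mask):
    """Return (row, col, value) for every image pixel the mask assigns to the object."""
    return [
        (r, c, image[r][c])
        for r, row in enumerate(mask)
        for c, flag in enumerate(row)
        if flag
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the masked region, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, flag in enumerate(row) if flag]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


pixels = object_pixels(image, mask)
print(len(pixels))         # number of pixels assigned to the object
print(bounding_box(mask))  # smallest rectangle that contains the object
```

A segmentation model's job is to produce masks like this one automatically. Once a mask exists, tasks such as cutting an object out of a photo reduce to selecting the pixels it marks.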

SAM is available under a permissive open license, while SA-1B is available for research purposes, the post said.

Because SAM has learned a general idea of what objects are, it can identify objects that it hasn’t seen before, per the post.

“SAM is general enough to cover a broad set of use cases and can be used out of the box on new image ‘domains’ — whether underwater photos or cell microscopy — without requiring additional training (a capability often referred to as zero-shot transfer),” Meta said in the post.

This model could be used to help larger artificial intelligence (AI) systems better understand the world, help content creators extract parts of an image while making collages or editing videos, and help scientific researchers study and track animals or objects, according to the post.

The SA-1B dataset includes 1.1 billion masks spanning diverse geographic regions, income levels and demographics to help SAM perform similarly across different groups and in real-world use cases, the post said.

“By sharing our research and dataset, we hope to further accelerate research into segmentation and more general image and video understanding,” Meta said in the post.

Computer vision, a field of AI, has been used by companies in a number of ways.

For example, retail technology company Pensa has used computer vision and AI to power a shelf-scanning solution that removes the labor required to manage inventory, boosts accuracy, offers more up-to-date information and improves transparency for customers shopping via eCommerce channels.

Google uses computer vision, AI and billions of images to create the high-fidelity representations of places featured in the “immersive view” in Google Maps.

Amazon uses computer vision and sensor technology to register items in its Just Walk Out cashierless checkout technology.