Continuing its commitment to open-source initiatives, Meta has unveiled a new AI benchmark named FACET. FACET, short for “Fairness in Computer Vision Evaluation,” serves as a tool to assess the impartiality of AI models designed for classifying and detecting objects, including individuals, in photos and videos.
The dataset comprises 32,000 images containing 50,000 individuals, each labeled by human annotators. FACET covers person-related classes such as occupations and activities, including “basketball player,” “disc jockey,” and “doctor,” as well as demographic and physical attributes, enabling evaluations of how a model’s performance differs across those classes and attribute groups.
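To make that structure concrete, a single person-level annotation might look roughly like the sketch below. The field names and values here are illustrative assumptions, not FACET’s actual schema.

```python
# Hypothetical sketch of one person-level, FACET-style annotation.
# Field names and values are assumptions for illustration, not Meta's real schema.
example_annotation = {
    "image_id": "img_000123",
    "person_bbox": [154, 87, 412, 603],           # x1, y1, x2, y2 in pixels
    "primary_class": "doctor",                    # occupation/activity class
    "perceived_gender_presentation": "feminine",  # demographic attribute
    "perceived_skin_tone": 4,                     # physical attribute (scale bucket)
    "hair_type": "coily",                         # physical attribute
}
```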
Meta explained its intent behind releasing FACET in a blog post shared with TechCrunch: “By releasing FACET, our goal is to enable researchers and practitioners to perform similar benchmarking to better understand the disparities present in their own models and monitor the impact of mitigations put in place to address fairness concerns. We also encourage researchers to use FACET to benchmark fairness across other vision and multimodal tasks.”
While the concept of benchmarks for examining biases in computer vision algorithms is not new, Meta’s FACET claims to be more thorough than its predecessors. It seeks to answer questions such as whether models exhibit biases in classifying people based on gender presentation or physical attributes like hair type.
FACET was created by having annotators label images for demographic attributes, physical characteristics, and person classes, and combining those annotations with labels drawn from Meta’s Segment Anything 1 Billion (SA-1B) dataset. It remains unclear, however, whether the people depicted were informed that their images would be used for this purpose, and the blog post does not detail how the annotators were recruited or what they were paid.
Historically, many of the annotators hired to label AI datasets have come from lower-income countries, raising concerns about fair compensation; several annotation firms have recently come under scrutiny for paying workers low rates and delaying payments.
According to Meta’s white paper detailing FACET’s development, annotators were described as “trained experts” from various geographic regions, including North America, Latin America, the Middle East, Africa, Southeast Asia, and East Asia. They were compensated based on an hourly wage specific to their respective countries.
Putting aside potential issues surrounding its origins, Meta says FACET can be used to probe classification, detection, instance segmentation, and visual grounding models across demographic attributes. As a test case, Meta applied the benchmark to its own DINOv2 computer vision algorithm and found biases related to gender presentation, including a tendency to stereotypically associate certain professions with particular gender presentations.
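As a rough illustration of the kind of disparity analysis such a benchmark supports, the sketch below compares a classifier’s recall for one person class across groups defined by a demographic attribute. The data layout, field names, and group labels are assumptions for illustration; this is not Meta’s published evaluation code.

```python
from collections import defaultdict

def recall_by_group(predictions, annotations, target_class, attribute):
    """Per-group recall for one class, split by a demographic attribute.

    `predictions` maps person_id -> predicted class label.
    `annotations` maps person_id -> dict with ground-truth class and attributes.
    Both layouts are hypothetical stand-ins for FACET's real format.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for person_id, ann in annotations.items():
        if ann["class"] != target_class:
            continue
        group = ann[attribute]
        totals[group] += 1
        if predictions.get(person_id) == target_class:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}

# Tiny made-up example: two people labeled "doctor", only one classified correctly.
annotations = {
    "p1": {"class": "doctor", "perceived_gender_presentation": "feminine"},
    "p2": {"class": "doctor", "perceived_gender_presentation": "masculine"},
}
predictions = {"p1": "nurse", "p2": "doctor"}

# A large gap between groups (here 0.0 vs 1.0) is the kind of disparity
# a fairness benchmark like FACET is meant to surface at scale.
print(recall_by_group(predictions, annotations,
                      target_class="doctor",
                      attribute="perceived_gender_presentation"))
```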
Meta acknowledges that no benchmark is flawless, and that FACET may not fully capture real-world concepts and demographic groups. The company has also made a web-based dataset explorer tool available, with the stipulation that developers use FACET solely for evaluation, testing, and benchmarking, and not for training computer vision models.