
In 2015, Jacky Alciné, a software engineer in Brooklyn, noticed that his Google Photos account had auto-generated an album titled “Gorillas.” Inside, he found pictures of himself and a friend, incorrectly identified as primates by Google’s facial recognition software. After Alciné posted a screenshot on Twitter, Google’s chief architect of social responded via retweet, apologizing for the mistake. A research team was dispatched to examine the data. In the end, they determined that the problem was not a malicious occurrence but a symptom of a still-developing technology. One employee pointed out other recent cases in which Google Photos had tagged some white faces as dogs and seals.

But here’s the thing: computers might not actually be racist, but they are stupid. Like, really stupid. They may have vast quantities of processing power, but they only know what they are told. And when it comes to machine learning algorithms, this can be a huge problem.

A facial recognition algorithm is a lot like someone painting a portrait. Just as an artist uses the end of a paintbrush to gauge the distance between tear ducts or find the angle between the pupils and the corner of the mouth, facial recognition programs convert the human face into a series of measurements, referred to as “biometrics.” At its core, the technology is that simple: a picture of someone is fed into the computer, the computer models the person’s features with biometrics, and then compares that set of measurements to a library of data in order to find a match.
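The measure-then-compare idea can be sketched in a few lines of Python. This is only a toy illustration, not how any real product works: the three measurements per face, the names in `LIBRARY`, and the `threshold` value are all invented for the example, and production systems use far richer features than a handful of distances.

```python
import math

# Hypothetical "biometric" measurements (arbitrary units), invented for illustration.
LIBRARY = {
    "person_a": [62.0, 41.5, 33.0],
    "person_b": [58.5, 44.0, 30.5],
    "person_c": [65.0, 39.0, 35.5],
}

def distance(a, b):
    """Euclidean distance between two measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(probe, library, threshold=3.0):
    """Return the closest library entry, or None if nothing is close enough."""
    name, dist = min(
        ((n, distance(probe, v)) for n, v in library.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= threshold else None

# A new photo, already reduced to the same three measurements.
probe = [62.5, 41.0, 33.5]
print(best_match(probe, LIBRARY))  # person_a
```

The threshold matters: set it too loosely and strangers start to "match"; set it too tightly and the system fails to recognize the right person in a slightly different photo.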

Biometric Facial Recognition at Houston International Airport, U.S. Customs and Border Protection

But before a computer can start mapping facial features, it needs to know which parts of an image are face and which are not. To teach the software, programmers feed the computer a set of “training data”: an album of many, many faces. When the computer later performs facial recognition tasks, it uses what it learned from the training data as the basis for its decisions.

As a result, the more closely the photos being identified resemble the photos in the training dataset, the better the algorithm performs. Unfortunately, a facial recognition system that operates in the real world has to handle enormous variation in lighting, angle, and image quality. A surveillance photo from a convenience store robbery, for instance, looks very different from a mugshot taken under stark lighting in a controlled environment.

One of the biggest issues concerning training data is race. In a 2018 study, researchers from MIT and Stanford tested the facial recognition systems developed by Microsoft, IBM, and Megvii. The study examined how well each algorithm could guess the genders of more than 1,200 subjects. To ensure a wide range of skin tones, the dataset of subjects was drawn from three African countries and three Nordic countries. The researchers found that all three programs misclassified women of color most often (error rates ranged from 21 to 35 percent). For white male subjects, however, all error rates were lower than one percent.

When looking at training data, the researchers found that one “major U.S. technology company” trained its software on a dataset that was more than 83 percent white and more than 77 percent male.
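To see why a per-group breakdown matters, here is a minimal sketch of how error rates can be computed separately for each demographic group. The records below are invented toy data, not figures from the study, and the group labels and the `error_rate_by_group` helper are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical records: (group, true_gender, predicted_gender). Toy data only.
records = [
    ("darker_female", "F", "M"),
    ("darker_female", "F", "F"),
    ("darker_female", "F", "F"),
    ("darker_female", "F", "M"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
]

def error_rate_by_group(records):
    """Fraction of wrong predictions within each group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

rates = error_rate_by_group(records)
overall = sum(truth != predicted for _, truth, predicted in records) / len(records)

print(rates)    # {'darker_female': 0.5, 'lighter_male': 0.0}
print(overall)  # 0.25
```

In this toy case the overall error rate of 25 percent hides the fact that every single mistake falls on one group, which is why a single headline accuracy number can be misleading.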


Photo by Quick PS on Unsplash

Other systems have encountered similar issues. In 2018, the ACLU tested Amazon’s facial recognition program, Rekognition, using pictures of U.S. lawmakers. When checking for matches against a mugshot database, Rekognition incorrectly identified 28 members of Congress as people who had been arrested for crimes. People of color were hit hardest: they accounted for 39 percent of the false matches, despite making up only 20 percent of Congress.

The most recent high-profile study on facial recognition accuracy was released in late 2019 by the National Institute of Standards and Technology. The researchers examined 189 different algorithms, voluntarily submitted by 99 developers. The algorithms were given a dataset of more than 18 million pictures. Shockingly, the researchers found that Asian and Black people were up to 100 times more likely to be misidentified than white men. Native Americans, meanwhile, had the highest false-positive rate (where one person is incorrectly identified as another).

The NIST study also found that biases existed across a variety of search types. In one-to-many searches, which compare a single image to a large database in order to find a match, Black women were commonly misidentified. This disparity is alarming, given that one-to-many searches are most often employed by police investigators looking for a suspect. In one-to-one matching, the kind used for unlocking phones or checking a passport, Asian, Black, and Native American people all faced higher false-positive rates.

Algorithms developed in Asian countries, however, showed a much smaller gap between white and Asian error rates, suggesting that the racial distribution of the training data may indeed be a factor in resolving error rate disparities.
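The two search types in the NIST study, one-to-one and one-to-many, along with the false-positive rate, can be illustrated with a rough Python sketch. Everything here is assumed for the sake of the example: the `similarity` score, the 0.5 threshold, and the two-number "faces" in `gallery` are made up and do not reflect any algorithm NIST tested.

```python
def similarity(a, b):
    """Toy similarity in (0, 1]: 1.0 means identical measurement vectors."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)

def verify(probe, enrolled, threshold=0.5):
    """One-to-one: does the probe match a single claimed identity?"""
    return similarity(probe, enrolled) >= threshold

def identify(probe, gallery, threshold=0.5):
    """One-to-many: return every gallery identity that clears the threshold."""
    return sorted(
        name for name, vec in gallery.items()
        if similarity(probe, vec) >= threshold
    )

def false_positive_rate(impostor_pairs, threshold=0.5):
    """Share of different-person pairs wrongly accepted as the same person."""
    accepted = sum(verify(a, b, threshold) for a, b in impostor_pairs)
    return accepted / len(impostor_pairs)

# Hypothetical gallery: id_1 and id_2 are different people with similar measurements.
gallery = {"id_1": [10.0, 20.0], "id_2": [10.5, 20.0], "id_3": [30.0, 5.0]}
probe = [10.2, 20.0]

print(verify(probe, gallery["id_1"]))  # True
print(identify(probe, gallery))        # ['id_1', 'id_2']

# Pairs of measurements known to come from different people.
impostor_pairs = [
    (gallery["id_1"], gallery["id_2"]),
    (gallery["id_1"], gallery["id_3"]),
]
print(false_positive_rate(impostor_pairs))  # 0.5
```

Note how the one-to-many search returns two candidates for a single probe. If a system separates some groups of faces less well than others, those groups will produce more of these spurious candidates, which is the kind of disparity the false-positive rate is meant to capture.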


More than an indicator that the technology might not be ready for widespread implementation, facial recognition’s high error rate for faces of color is an example of how the biases within society become concrete, systemic disadvantages. Take the issue of training data representation, for instance. In many cases, well-lit, high quality albums of people of color aren’t as readily available. From the earliest days of color pictures, photo technology has been optimized for pale complexions. Even today, cell phone light sensors and digital cameras struggle to capture dark skin tones in a variety of conditions.

Photo by Marc Mueller on Unsplash

We like to believe that while technology has the potential to exacerbate societal issues like racism and misogyny, that is merely a problem of implementation. But computers only know what they are told, and when they are designed to prioritize certain demographics, that’s exactly what they’ll do. If we fail to examine our technology against the social context in which it is developed, we can mistake human prejudice for scientific fact. We start to believe that cameras and photo tagging software don’t need to improve; it’s just that Black people are not photogenic. It’s not that training data is too homogeneous; it’s that most people of color look alike. And that is a dangerous road to go down.

Two years after the gorilla debacle, Google Photos finally “fixed” its tagging problem — by completely removing the tags “gorilla,” “chimpanzee,” and “monkey” from the platform. Mission…accomplished?

Photo by yarne fiten on Unsplash




