Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms

Edited by Thomas D. Albright, The Salk Institute for Biological Studies, La Jolla, CA, and approved April 30, 2018 (received for review December 13, 2017)
May 29, 2018
115 (24) 6171-6176

Significance

This study measures face identification accuracy for an international group of professional forensic facial examiners working under circumstances that apply in real-world casework. Examiners and other human face “specialists,” including forensically trained facial reviewers and untrained superrecognizers, were more accurate than the control groups on a challenging test of face identification. Therefore, specialists are the best available human solution to the problem of face identification. We present data comparing state-of-the-art face recognition technology with the best human face identifiers. The best machine performed in the range of the best humans: professional facial examiners. However, optimal face identification was achieved only when humans and machines worked in collaboration.

Abstract

Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification: using people and/or machines working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, we found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs), developed between 2015 and 2017, identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, we fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. Single forensic facial examiners fused with the best algorithm were more accurate than the combination of two examiners. Therefore, collaboration among humans and between humans and machines offers tangible benefits to face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible.
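
The fusion step described above, averaging the rating-based identity judgments of several examiners, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the 7-point rating scale from -3 (sure the images show different people) to +3 (sure they show the same person), the toy ratings, the use of area under the ROC curve (AUC) as the accuracy measure, and the hypothetical names `auc`, `fuse`, `examiner_a`, and `examiner_b`.

```python
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (same-identity, different-identity) pairs in which the same-identity
    item receives the higher score. Ties count as half a win."""
    same = [s for s, y in zip(scores, labels) if y == 1]
    diff = [s for s, y in zip(scores, labels) if y == 0]
    if not same or not diff:
        raise ValueError("need at least one item of each class")
    wins = sum(
        1.0 if s > d else 0.5 if s == d else 0.0
        for s in same
        for d in diff
    )
    return wins / (len(same) * len(diff))


def fuse(ratings_by_rater):
    """Fuse judgments by averaging, for each image pair, the ratings
    given by all raters (one inner list per rater)."""
    return [mean(column) for column in zip(*ratings_by_rater)]


# Toy data (hypothetical): six image pairs, 1 = same identity, 0 = different.
same_identity = [1, 1, 1, 0, 0, 0]
# Assumed rating scale: -3 (sure different) to +3 (sure same).
examiner_a = [3, 1, -1, -2, 0, -3]
examiner_b = [2, -1, 1, 0, -3, -2]

fused = fuse([examiner_a, examiner_b])

print(round(auc(examiner_a, same_identity), 3))  # 0.889
print(round(auc(examiner_b, same_identity), 3))  # 0.889
print(auc(fused, same_identity))                 # 1.0
```

In this toy case each examiner makes one ordering error on a different image pair, so averaging cancels both errors. Fusing an examiner with an algorithm would follow the same pattern, provided the algorithm's similarity scores are first mapped onto the same scale as the human ratings; how that mapping is done is left open here.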

Acknowledgments

Work was funded in part by the Federal Bureau of Investigation (FBI) to the NIST; the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) via IARPA R&D Contract 2014-14071600012 (to R.C.); Australian Research Council Linkage Projects LP160101523 (to D.W.) and LP130100702 (to D.W.); and National Institute of Justice Grant 2015-IJ-CX-K014 (to A.J.O.). The views and conclusions contained herein should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, the IARPA, or the FBI. The US Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation thereon. The identification of any commercial product or trade name does not imply endorsement or recommendation by the NIST.

Supporting Information

Appendix (PDF)
Dataset_S01 (CSV)
Dataset_S02 (CSV)

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences
Vol. 115 | No. 24
June 12, 2018
PubMed: 29844174

Submission history

Published online: May 29, 2018
Published in issue: June 12, 2018

Keywords

  1. face identification
  2. forensic science
  3. face recognition algorithm
  4. wisdom-of-crowds
  5. machine learning technology

Notes

This article is a PNAS Direct Submission.

Authors

Affiliations

Information Access Division, National Institute of Standards and Technology, Gaithersburg, MD 20899;
Amy N. Yates
Information Access Division, National Institute of Standards and Technology, Gaithersburg, MD 20899;
Ying Hu
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Carina A. Hahn
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Eilidh Noyes
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Kelsey Jackson
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Jacqueline G. Cavazos
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Géraldine Jeckeln
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;
Rajeev Ranjan
Department of Electrical and Computer Engineering, University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854;
Swami Sankaranarayanan
Department of Electrical and Computer Engineering, University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854;
Jun-Cheng Chen
University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854;
Carlos D. Castillo
University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854;
Rama Chellappa
Department of Electrical and Computer Engineering, University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20854;
David White
School of Psychology, The University of New South Wales, Sydney, NSW 2052, Australia;
Alice J. O’Toole
School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX 75080;

Notes

1. To whom correspondence should be addressed. Email: [email protected].
Author contributions: P.J.P., A.N.Y., D.W., and A.J.O. designed research; R.R., S.S., J.-C.C., C.D.C., and R.C. contributed new reagents/analytic tools; P.J.P., A.N.Y., Y.H., C.A.H., E.N., K.J., J.G.C., G.J., and A.J.O. analyzed data; R.R., S.S., J.-C.C., C.D.C., and R.C. implemented and ran the face recognition algorithms; and P.J.P. and A.J.O. wrote the paper.

Competing Interests

Conflict of interest statement: The University of Maryland is filing a US patent application that will cover portions of algorithms A2017a and A2017b. R.R., C.D.C., and R.C. are coinventors on this patent.
