
How Does Facial Recognition Work? Understanding the Technology Behind the Screen

Published on

January 18, 2025

Introduction

We all have faces: that's a fact. The first thing we see after birth is a face, and over the following days and months we keep seeing faces. Studies show that newborns prefer patterns that resemble a face (in the figure below, newborns prefer the patterns in the top row over those in the bottom row) [1]:

Forms showing preference of face-like images

Because it is so easy and natural for humans to identify a person by looking at their face, the face has been used to identify people since the invention of photography. Indeed, it was in 1882 that Alphonse Bertillon introduced the "mugshot" at the Paris Police Station, a way of registering identities that is still in place today:

Mugshot representation — Alphonse Bertillon, Self-portrait. Source: https://en.wikipedia.org/wiki/Alphonse_Bertillon

Facial recognition was also one of the earliest human abilities that computer systems were built to replicate. In fact, the first known facial recognition system was implemented back in 1965 by Woody Bledsoe (see Wired, "The Secret History of Facial Recognition": https://www.wired.com/story/secret-history-facial-recognition/).

Nevertheless, it was not until the 2010s, with the advances in deep convolutional neural networks, that systems performing on par with humans were developed. Before deep learning, facial recognition systems performed poorly. Indeed, a NIST evaluation report on facial recognition systems states that "The major result of the evaluation is that massive gains in accuracy have been achieved in the last five years (2013-2018) and these far exceed improvements made in the prior period (2010-2013). While the industry gains are broad - at least 28 developers’ algorithms now outperform the most accurate algorithm from late 2013 - there remains a wide range of capabilities. With good quality portrait photos, the most accurate algorithms will find matching entries, when present, in galleries containing 12 million individuals, with error rates below 0.2%."

This "revolution" enabled an explosion of facial recognition uses and applications. Today, facial recognition is everywhere: from the mundane, such as unlocking our phones, to the more sophisticated, such as automated immigration processing, to the more controversial, such as public surveillance. But what is facial recognition, and how does it work? The following sections explain.

What Is Facial Recognition?

In short, facial recognition is a type of biometric system in which a person is recognized from the information provided by their face.

Definition: a biometric system is a system for the purpose of the biometric recognition of individuals based on their behavioral and biological characteristics (ISO/IEC 2382-37).
Definition: biometric recognition is the automated recognition of individuals based on their biological and behavioral characteristics (ISO/IEC 2382-37).
Definition: a biometric characteristic is any biological and behavioral characteristic of an individual from which distinguishing, repeatable biometric features can be extracted for the purpose of biometric recognition.

Examples of biometric characteristics are: Galton ridge structure, face topography, facial skin texture, hand topography, finger topography, iris structure, vein structure of the hand, ridge structure of the palm, retinal pattern, handwritten signature dynamics, etc. (ISO/IEC 2382-37).

Automated facial recognition technology takes one or more photos of a person, or images obtained from a video feed, and converts them into a facial template, that is, a mathematical representation of the face. Then, a matching algorithm (i.e., a set of instructions) compares that template with a template generated from another photo and calculates their level of similarity. The templates can be stored independently of the images of the faces.

How Facial Recognition Works: The Basic Steps

As mentioned before, facial recognition systems (FRS) are a type of biometric system. Like any biometric system, an FRS is simple to describe in terms of the functionalities it provides. There are three main functionalities associated with any biometric system:

Enrolment. This is the recording of the biometric characteristics captured for a given person, along with some type of identifier that uniquely represents them in the system (for instance, a user name, passport number or national ID). This process creates a facial database which, as we will see shortly, is a key component of any FRS. Even when an FRS is used only to verify a single identity (as when unlocking a phone), the biometric characteristic must be stored.
Verification. In this functionality, the system verifies whether the person is who they claim to be. The user's biometric sample is compared exclusively with the sample stored for the claimed identity in order to confirm it.

Graph representing identity verification

Identification. In this process, the system determines the identity of the person without requiring them to declare it. The outcome of an identification process is often a list of possible candidates, ranked by their degree of similarity or resemblance, based on the criteria defined by the system.

Graph representing biometric identification
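
The three functionalities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FaceDatabase` class, the three-value templates and the 0.8 threshold are all hypothetical, and cosine similarity is used as a stand-in for whatever matching algorithm a real FRS applies.

```python
from dataclasses import dataclass, field
from math import sqrt


def similarity(a, b):
    """Cosine similarity between two templates (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class FaceDatabase:
    """Hypothetical in-memory store mapping an identifier to a facial template."""
    templates: dict = field(default_factory=dict)

    def enrol(self, identifier, template):
        # Enrolment: record the template together with a unique identifier.
        self.templates[identifier] = template

    def verify(self, claimed_id, probe, threshold=0.8):
        # Verification (1:1): compare only against the claimed identity.
        return similarity(self.templates[claimed_id], probe) >= threshold

    def identify(self, probe, top_k=3):
        # Identification (1:N): rank every enrolled identity by similarity.
        scored = [(similarity(t, probe), ident) for ident, t in self.templates.items()]
        scored.sort(reverse=True)
        return [(ident, score) for score, ident in scored[:top_k]]


db = FaceDatabase()
db.enrol("alice", [0.9, 0.1, 0.3])   # made-up templates
db.enrol("bob", [0.1, 0.8, 0.5])
probe = [0.85, 0.15, 0.35]           # made-up template of a new photo

print(db.verify("alice", probe))     # claim "I am alice"
print(db.verify("bob", probe))       # claim "I am bob"
print(db.identify(probe))            # ranked candidate list, no claim needed
```

Note that verification touches a single stored template, whereas identification compares the probe against every enrolled template and returns a ranked candidate list.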

In order to implement any of these functionalities, an FRS must capture information from the real world and translate it into a digital representation. A number of steps are involved in this process:

Steps of the process for face recognition system

Once the template is created, it can be compared against other templates (which is known as "Facial Template Matching"), and finally a decision is made on whether the templates belong to the same person. The next diagram illustrates the entire process:

You may ask whether facial recognition can be fooled, but that is a question for another article. Now, let's go deeper into each of the steps.

Capture of the face image

One of the reasons for the massive use of facial recognition systems is that, in most cases, a simple camera is the only sensor required to capture the information. Any mobile phone, laptop or PC has a built-in camera with everything needed to capture a proper facial image. Even for other purposes such as video surveillance, the cameras were already there before any FRS existed. Although there are special cases where additional sensors are used (for instance, the iPhone cameras that obtain a depth map, or NIR cameras for capturing images at night), most facial recognition systems are based on simple images.

Although this process is quite simple, there is still an additional step to be performed before generating the facial template: locating and isolating the face from a picture. This process is critical, because if no face is detected in the picture, the system will not continue with the process (in the following image, the system detects two out of three faces):

Generation of the facial template

Once the sample is captured (in the form of a digital image), the system processes the image, extracts the relevant information, and creates a facial template. Why is this needed? Why can't we just compare the images? The main reason is the variability of the pictures. Consider, for example, a set of images of John Turturro taken from the Internet:

Set of images of John Turturro — Google Search. *John Turturro images.*

We can see that in some pictures John wears glasses, in others he is young, in others he is older, in others he is playing a specific character, etc. We need a way to extract all the John Turturroness and create a representation that is as unique as possible. This is what this step does: given a picture, it creates a compact representation of the face, known as a facial template. This step is the most important one in the entire process, since a bad representation inevitably leads to a bad facial recognition system. We will explain how this is achieved in a later section; for now, let's assume that some kind of magic process creates such a facial template and continue with the rest of the FRS steps.

Matching facial templates

To determine if two pictures are from the same person, we compare the two facial templates. Because these templates were created in a way that condenses the uniqueness of the person, two templates from the same person are expected to be similar, whereas two templates from different persons are expected to be different. Usually, this matching process generates a score related to how similar two templates are: the higher the score, the more similar the system believes that the two templates are.
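
As a minimal sketch of such a matching score, the snippet below uses cosine similarity between L2-normalized templates, which is a common choice but not necessarily the one used by any particular FRS. The four-value templates are made up for illustration; real templates typically have hundreds of values.

```python
from math import sqrt


def l2_normalize(v):
    """Scale a template so that its Euclidean length is 1."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def match_score(t1, t2):
    """Cosine similarity in [-1, 1]; higher means more similar."""
    a, b = l2_normalize(t1), l2_normalize(t2)
    return sum(x * y for x, y in zip(a, b))


# Made-up templates: two photos of the same person and one of someone else.
john_young = [0.62, -0.10, 0.45, 0.30]
john_older = [0.58, -0.05, 0.50, 0.28]
someone_else = [-0.20, 0.70, 0.10, -0.40]

same = match_score(john_young, john_older)
different = match_score(john_young, someone_else)
print(f"same person:      {same:.3f}")
print(f"different people: {different:.3f}")
```

With these toy values the two templates of the same person score far higher than the pair from different people, which is the property the matching step relies on.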

Decision

Due to the imperfect nature of the capture and feature extraction processes that generate the facial template, comparing two facial templates from the same person will practically never yield a perfect match, even under ideal capture conditions. Therefore, the last step of a facial recognition system is to decide, typically by comparing the matching score against a threshold, whether two samples correspond to the same person.

This decision stage is very important in any facial recognition system, and the selection of the threshold or thresholds is crucial for the final operation of the entire system. If you want to learn more about this decision step, stay tuned for our article on how the performance of a biometric system is determined.
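
To see why the threshold matters, the sketch below counts the two kinds of error at several thresholds: the false non-match rate (FNMR, genuine comparisons that are rejected) and the false match rate (FMR, impostor comparisons that are accepted). The scores are invented for illustration, and the `error_rates` helper is a hypothetical name, not part of any library.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FNMR, FMR) when scores >= threshold are accepted as a match."""
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    return (false_non_matches / len(genuine_scores),
            false_matches / len(impostor_scores))


# Invented scores: comparisons of the same person vs. of different people.
genuine = [0.91, 0.88, 0.95, 0.62, 0.84, 0.97, 0.79, 0.90]
impostor = [0.12, 0.35, 0.48, 0.66, 0.20, 0.05, 0.41, 0.30]

for threshold in (0.5, 0.7, 0.9):
    fnmr, fmr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FNMR={fnmr:.3f}  FMR={fmr:.3f}")
```

Raising the threshold reduces false matches at the cost of rejecting more legitimate users, so the operating point has to be chosen according to the application.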


Key Technologies Behind Facial Recognition

As we've mentioned before, virtually all facial recognition implemented today is based on deep learning techniques and architectures. In 2014, a team of researchers from Facebook and Tel Aviv University published DeepFace [2], and one year later a team from Google published FaceNet [3]. Since then, a constellation of papers has improved these results even further. Proof of this is the number of algorithms evaluated by NIST in its ongoing "Face Recognition Vendor Test": as of October 2024, 375 different algorithms had been tested.

Although each system has its own architecture, training dataset and training procedure, most of them are based on Convolutional Neural Networks, and trained over millions of pictures of several thousand individuals. A typical architecture includes a CNN backbone, followed by a number of fully-connected layers. For instance, the proposed DeepFace network architecture is the following:

One question that arises all the time is whether the network has to be retrained in order to work on a given dataset. In most cases, that is not necessary. The training process is conceptually simple: given a dataset with several images per individual, the network learns to classify each image into the corresponding identity. But this was done on one particular dataset, so how does the network generalize to a totally different dataset with different identities? We simply remove the classification layer and keep the output of the last fully connected layer, which is usually a 1D vector (for instance, of 512 or 1024 values). This vector is the facial template, which is then used in the facial recognition system in the way explained before.
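
The idea of dropping the classification layer can be illustrated with a toy network written in plain Python. Everything here is an assumption made for the sake of the sketch: the weights are random instead of trained, the layer sizes are tiny, and a random vector stands in for the output of the CNN backbone. It is not the DeepFace or FaceNet architecture.

```python
import random


def dense(x, weights, bias):
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def random_layer(rng, n_in, n_out):
    """Random stand-in for trained weights (a real network learns them)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
INPUT_DIM, TEMPLATE_DIM, NUM_TRAIN_IDENTITIES = 16, 8, 5

last_fc = random_layer(rng, INPUT_DIM, TEMPLATE_DIM)               # last fully connected layer
classifier = random_layer(rng, TEMPLATE_DIM, NUM_TRAIN_IDENTITIES)  # classification layer


def embed(features):
    """Facial template: the output of the last fully connected layer."""
    return relu(dense(features, *last_fc))


def classify(features):
    """Training-time head: index of one of the training identities."""
    logits = dense(embed(features), *classifier)
    return max(range(len(logits)), key=logits.__getitem__)


features = [rng.uniform(0, 1) for _ in range(INPUT_DIM)]  # pretend backbone output
template = embed(features)
print(len(template))       # template size is independent of the identities seen in training
print(classify(features))  # only meaningful for the training identities
```

At deployment time only `embed` is needed: the classifier is tied to the identities seen during training, whereas the template can be computed for any face and compared with other templates.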


Advantages and Disadvantages of Facial Recognition Technology

When discussing the advantages and disadvantages of a facial recognition system, we first need to consider the available alternatives; this is perhaps the first step in deciding whether an FRS is the right answer. The answer also depends on the application we have in mind. At Digital Sense, we've developed for the IADB a set of good practices on the use of biometrics. Although it is focused on social services, the recommendations are general enough to apply to any biometric system.

There is one clear situation in which the face is the preferred biometric characteristic: when the system is expected to work unattended. Although other biometric characteristics can be more accurate (for instance, fingerprints), facial recognition is the only one that can be easily implemented without additional devices (such as fingerprint or iris scanners). What is more, any unattended biometric system must have a way to determine whether the person interacting with it is real (liveness detection). At Digital Sense, we've developed an SDK that helps organizations include these functionalities in their applications; see, for example, the Mi Dinero case study.

Common Uses and Applications of Facial Recognition

Facial recognition is being employed extensively in various scenarios. For instance, it is used to unlock iPhones and other mobile devices.

Another example is immigration control, where many countries have deployed automated gates (eGates) that match a passenger’s face with the image stored in their passport to confirm and record their entry or departure. At Digital Sense, we've deployed our facial recognition system at Punta del Este airport to increase the security of immigration transactions.

It is also used in cases where identifying the individual isn't the primary goal, such as tracking people's movement through a shopping mall by linking different camera views. Police forces also use this technology to conduct broad searches for convicted individuals.

Privacy Concerns and Ethical Challenges

The way to assure a population that its biometric data is being used appropriately is through solid data management policies, with clear protocols for use and transparency. When deploying biometric systems, compliance with the legal framework is of utmost importance. This is why there is a set of standards and regulations that must be considered when designing a biometric solution. In particular, it is highly recommended to follow the "privacy by design" principles.

Privacy by design

Privacy by design is a framework for the development of systems and processes in which the protection and privacy of data are incorporated from the very conception of the system. The concept of 'privacy by design' was developed by Dr. Ann Cavoukian [4] and later standardized in ISO/IEC 29100 (Privacy framework) [5], which describes 11 basic principles. These principles are closely related to the protection of personal data in general and to the principles included in the EU GDPR.

Consent and choice
Unless clearly stated by law, all collection, storage, and disclosure of data must be approved by the individual. If possible, the individual should be informed about: what type of biometric trait will be acquired and when, who is authorized to do so and why, who else is authorized to access their biometric data and for what purpose, who will protect, store, transmit, access, or link their biometric data, and for how long the data will be stored.
Purpose legitimacy and specification
The intentions behind collecting, using, retaining, and disclosing biometric data should be conveyed to the individual (data subject) during or prior to the data collection process. These purposes should be explicitly defined, restricted, and pertinent to the specific circumstances. It is of utmost importance that this is clear, not only to those responsible for the biometric system and its technicians, but to all involved parties (operators, officials in general, and obviously the individuals themselves). It is common for information regarding the purpose to be published on the organization's website.
Biometric Collection Limitation
The registration of biometric information must be fair, lawful and be subject to limits, responding strictly to the needs for which it is requested. When analyzing a biometric system, potential risks to the individual's privacy must be considered, and it must be ensured that these risks are proportional to the benefits that the biometric system will provide. For example, if a fingerprint-based system is being considered, how many fingerprints will be requested? What are the benefits and risks of requesting more or fewer fingerprints?
Data Minimization
The biometric data stored in the systems should be the minimum necessary. Whenever possible, it should not be possible to correlate transactions generated in different systems with each other.
Use, retention and disclosure limitation
The use, retention, and disclosure of biometric data shall be limited to the relevant purposes identified to the individual, for which he or she has consented, except where otherwise required by law. Personal information shall be retained only as long as necessary to fulfill the stated purposes, and then securely destroyed.
Accuracy and Quality
Biometric data must be correct, complete, and up-to-date. In particular, keeping biometric data up-to-date is of vital importance, especially those traits that change significantly over time.
Openness, transparency and notice
All parties involved must be adequately informed about policies and practices regarding the handling of personal data.
Individual participation and access
People should have the ability to access their personal data and be informed about its uses and instances of disclosure. This enables users to notify the organization of any inaccuracies in the data, so that the necessary corrections can be made.
Accountability
It is necessary to document and communicate to all those involved the policies and procedures established for the proper management of privacy. Ideally, organizations should designate a specific role responsible for safeguarding personal data.
Information Security
Organizations are accountable for ensuring the security of stored biometric data, and they should adhere to recognized standards and implement best practices in security.
Privacy compliance
Organizations should create mechanisms for handling complaints and providing remedies, making this information accessible to the public. This should include clear communication on how individuals can escalate their concerns to the next level of appeal.


European Artificial Intelligence Act

In July 2024, the European Union published the final text of the Artificial Intelligence Act in the Official Journal (OJ). As stated on its web page: "The Act assigns applications of AI to three risk categories. First, applications and systems that create an unacceptable risk, such as government-run social scoring of the type used in China, are banned. Second, high-risk applications, such as a CV-scanning tool that ranks job applicants, are subject to specific legal requirements. Lastly, applications not explicitly banned or listed as high-risk are largely left unregulated." This is crucial for facial recognition systems, which are specifically mentioned several times in the Act. For instance, the following activities are prohibited:

compiling facial recognition databases by untargeted scraping of facial images from the internet or CCTV footage.
inferring emotions in workplaces or educational institutions, except for medical or safety reasons.
‘real-time’ remote biometric identification (RBI) in publicly accessible spaces for law enforcement, except when:
- searching for missing persons, abduction victims, and people who have been human trafficked or sexually exploited;
- preventing substantial and imminent threat to life, or foreseeable terrorist attack; or
- identifying suspects in serious crimes (e.g., murder, rape, armed robbery, narcotic and illegal weapons trafficking, organised crime, and environmental crime, etc.).

Remote biometric use cases that do not fall under the prohibitions above are considered high-risk:

remote biometric identification systems, excluding biometric verification that confirms a person is who they claim to be;
biometric categorisation systems inferring sensitive or protected attributes or characteristics;
emotion recognition systems.

Algorithm Discrimination and Bias

The term ‘bias’ typically refers to a systematic error or distortion that affects the accuracy or fairness of a system. In biometric systems, the most common biases are associated with an individual's gender, age, and race. There are several reasons for these biases, but the most common ones are related to decisions made during the design of a solution. For example, most facial recognition algorithms are based on a machine learning approach, which means that some data was used to train them. Depending on the data used, the facial recognition system may have racial biases (for example, if trained mostly with images of Asian people, it may have lower accuracy when dealing with images of Caucasian or Black people). But these biases can also be seen in systems that do not use any machine learning approach. For instance, the most notable bias in fingerprint-based systems is linked to age: accuracy is commonly considerably lower for the fingerprints of minors than for those of adults. Thus, it is strongly recommended to always evaluate a biometric system in a context as similar as possible to the environment in which it will be installed.
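
One simple way to look for this kind of bias during an evaluation is to compute the error rate separately for each group instead of a single global figure. The sketch below does this for the false non-match rate; the group labels, scores and threshold are invented, and `fnmr_by_group` is a hypothetical helper rather than a standard function.

```python
from collections import defaultdict


def fnmr_by_group(trials, threshold):
    """False non-match rate per group.

    `trials` holds (group, score) pairs for genuine (same-person) comparisons;
    a score below `threshold` counts as a wrongly rejected comparison.
    """
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for group, score in trials:
        totals[group] += 1
        if score < threshold:
            rejected[group] += 1
    return {group: rejected[group] / totals[group] for group in totals}


# Invented evaluation data for two demographic groups.
trials = [
    ("group_a", 0.93), ("group_a", 0.88), ("group_a", 0.91), ("group_a", 0.67),
    ("group_b", 0.81), ("group_b", 0.64), ("group_b", 0.59), ("group_b", 0.90),
]

rates = fnmr_by_group(trials, threshold=0.7)
for group, rate in sorted(rates.items()):
    print(f"{group}: FNMR={rate:.2f}")
```

A large gap between groups at the same threshold is a signal that the system should be examined more closely before deployment in that population.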


Future of Facial Recognition: What to Expect

The future of facial recognition holds exciting possibilities across various domains, with several innovations, but also ethical considerations shaping its evolution. Here are some key trends and future directions:

Improve accuracy. Facial recognition systems have achieved very good performance, and for a number of applications this performance is enough. Nevertheless, there is still room for improvement, in particular for FRS that require a high degree of accuracy. For instance, as of October 2024, NIST reports that the FNMR (false non-match rate) at an FMR (false match rate) of 10^-6 is close to 10^-3, depending on the test performed. In practice, that means that if a system is set up to allow only one false acceptance in a million, it rejects about one in every thousand legitimate people. This is even more challenging for identification on big databases, since the number of false matches grows roughly linearly with the size of the database.
Improve liveness detection. Liveness detection, that is, determining whether the person interacting with the system is real, is key to having a secure system. The use of generative AI to generate fake images in real time presents a real challenge for most liveness detection systems.

Compliance with the law and other ethical implications. AI frameworks that reduce biases (gender, race, age) in facial recognition will be essential for creating fairer and more inclusive systems. Continuous ethical oversight may be required to ensure responsible use, especially in public surveillance. Compliance with existing regulations (like the European Artificial Intelligence Act) will be mandatory for most facial recognition systems that intend to offer their services worldwide.
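
The arithmetic behind the accuracy figures above can be checked with a short script. It is a back-of-the-envelope sketch: the number of attempts and the gallery sizes are made up, and the identification formula assumes that every comparison against the database is independent, which real systems only approximate.

```python
fmr = 1e-6    # false match rate: one false acceptance per million impostor comparisons
fnmr = 1e-3   # false non-match rate at that operating point

genuine_attempts = 50_000  # hypothetical number of legitimate verification attempts
print(f"expected false rejections: {fnmr * genuine_attempts:.0f}")


def false_positive_identification_rate(fmr, gallery_size):
    """Chance that a person who is not enrolled matches at least one entry
    of the database, assuming independent comparisons."""
    return 1.0 - (1.0 - fmr) ** gallery_size


for gallery_size in (1_000, 100_000, 1_000_000):
    exact = false_positive_identification_rate(fmr, gallery_size)
    linear = fmr * gallery_size
    print(f"N={gallery_size:>9,}  exact={exact:.4f}  linear approx={linear:.4f}")
```

For small databases the error grows almost exactly linearly with the number of enrolled people; once the product of FMR and database size approaches 1, a false match becomes more likely than not, which is why large-scale identification needs much stricter operating points than one-to-one verification.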


References

[1] - F. Simion, V. M. Cassia, C. Turati, and E. Valenza, "The origins of face perception: Specific versus non-specific mechanisms," Infant Child Devel., vol. 10, pp. 59–65, 2001.

[2] - Taigman, Yaniv, et al. "Deepface: Closing the gap to human-level performance in face verification." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.

[3] - Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.

[4] - Cavoukian, A. Privacy by design: The 7 foundational principles. Information and privacy commissioner of Ontario, Canada, 5, 12. 2009

[5] - ISO/IEC (2011). 29100:2011 - Information technology — Security techniques — Privacy framework. Available at: https://www.iso.org/standard/45123.html