What Is Face Detection And How Does It Work?

Santiago Peña Mosquera
May 17, 2021


Many people have become familiar with the term face detection, but how does it work?

Over the last few decades, face detection has become one of the most relevant and promising advances in the field of image analysis, moving from something seen only in movies into real life and even into everyday activities such as taking photos and using social networks.

Face detection is an artificial intelligence technology used to find human faces in images. It can be applied in fields such as security, biometrics, and law enforcement, and to provide real-time surveillance and monitoring.

In recent years, many studies have been published to make the field more advanced and accurate, progressing from rudimentary computer vision techniques to machine learning, sophisticated artificial neural networks, and other related technologies.

How Face Detection Works

Face detection uses algorithms and machine learning to find human faces in images that usually also contain other objects, such as landscapes, buildings, and other parts of the human body. These algorithms usually start by looking for the eyes, one of the easiest features to find, followed by the eyebrows, mouth, nose, nostrils, and irises. Once the algorithm has found a candidate facial region, it applies additional tests to confirm it.

For an algorithm to be accurate, it must be trained with huge data sets containing hundreds of thousands of images, some with faces and some without. This training improves the algorithm's ability to decide whether an image contains faces and where they are.
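
As a rough, illustrative sketch of this idea (not any specific production detector), the Python snippet below scans a grayscale image with a sliding window and asks a classifier to label each window as face or not face. The looks_like_face function is a toy stand-in for a model trained on such a data set.

```python
import numpy as np

def looks_like_face(patch: np.ndarray) -> bool:
    """Toy stand-in for a trained face/non-face classifier.
    A real detector would use a model trained on thousands of labelled images;
    here we only check that the patch has some texture at all."""
    return patch.std() > 30

def sliding_window_detect(gray: np.ndarray, win: int = 64, step: int = 16):
    """Scan the image and collect every window the classifier accepts."""
    detections = []
    h, w = gray.shape
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            if looks_like_face(gray[y:y + win, x:x + win]):
                detections.append((x, y, win, win))   # (x, y, width, height)
    return detections

# Example with a random "image"; a real image would come from cv2.imread(...).
print(len(sliding_window_detect(np.random.randint(0, 255, (480, 640)))))
```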

Face Detection Methods

Researchers David Kriegman, Ming-Hsuan Yang, and Narendra Ahuja, from the University of California, classified face detection methods into the four categories below; a given face detection algorithm can belong to two or more of them.

Knowledge-Based

This method depends on a set of rules and relies on human knowledge to detect faces. The rules can state, for example, that a face must have eyes, a mouth, and a nose in certain positions and within certain distances of each other. The difficulty lies in creating an appropriate rule set: if the rules are too general, many false positives occur, and if they are too strict, many false negatives are produced. On its own, this method is considered unable to find multiple faces across multiple images.
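
As a toy illustration of such a rule set, the sketch below checks a few geometric constraints on hypothetical (x, y) landmark positions. The thresholds are illustrative assumptions, not values from any published system; tightening or loosening them trades false negatives for false positives exactly as described above.

```python
# Toy knowledge-based check on hypothetical (x, y) landmark positions,
# with y growing downward as in image coordinates. The 0.25 / 0.5 thresholds
# are illustrative assumptions, not values from any published rule set.
def satisfies_face_rules(left_eye, right_eye, nose, mouth) -> bool:
    eye_dist = abs(right_eye[0] - left_eye[0])
    eyes_level = abs(right_eye[1] - left_eye[1]) < 0.25 * eye_dist    # eyes roughly level
    nose_below_eyes = nose[1] > max(left_eye[1], right_eye[1])        # nose below the eyes
    mouth_below_nose = mouth[1] > nose[1]                             # mouth below the nose
    nose_centered = abs(nose[0] - (left_eye[0] + right_eye[0]) / 2) < 0.5 * eye_dist
    return eyes_level and nose_below_eyes and mouth_below_nose and nose_centered

print(satisfies_face_rules((30, 40), (70, 42), (50, 60), (50, 80)))   # plausible layout -> True
print(satisfies_face_rules((30, 40), (70, 42), (50, 20), (50, 10)))   # scrambled layout -> False
```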

Template-Matching

In this method, predefined templates are used to detect faces by measuring the correlation between the input photos and the templates. A template can state, for example, that a human face is divided into nose, mouth, eye, and face-contour regions. A facial model can also be built only from edges, using an edge detection method. Although this method is easy to implement, on its own it is usually insufficient for face detection.
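
A minimal sketch of the correlation idea using OpenCV's template matching is shown below. The image file names and the 0.7 threshold are illustrative assumptions, and in practice the template would have to be matched at several scales.

```python
import cv2

# Load a scene and a cropped face template in grayscale (illustrative file names).
scene = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("face_template.jpg", cv2.IMREAD_GRAYSCALE)

# Correlation of the template against every position in the scene.
result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

if max_val > 0.7:                          # illustrative acceptance threshold
    h, w = template.shape
    x, y = max_loc
    print(f"Best match at ({x}, {y}), size {w}x{h}, score {max_val:.2f}")
else:
    print("No sufficiently strong match found")
```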

Appearance-Based

This method relies on a set of training facial images to discover what a face should look like. It generally performs better than the methods described above, because it relies on statistical analysis and machine learning to find the relevant facial features in images. The same techniques are also used for feature extraction in face recognition.
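
One common appearance-based recipe (chosen here only as an example, not necessarily the one the survey authors had in mind) is to describe image patches with histogram-of-oriented-gradients features and train a linear SVM on labelled face and non-face patches. The random arrays below are placeholders for a real data set.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

# Placeholder data: random 64x64 "patches" standing in for labelled
# face (1) and non-face (0) training images.
rng = np.random.default_rng(0)
patches = rng.random((40, 64, 64))
labels = np.array([1] * 20 + [0] * 20)

def describe(patch):
    # Histogram of oriented gradients: a classic appearance descriptor.
    return hog(patch, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

features = np.array([describe(p) for p in patches])
clf = LinearSVC().fit(features, labels)       # learn face vs. non-face statistically

new_patch = rng.random((64, 64))
print("face" if clf.predict([describe(new_patch)])[0] == 1 else "not a face")
```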

Feature-Based

This method works by locating faces and extracting their structural features. First, an algorithm is trained as a classifier; it is then used to separate facial regions from non-facial regions, with the aim of going beyond humans' instinctive recognition of faces. This method has a success rate of around 94%, even in photos with multiple faces, although it can be adversely affected by noise and lighting.

Viola-Jones Framework

Since 2001, great advances have been made in face detection thanks to researchers Paul Viola and Michael Jones, who proposed a feature-based framework that has become very popular and is widely used for detecting faces in real time with high precision. The framework trains a model to understand what is and what is not a face; once trained, the detector evaluates specific features (Haar features) against previously learned ones over a series of stages. If a region of the image passes all the comparison stages, it is concluded that a face has been detected.
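
The staged idea can be sketched as follows: a candidate window must pass every stage to be declared a face and is rejected as soon as any stage fails, which is what makes the framework fast. The two "stages" below are toy brightness tests standing in for real trained Haar-feature stages.

```python
import numpy as np

def stage_not_uniform(window: np.ndarray) -> bool:
    # A nearly flat patch (sky, wall) is cheap to reject immediately.
    return window.std() > 10

def stage_eyes_darker_than_cheeks(window: np.ndarray) -> bool:
    # Crude version of the classic Haar contrast: the eye band should be
    # darker than the cheek band just below it.
    h = window.shape[0]
    return window[h // 4: h // 2].mean() < window[h // 2: 3 * h // 4].mean()

def cascade_is_face(window: np.ndarray, stages) -> bool:
    for stage in stages:
        if not stage(window):
            return False          # rejected early: later stages never run
    return True                   # passed every comparison stage -> face

stages = [stage_not_uniform, stage_eyes_darker_than_cheeks]
print(cascade_is_face(np.random.randint(0, 255, (64, 64)), stages))
```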

Haar Features

The features sought by the face detection framework universally involve the sums of image pixels within rectangular areas such as those shown in the following image.

All human faces share similar properties that can be represented or matched with these rectangular areas, such as the eye region being darker than the upper part of the cheeks, or the bridge of the nose being brighter than the eyes.

The value of any feature is the sum of the pixels within the light rectangles subtracted from the sum of the pixels within the dark rectangles; this value helps determine whether or not a region passes a comparison stage.
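
To keep these rectangle sums cheap, the framework works on an integral image, from which the sum of any rectangle can be read with four lookups. The sketch below computes a simple two-rectangle feature (dark top band minus light bottom band) that way; the band layout and the toy 6x6 image are illustrative choices.

```python
import numpy as np

def integral_image(gray: np.ndarray) -> np.ndarray:
    """Cumulative sums so any rectangle sum needs only four lookups."""
    return gray.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii: np.ndarray, x, y, w, h) -> float:
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    total = ii[y + h - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0:
        total += ii[y - 1, x - 1]
    return float(total)

def two_rect_feature(ii, x, y, w, h):
    """Dark (top) rectangle sum minus light (bottom) rectangle sum."""
    dark = rect_sum(ii, x, y, w, h // 2)
    light = rect_sum(ii, x, y + h // 2, w, h // 2)
    return dark - light

gray = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
ii = integral_image(gray)
print(two_rect_feature(ii, 0, 0, 6, 6))           # negative: bottom half is brighter
```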

Common Procedure In An Image

These are the most common steps, briefly described, for detecting faces in an image using the Viola-Jones framework; a minimal code sketch follows the list.

  • First, the image is converted to grayscale, which reduces the amount of information to process per pixel and makes detection easier.
  • If necessary, manipulate the image with operations such as resizing, cropping, blurring, and sharpening.
  • Segment the image to detect contours or separate multiple objects, so the algorithm can differentiate between objects and faces more quickly.
  • Apply the Viola-Jones framework, using the Haar features described above.
  • If necessary, draw an indicator of where the face was detected, such as the box that appears on digital cameras.
  • Finally, another detector can be applied for more specific characteristics of the face, such as the smile, the eyes, or blinking, among others.
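
A minimal sketch of this pipeline using the Haar cascades bundled with OpenCV might look like the following; the file name photo.jpg, the resize threshold, and the detectMultiScale parameters are illustrative choices.

```python
import cv2

# 1. Load the image and convert it to grayscale.
image = cv2.imread("photo.jpg")                       # illustrative file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# 2. Optional manipulation: shrink large images to speed up detection.
scale = 800 / max(gray.shape)
if scale < 1:
    gray = cv2.resize(gray, None, fx=scale, fy=scale)
    image = cv2.resize(image, None, fx=scale, fy=scale)

# 3-4. Apply the Viola-Jones Haar cascades bundled with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# 5. Indicate where each face was detected, like the box on a digital camera.
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # 6. Run a second, more specific detector (eyes) inside the face region.
    roi = gray[y:y + h, x:x + w]
    for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(roi):
        cv2.rectangle(image, (x + ex, y + ey), (x + ex + ew, y + ey + eh),
                      (255, 0, 0), 1)

cv2.imwrite("photo_detected.jpg", image)
```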

Challenges In Face Detection

Face detection technologies are not yet perfect. Although they have improved over the years and detection systems have become very accurate, there are still many challenges to overcome.

  • Human faces can have unusual expressions, such as a grimace, which makes it difficult for face detection algorithms to identify them as faces.
  • If a face is hidden by hair, a hat, a hand, glasses, a scarf, or a mask, this can result in a false negative.
  • Illumination is one of the main problems in face detection, especially for algorithms that, like Viola-Jones, search for features based on the illumination of the face. An image may not have uniform lighting: part of it may be overexposed while another part is too dark, causing erroneous detections (a simple preprocessing sketch follows this list).
  • The background. When an image with a face has many objects in the background, detection accuracy is reduced. If an image has a plain monochrome or static background, removing it can help improve face detection.
  • In color images, skin color can be used to find faces, but this may not work for all types of faces. If the skin color falls outside the range the algorithm recognizes, the face might not be detected; that is why other techniques, such as using movement to find faces, are also employed.
  • In videos, a face is almost always moving, so when implementing this technique, the area in motion must be calculated. The problem is that other objects moving in the background can cause confusion.
  • The resolution of an image can also be a problem: if it is poor, it is more difficult to detect faces and other objects.
  • Large storage requirements. Because face detection relies on machine learning, it requires large amounts of data storage, mainly for the images used to train the algorithms.
  • Because face detection opens the possibility for some applications to collect data, such as users' facial characteristics, there are serious questions about whether face detection is compatible with the human right to privacy.
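
One simple way to mitigate uneven illumination before running a Haar cascade is to equalize the grayscale histogram, globally or locally (CLAHE). The sketch below shows the idea, with dim_photo.jpg as an illustrative file name and no guarantee that it rescues every badly lit image.

```python
import cv2

# Reduce uneven lighting before detection: global histogram equalization
# and CLAHE (adaptive, per-tile equalization). "dim_photo.jpg" is illustrative.
gray = cv2.imread("dim_photo.jpg", cv2.IMREAD_GRAYSCALE)

equalized = cv2.equalizeHist(gray)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
locally_equalized = clahe.apply(gray)

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
print("raw image:       ", len(cascade.detectMultiScale(gray, 1.1, 5)))
print("equalized image: ", len(cascade.detectMultiScale(locally_equalized, 1.1, 5)))
```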

Face Detection Vs Face Recognition

These two terms are often confused; although strongly linked, they are not the same. While face detection identifies whether there is a face, face recognition determines whose face it is. That is why face recognition is widely used in security, for example in biometric verification such as unlocking mobile phones and applications, among other uses.

Face Detection And Augmented Reality

Some of the most important applications of face detection come from integrating it with augmented reality technologies, mainly when a face is detected in real time: either to indicate to the user that a face has been found and where, or for other, more interesting developments described below.

Photography

One of the most common uses combining face detection and augmented reality can be seen every time we take a photo with a modern digital camera or cell phone: when the camera focuses on a person's face, a box appears indicating that a face has been found.

AR Filters or Lenses

Popularized by the social network Snapchat, augmented reality filters are computer-generated effects superimposed on the real-life image shown by the camera. Once a face is detected, these filters can transform it into something completely different, such as a puppy, a royal with a crown or flowers, or even a baby.
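
A toy version of this idea, assuming an input file selfie.jpg, is to composite a semi-transparent tint over each detected face region; a real filter would instead align graphics (ears, crowns, and so on) with facial landmarks.

```python
import cv2
import numpy as np

# Toy "filter": tint every detected face region, as a stand-in for overlaying
# a graphic such as a crown or puppy ears. "selfie.jpg" is an illustrative name.
frame = cv2.imread("selfie.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
    face = frame[y:y + h, x:x + w]
    tint = np.zeros_like(face)
    tint[:] = (180, 105, 255)                              # pink overlay in BGR
    frame[y:y + h, x:x + w] = cv2.addWeighted(face, 0.6, tint, 0.4, 0)

cv2.imwrite("selfie_filtered.jpg", frame)
```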

Lips Reading and Speech Recognition

Many studies show that augmented reality and automatic speech recognition technologies can be used to help people with disabilities. Although many of these studies have been carried out solely within their specialized fields, audiovisual speech recognition is one of the advances that combines audio, video, and facial expressions to capture speech. The purpose of such systems is to take a speaker's words and convert them into readable text directly on a screen, so that people with hearing disabilities can read them, facilitating communication with others.

Conclusion

Face detection is an emerging technology that underpins a large number of applications, some of which we use in our daily lives, such as unlocking our mobile phones or taking a photo. It builds on other emerging technologies, such as machine learning, in which, despite the great advances of recent years, new discoveries continue to improve performance and overcome the remaining challenges.
Face recognition is one of the most important applications of face detection, especially in biometrics, thanks to improvements in security, easy integration, and the automated identification of people. Meanwhile, integration with technologies such as augmented reality can lead to developments with great potential to improve people's quality of life and entertainment.
