Research Paper On Face Recognition Using Neural Networks
Department of Computer Science, Ho Chi Minh University of Science, Ho Chi Minh City 70000, Vietnam
Copyright © 2011 Thai Hoang Le. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper introduces novel models for every step of a face recognition system. For face detection, we propose a hybrid model combining AdaBoost and an Artificial Neural Network (ABANN). Next, the labeled faces detected by ABANN are aligned using an Active Shape Model and a Multi Layer Perceptron; for this alignment step, we propose a new 2D local texture model based on the Multi Layer Perceptron. The classifier of the model significantly improves the accuracy and robustness of local searching on faces with expression variation and ambiguous contours. For feature extraction, we describe a methodology that improves efficiency by associating two methods: a geometric feature-based method and Independent Component Analysis. For face matching, we apply a model that links many Neural Networks together to match geometric features of the human face; we call it the Multi Artificial Neural Network. The MIT + CMU database is used to evaluate our proposed methods for face detection and alignment. Finally, experimental results for all steps on the CalTech database show the feasibility of our proposed model.
Face recognition is a visual pattern recognition problem. Given an arbitrary input image, a face recognition system searches a database and outputs the identity of the people in the image. A face recognition system generally consists of four modules, as depicted in Figure 1: detection, alignment, feature extraction, and matching. Localization and normalization (face detection and alignment) are processing steps performed before face recognition (facial feature extraction and matching).
Figure 1: Structure of a face recognition system.
Face detection segments the face areas from the background. In the case of video, the detected faces may need to be tracked using a face tracking component. Face detection provides only coarse estimates of the location and scale of each detected face, whereas face alignment aims at more accurate localization and at normalizing the faces. Facial components, such as the eyes, nose, mouth, and facial outline, are located; based on the location points, the input face image is normalized with respect to geometrical properties, such as size and pose, using geometrical transforms or morphing. The face is usually further normalized with respect to photometrical properties such as illumination and gray scale. After a face is normalized geometrically and photometrically, feature extraction is performed to provide information that is useful for distinguishing between the faces of different persons and is stable with respect to geometrical and photometrical variations. For face matching, the extracted feature vector of the input face is matched against those of the enrolled faces in the database; the system outputs the identity of the face when a match is found with sufficient confidence and indicates an unknown face otherwise.
Artificial neural networks have been applied successfully to signal processing problems for 20 years. Researchers have proposed many different models of artificial neural networks. A challenge is to identify the most appropriate neural network model, one that works reliably on realistic problems.
This paper presents some basic neural network models and applies them efficiently in the modules of a face recognition system. For the face detection module, a three-layer feedforward artificial neural network with a Tanh activation function is combined with AdaBoost to detect human faces at a high detection rate. For the face alignment module, a three-layer multilayer perceptron (MLP) with a linear function is proposed; it creates a 2D local texture model for the active shape model (ASM) local searching. For the feature extraction module, a method that combines a geometric feature-based method with the ICA method is proposed. For face matching, a model that combines many artificial neural networks for geometric feature classification is proposed. This case study demonstrates how to solve face recognition in the neural network paradigm. Figure 2 illustrates the algorithms for the steps of the face recognition system.
Figure 2: Proposed models for steps of a face recognition system.
The face detection and alignment steps are evaluated on the MIT + CMU test set. The system built from the proposed models is then evaluated on the CalTech database. Experimental results show that our method performs favorably compared to state-of-the-art methods.
The paper is structured as follows: Section 2 describes in detail the application of AdaBoost and an artificial neural network to face detection. Section 3 presents an ASM method with a novel local texture model, which uses a multilayer perceptron (MLP) for ASM local searching. Section 4 describes a methodology for improving the efficiency of the feature extraction stage by associating two methods: a geometric feature-based method and the independent component analysis (ICA) method. Section 5 presents the multi artificial neural network (MANN) and its application to face matching. The experimental results are presented in Section 6. Conclusions are given in Section 7.
2. AdaBoost and ANN for Face Detection
Face detection is the first step of the face recognition system. Because its output determines the performance of every later step, it is the most important step of the system. To carry it out efficiently, many researchers have proposed different approaches. In general, there are four groups of face detection methods: (1) Knowledge-based methods; (2) Invariant feature-based methods; (3) Template matching-based methods; (4) Machine learning-based methods.
In this paper, we focus only on machine learning methods because they remove the subjective factors that come from human experience. Moreover, they depend only on training data to make final decisions. Thus, if the training data is well organized and adequate, these systems achieve high performance without human factors.
One of the most popular and efficient machine learning approaches to face detection is AdaBoost. Viola and Jones designed a fast, robust face detection system in which AdaBoost learning is used to build nonlinear classifiers. AdaBoost is used to solve three fundamental problems: (1) learning effective features from a large feature set; (2) constructing weak classifiers, each of which is based on one of the selected features; (3) boosting the weak classifiers to construct a strong classifier. Viola and Jones use several techniques to compute a large number of such features efficiently under varying scale and location, which is important for real-time performance. Moreover, arranging the strong classifiers in a cascade tree makes the computation even more efficient. Their system is the first real-time frontal-view face detector. However, it still has some drawbacks. Because the detection results depend on weak classifiers, they often contain many false positives. Decreasing the false positive rate requires more strong classifiers and Haar-like features in the cascade tree, but this significantly increases the running time and can decrease the detection rate. To deal with this issue, we combine AdaBoost with another machine learning technique so as to achieve the same detection rate with fewer false positives and a shorter running time.
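The three AdaBoost problems listed above (selecting features, building weak classifiers, boosting them into a strong classifier) can be illustrated with a minimal sketch. It assumes that feature values have already been computed (the Haar-like feature extraction is not shown) and that each weak classifier is a single-feature threshold stump; the function names `train_adaboost` and `predict_adaboost` are hypothetical and do not come from the paper.

```python
import numpy as np

def train_adaboost(X, y, n_rounds=10):
    """Discrete AdaBoost over threshold stumps (illustrative sketch).

    X: (m, d) array of precomputed feature values.
    y: (m,) array of labels in {-1, +1}.
    Returns a list of (feature index, threshold, polarity, alpha).
    """
    m, d = X.shape
    w = np.full(m, 1.0 / m)              # uniform sample weights to start
    stumps = []
    for _ in range(n_rounds):
        best = None
        # (1) feature selection and (2) weak classifier construction:
        # pick the single-feature stump with the lowest weighted error.
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this stump
        # (3) boosting: increase the weight of misclassified samples.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append((j, thr, pol, alpha))
    return stumps

def predict_adaboost(stumps, X):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = np.zeros(X.shape[0])
    for j, thr, pol, alpha in stumps:
        score += alpha * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

The exhaustive search over thresholds is chosen for readability; it is far slower than what a real-time detector would need.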
Another popular method that achieves comparable results is the artificial neural network (ANN). An ANN solves problems by simulating the activity of neurons. In detail, ANNs can be most adequately characterized as “computational models” with particular properties, such as the ability to adapt or learn, to generalize, or to cluster or organize data, and whose operation is based on parallel processing. However, many of these properties can also be attributed to nonneural models. We propose a hybrid approach combining AdaBoost and ANN to detect faces, with the aim of decreasing the running time while still achieving the desired detection rate.
Our hybrid model, which combines AdaBoost (AB) and ANN for detecting faces, is named ABANN. In this model, the ABs quickly reject nonface images; the ANNs then filter out the remaining false positive images to achieve better results. The final result is face/nonface.
The selected neural network is a three-layer feedforward neural network trained with the back propagation algorithm. The number of input neurons equals the length of the extracted feature vector, and there is a single output neuron, which returns true if the image contains a human face and false if it does not. The number of hidden neurons is selected experimentally; it depends on the sample image database.
The result image (20 × 20 pixels) of AB is the input of the ANN. The output of the ANN is a real value between −1 (false) and +1 (true). The preprocessing and ANN steps are illustrated in Figure 3(b). The original image is decomposed into a pyramid of images as follows: 4 blocks of 10 × 10 pixels, 16 blocks of 5 × 5 pixels, and 5 overlapping blocks of 20 × 6 pixels. Thus, the ANN has 4 + 16 + 5 = 25 input nodes. The goal is to find important face features: horizontal blocks to find mouths and eyes, and square blocks to find each of the eyes, noses, and mouths. The system uses one hidden layer with 25 nodes to represent local features that characterize faces well. Its activation function is Tanh function with the learning rate .
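The block decomposition and the network described above can be sketched as follows. Several details are assumptions made only for illustration: each block is reduced to its mean intensity, the five overlapping 20 × 6 stripes start at rows 0, 3, 7, 10, and 14, and the output neuron also uses Tanh. The names `block_features` and `ann_forward` are hypothetical.

```python
import numpy as np

def block_features(window):
    """Reduce a 20x20 grayscale window to 25 block values (4 + 16 + 5).

    Assumption: each block contributes its mean intensity.
    """
    window = np.asarray(window, dtype=float)
    if window.shape != (20, 20):
        raise ValueError("expected a 20x20 window")
    feats = []
    # 4 square blocks of 10x10 pixels
    for r in range(0, 20, 10):
        for c in range(0, 20, 10):
            feats.append(window[r:r + 10, c:c + 10].mean())
    # 16 square blocks of 5x5 pixels
    for r in range(0, 20, 5):
        for c in range(0, 20, 5):
            feats.append(window[r:r + 5, c:c + 5].mean())
    # 5 overlapping horizontal stripes, 20 wide and 6 tall
    # (the start rows are an assumption; the text does not give them)
    for r in (0, 3, 7, 10, 14):
        feats.append(window[r:r + 6, :].mean())
    return np.array(feats)

def ann_forward(features, W1, b1, W2, b2):
    """Forward pass: 25 inputs -> 25 Tanh hidden nodes -> 1 Tanh output."""
    hidden = np.tanh(W1 @ features + b1)      # W1: (25, 25), b1: (25,)
    return float(np.tanh(W2 @ hidden + b2))   # W2: (25,), b2: scalar
```

An output above zero would be read as "face" and below zero as "nonface"; the weights would come from back propagation training, which is not shown here.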
Figure 3: (a) The process of detecting faces of ABANN and (b) input features for neural network.
In detail, the cascade of classifiers includes many strong classifiers, and the ANN is appended to them as the final strong classifier of the system to achieve better results (Figure 3(a)). For example, an AB with 5 strong classifiers, called AB5, is combined with the ANN as the sixth strong classifier to form ABANN5.
The output images of this step are the inputs of the face alignment step, which the next section describes.
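The cascade logic can be sketched in a few lines. The stage functions and the decision threshold of 0 on the ANN output are assumptions made for illustration; `abann_classify` is a hypothetical name, not the authors' implementation.

```python
from typing import Any, Callable, Sequence

def abann_classify(
    window: Any,
    ab_stages: Sequence[Callable[[Any], bool]],
    ann: Callable[[Any], float],
    ann_threshold: float = 0.0,
) -> bool:
    """Return True when the window is accepted as a face.

    ab_stages: AdaBoost strong classifiers, evaluated in order.
    ann: final stage, returning a value between -1 and +1.
    """
    # Any AdaBoost stage can reject the window early, which keeps
    # the average cost per window low.
    for stage in ab_stages:
        if not stage(window):
            return False
    # Only windows that survive every AdaBoost stage reach the ANN.
    return ann(window) > ann_threshold
```

With five entries in `ab_stages`, this corresponds to the ABANN5 configuration described above.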
3. Local Texture Classifiers Based on Multilayer Perceptron for Face Alignment
Face alignment is one of the important stages of face recognition. It is also used in other face processing applications, such as face modeling and synthesis. Its objective is to localize feature points on face images, such as the contour points of the eyes, nose, mouth, and face (illustrated in Figure 4).
Figure 4: Face alignment.
Many face alignment methods exist. Two popular ones are the active shape model (ASM) and the active appearance model (AAM), both proposed by Cootes. Both use a statistical model to parameterize a face shape with the PCA method, but their feature models and optimization differ. The ASM algorithm has a two-stage loop: in the first stage, given the initial labels, it searches the local region of every label point for the new position that best fits the corresponding local 1D profile texture model; in the second stage, it updates the shape parameters to best fit these new label positions. The AAM method uses its global appearance model to optimize the shape parameters directly. Owing to the different optimization criteria, ASM localizes shapes more precisely and is more robust to illumination and bad initialization. In this paper, we extend the classical ASM method into a new method named MLP-ASM, which achieves better results.
Because ASM uses only a 1D profile texture feature, which is not enough to distinguish feature points from their local regions, the ASM algorithm often falls into local minima in the local searching stage. Several representative texture features and pattern recognition methods have been proposed to reinforce the ASM local searching, for example, Gabor wavelet, Haar wavelet, Ranking-Boost, and FisherBoost. However, an accurate local texture model for large databases remains an unachieved target.
In the next subsection, we present an ASM method with a novel local texture model, which uses a multilayer perceptron (MLP) for ASM local searching. MLP is well suited to face detection.
3.1. Statistical Shape Models
A face shape can be represented by $n$ points $\{(x_i, y_i)\}$ as a $2n$-element vector, $x = (x_1, y_1, \ldots, x_n, y_n)^T$. Given $s$ training face images, there are $s$ shape vectors $x_i$. Before we can perform statistical analysis on these vectors, it is important that the shapes represented are in the same coordinate frame. Figure 5 illustrates the shape model.
Figure 5: Shape model of an image.
In particular, we seek a parameterized model of the form $x = M(b)$ (Figure 6), where $b$ is a vector of parameters of the model. Such a model can be used to generate new vectors, $x$. If we can model the distribution of parameters, $p(b)$, we can limit them so that the generated $x$s are similar to those in the training set. Similarly, it should be possible to estimate $p(x)$ using the model.
Figure 6: Using PCA to compute statistical shape model.
To simplify the problem, we first wish to reduce the dimensionality of the data from $2n$ to something more manageable. An effective approach is to apply PCA to the data. The data form a cloud of points in the $2n$-D space. PCA computes the main axes of this cloud, allowing one to approximate any of the original points using a model with fewer than $2n$ parameters. The approach is as follows.
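Step 1. Compute the mean of the data set:
$$\bar{x} = \frac{1}{s}\sum_{i=1}^{s} x_i$$
Step 2. Compute the covariance matrix of the data set:
$$S = \frac{1}{s-1}\sum_{i=1}^{s} (x_i - \bar{x})(x_i - \bar{x})^T$$
Step 3. Compute the eigenvectors, $\phi_i$, and corresponding eigenvalues, $\lambda_i$, of the data set (sorted so $\lambda_i \ge \lambda_{i+1}$).
Step 4. We can approximate any $x$ from the training set:
$$x \approx \bar{x} + P_t b$$
where $P_t = (\phi_1 | \phi_2 | \cdots | \phi_t)$ ($t$, the number of modes, can be chosen to explain a given proportion, here 98%, of the variance in the training data set) and $b$, the shape model parameters, is given by
$$b = P_t^T (x - \bar{x})$$
A real shape $X$ in an image can be generated by applying a suitable transformation $T$ to the points $x$:
$$X = T_{X_t, Y_t, s, \theta}(x)$$
This transformation includes a translation $(X_t, Y_t)$, a scaling $s$, and a rotation $\theta$ (the scale $s$ here is distinct from the number of training shapes).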
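Steps 1 to 4 can be sketched with NumPy as below. The sketch assumes the shape vectors are already aligned to a common coordinate frame and stored as rows of an array; the function names are hypothetical, and the 98% default mirrors the proportion of variance mentioned in Step 4.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.98):
    """PCA shape model from aligned shape vectors.

    shapes: (s, 2n) array, one shape vector (x1, y1, ..., xn, yn) per row.
    Returns the mean shape, the first t eigenvectors as columns,
    and their eigenvalues.
    """
    shapes = np.asarray(shapes, dtype=float)
    mean = shapes.mean(axis=0)                    # Step 1: mean shape
    cov = np.cov(shapes, rowvar=False)            # Step 2: 1/(s-1) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # Step 3: eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort so the largest comes first
    eigvals = np.clip(eigvals[order], 0.0, None)  # clamp tiny negative round-off
    eigvecs = eigvecs[:, order]
    # Step 4: smallest t whose modes explain the requested variance
    ratio = np.cumsum(eigvals) / eigvals.sum()
    t = min(int(np.searchsorted(ratio, variance_kept)) + 1, len(eigvals))
    return mean, eigvecs[:, :t], eigvals[:t]

def project(x, mean, P):
    """Shape parameters b for a shape vector x."""
    return P.T @ (x - mean)

def reconstruct(b, mean, P):
    """Approximate shape vector from the parameters b."""
    return mean + P @ b
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; when there are fewer training shapes than coordinates, the trailing eigenvalues are zero and are discarded by the variance threshold.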
3.2. ASM Algorithm
Given a rough starting approximation, the parameters of an instance of a model can be modified to better fit the model to a new image. By choosing a set of shape parameters, $b$, for the model, we define the shape of the object in an object-centered coordinate frame. We can create an instance $X$ of the model in the image frame by defining the position $(X_t, Y_t)$, orientation $\theta$, and scale $s$ parameters. An iterative approach to improve the fit of the instance, $X$ (Figure 7), to an image proceeds as follows.
Figure 7: Transformation model into image.
Step 1. Examine a region of the image around each point $X_i$ of $X$ to find the best nearby match, $X_i'$, for that point. There are several ways to find $X_i'$. A popular method, the classical texture model, is presented in Section 3.3; our method, the MLP local texture model, is presented in Section 3.4.
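Step 2. Repeat until convergence.
Update the parameters $(X_t, Y_t, s, \theta, b)$ to best fit the newly found points $X'$, minimizing the sum of square distances between corresponding model and image points:
$$E(X_t, Y_t, s, \theta, b) = \left| X' - T_{X_t, Y_t, s, \theta}(\bar{x} + P_t b) \right|^2$$
Substep 2.1. Fix $b$ and find $(X_t, Y_t, s, \theta)$ to minimize $E$.
Substep 2.2. Fix $(X_t, Y_t, s, \theta)$ and find $b$ to minimize $E$.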
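The alternating update of Step 2 can be sketched as follows, under stated assumptions: the transformation is a similarity transform (translation, scale, rotation) written as a 2 × 2 matrix `M` plus an offset `t`, the pose is found by closed-form least squares, and the shape parameters are found by projecting the back-transformed points onto the model modes. A fixed iteration count stands in for a convergence test, and all function names are hypothetical.

```python
import numpy as np

def fit_similarity(model_pts, image_pts):
    """Least-squares similarity transform mapping model_pts onto image_pts.

    Both arrays have shape (n, 2). Returns M (2x2, scale times rotation)
    and t (2,), so that image_pts is approximately model_pts @ M.T + t.
    """
    mu_m, mu_i = model_pts.mean(axis=0), image_pts.mean(axis=0)
    p, q = model_pts - mu_m, image_pts - mu_i
    norm = (p ** 2).sum()
    a = (p * q).sum() / norm                                   # scale * cos(theta)
    c = (p[:, 0] * q[:, 1] - p[:, 1] * q[:, 0]).sum() / norm   # scale * sin(theta)
    M = np.array([[a, -c], [c, a]])
    t = mu_i - M @ mu_m
    return M, t

def asm_update(target, mean, P, n_iter=20):
    """Alternate Substeps 2.1 and 2.2 for a fixed number of iterations.

    target: (n, 2) found points. mean: (2n,) mean shape. P: (2n, t) modes.
    """
    b = np.zeros(P.shape[1])
    for _ in range(n_iter):
        model_pts = (mean + P @ b).reshape(-1, 2)
        # Substep 2.1: fix b, find the pose that best maps the model onto the target
        M, t = fit_similarity(model_pts, target)
        # Substep 2.2: fix the pose, map the target back to the model frame
        # and project the residual onto the shape modes to obtain b
        back = (target - t) @ np.linalg.inv(M).T
        b = P.T @ (back.ravel() - mean)
    return M, t, b
```

In practice the parameters `b` are usually also limited to a plausible range so that the generated shapes stay similar to the training set; that constraint is omitted here for brevity.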
3.3. Classical Local Texture Model
The objective is to search for a local match for each point (illustrated in Figure 8). The match is sought using the strongest edge, correlation, or a statistical model of the profile.
Figure 8: 1D profile texture model.
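Step 1. Compute the normal vector at point $(x_i, y_i)$. First calculate the tangent vector $\mathbf{t}$: $t_x = x_{i+1} - x_{i-1}$, $t_y = y_{i+1} - y_{i-1}$. Normalize the tangent vector: $\mathbf{t} = \mathbf{t} / |\mathbf{t}|$. Then calculate the normal vector $\mathbf{n}$: $n_x = -t_y$, $n_y = t_x$.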