
A de-illumination scheme for face recognition based on fast decomposition and detail feature fusion


Abstract

Almost all face recognition algorithms perform unsatisfactorily under illumination variation. According to the common assumption that illumination varies slowly while the intrinsic features of a face vary rapidly, high-frequency features represent the intrinsic facial structure. In this paper, we propose an adaptive scheme based on FBEEMD and detail feature fusion. FBEEMD is a fast version of BEEMD that avoids time-consuming surface interpolation and iterative computation. It decomposes an image into high-frequency sub-images matching detail features and low-frequency sub-images corresponding to contour features. However, it is difficult to determine by quantitative analysis which high-frequency sub-images should be used to reconstruct an illumination-invariant face. We therefore propose two measurements that calculate weights quantifying the detail features. With this fusion technique, one can reconstruct a more illumination-neutral facial image and thereby improve the face recognition rate. Verification experiments using classical recognition algorithms are conducted on the Yale B, PIE, and FERET databases. The encouraging results show that the proposed scheme is very effective for face images under variable lighting conditions.

©2013 Optical Society of America

1. Introduction

Face recognition is the biometric identification of a human face by matching an image against a library of known faces. It has attracted significant attention because of its wide range of applications in public security, law enforcement and commerce, access control, information security, and intelligent surveillance [1–3]. Over the last few years, numerous algorithms have been proposed for face recognition, including principal component analysis (PCA) [4, 5], linear discriminant analysis (LDA) [6], independent component analysis (ICA) [7, 8], local feature analysis (LFA) [9], elastic bunch graph matching (EBGM) [10], neural networks [11], active appearance models [12], 3D morphable models [13], hidden Markov models [14], support vector machines [15], Hausdorff distances [16, 17], kernel methods [18], image-based recognition [19], volume holographic correlators [20], a correlation method combined with an ICA model [21], parallel correlated recognition [22], etc.

Despite a certain level of maturity and several practical successes, automatic face recognition remains challenging with uncooperative users and uncontrolled environments involving facial expression, pose changes, and variable lighting conditions. The performance of existing recognition algorithms may drop dramatically under these variations, and changes in lighting conditions between the training and testing stages contribute most significantly: it is believed that variations caused by lighting can be even larger than the differences among distinct individuals. In recent years, research on face recognition has therefore focused on diminishing the impact of illumination changes, and much progress has been made on illumination normalization prior to recognition.

Generally speaking, these schemes can be categorized into three groups: modeling, preprocessing and feature extraction.

Modeling schemes attempt to model 3D human faces that account for nearly all environmental variations [23–26]. However, two major drawbacks limit their application in practical recognition systems. One is that these methods need numerous experimental samples to simulate face images under disparate lighting conditions. The other is that they treat the human face as a convex object and thus ignore cast shadows.

Feature extraction schemes attempt to extract facial features that are invariant to illumination, such as edge maps, image intensity derivatives, adaptive feature-specific imaging [27], and masked fake face detection [28]. However, none of these methods is sufficient to avoid interference from illumination variations. The self-quotient image (SQI) was introduced to implement illumination-invariant recognition under changing lighting conditions [29], but it requires a bootstrap database, and its performance may degrade if the dominant features of the training and test samples are misaligned.

Preprocessing schemes eliminate the influence of illumination variations with image processing techniques applied before recognition, including the logarithm transform [30], adaptive histogram equalization [31], block-based histogram equalization [32], symmetric shape-from-shading with a generic 3-D model [33], logarithm total variation, the discrete cosine transform in the logarithm domain (LOG-DCT) [34], advanced correlation filters [35], bi-dimensional empirical mode decomposition (BEMD) normalization [36], etc. The global method of Ref. [30] has difficulty handling non-uniform illumination. Ref. [31] can cope with non-uniform illumination variation, but its performance is still not satisfactory. Ref. [33] greatly enhances recognition rates only for frontal face images. The kernel algorithm of Ref. [34], the DCT, is not adaptive and requires external basis functions. The kernel algorithm of Ref. [36], BEMD, has unsolved drawbacks of its own, such as mode mixing, boundary effects, and heavy computation time.

The major drawback of BEMD is the frequent appearance of leaky waves caused by 2D mode mixing: components that originally belong to one BIMF (bi-dimensional intrinsic mode function) leak into other BIMFs, making the decomposition incomplete. To overcome this problem, a noise-assisted data analysis method named bi-dimensional ensemble empirical mode decomposition (BEEMD) was introduced [37]. Although quite effective, this ensemble method is very time-consuming because it requires a huge number of BEMD trials. Therefore, a fast BEEMD (FBEEMD) approach based on envelope estimation and self-similar boundary extension is proposed in this paper.

The contribution of this paper is as follows. First, the fast decomposition method FBEEMD is used to decompose a facial image into a set of multi-scale BIMFs. The effective BIMFs are then extracted and fused together as the illumination-invariant facial feature. In this process, two measurements are proposed to calculate weights quantifying the detail feature contained in each BIMF; with these two methods, the contribution ratio of each BIMF to the global feature can be computed adaptively. By reconstructing the facial image, the overall effect of illumination variation can be reduced effectively. Recognition results from six experiments are reported on the Yale B, FERET, and Carnegie Mellon University Pose, Illumination, and Expression (PIE) databases.

2. Bi-dimensional ensemble empirical mode decomposition

The successful application of empirical mode decomposition has stimulated the development of BEMD [38–42]. However, BEMD generally suffers from a major drawback called 2D mode mixing, defined [43] as a BIMF consisting of oscillations of dramatically disparate scales. When mode mixing occurs, a BIMF may cease to have physical meaning by itself, suggesting that the clean separation of scales has been damaged.

To alleviate this problem, a noise-assisted method of signal analysis named BEEMD was proposed by N. E. Huang et al. [37]. The method defines the BIMF components as the mean of an ensemble of trials, each consisting of the original signal plus white noise of finite amplitude. This approach takes full advantage of the statistical characteristics of white noise: since the noise differs in each trial, it cancels out in the ensemble mean of sufficiently many trials, and the ensemble mean is treated as the true answer. This ensemble decomposition method has also been applied to 2D signal processing and information demodulation [44, 45].

However, such an ensemble BEMD approach requires computational resources proportional to the number of BEMD trials. A typical implementation of BEEMD uses about 100 independent trials in the ensemble, and processing a 256 × 256 image takes about 200 seconds per trial. This large demand for computation has so far been a barrier to the adoption of BEEMD. Therefore, in section 3, some new details are introduced into the original BEMD to implement BEEMD much faster and better.

3. The new BEEMD details

3.1 Local extrema detection

Detecting the local extrema of a 2D source signal is the first problem to be tackled in BEMD. Some BEMD methods use mathematical morphology to locate the local extrema, but this causes the number of extreme points to decrease very quickly: after two or three BIMFs have been extracted, the residual signal becomes too smooth to provide enough extreme points for fitting a surface.

The neighboring-window method [46] is used to find the local maxima and minima in this study. A data point is considered a local maximum (minimum) if its value is strictly higher (lower) than all of its neighbors within a window. A 3 × 3 window generally yields an optimal extrema map. Larger windows are sometimes used to reduce computation cost, but then the number of extreme points decreases as quickly as with the mathematical morphology method.
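A minimal Python sketch of such a detector is given below; the `local_extrema` name, the SciPy order-statistics filters, and the 3 × 3 default window are illustrative choices, not the paper's implementation.

```python
# Sketch of the neighboring-window extrema detector (assuming a 3 x 3 window).
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_extrema(img, win=3):
    """Return boolean maps of strict local maxima and minima of `img`."""
    footprint = np.ones((win, win), dtype=bool)
    footprint[win // 2, win // 2] = False           # compare against neighbors only
    neigh_max = maximum_filter(img, footprint=footprint, mode="nearest")
    neigh_min = minimum_filter(img, footprint=footprint, mode="nearest")
    maxima = img > neigh_max                        # strictly higher than every neighbor
    minima = img < neigh_min                        # strictly lower than every neighbor
    return maxima, minima
```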

3.2 A fractal-based processing method for boundary effect

Another difficulty in BEMD is the boundary effect [47]. As the number of iterations increases, this effect appears not only at the boundary but also propagates into the interior of the transformed data, which may eventually render the decomposition useless.

A self-similar extension method is proposed in this study to reduce the boundary effect: for each extended part, a self-similar part is found within the original image. The concrete algorithm is illustrated in Fig. 1:

Fig. 1 The schematic diagram of self-similar boundary extension.

Assume the original image I has size N × N and each extended block has size k × k. After extension, the new image has size (N + 2k) × (N + 2k), with the original image occupying the middle N × N block. The original image I is divided into blocks of size k × k. Each extended block ie has three neighboring blocks, denoted in, inside the original image. Within I, we then search for the blocks most similar to in, judging similarity by the MAD (mean absolute difference) between the boundary blocks and the candidate matched blocks. Finally, the block whose neighbors are most similar to in is used as the extended block.

Before this extension, an important parameter must be determined: the boundary width. It is found that a width of 8 or 16 pixels is sufficient for most textures and natural images. After boundary processing based on the self-similar extension, boundary interference in BEMD is reduced and the BIMF components are more meaningful.
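The following sketch illustrates the block-matching idea for one side of the image only. It is simplified: the paper matches three neighbor blocks per extended block, while this version matches a single boundary block; it also assumes a square image whose side is a multiple of k. All names are hypothetical.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference (MAD) between two equal-sized blocks."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def extend_top(img, k=8):
    """Synthesize the k-pixel strip above `img` by self-similar matching:
    for each boundary block, find the interior block with the lowest MAD
    and copy the block lying above that match into the strip."""
    n = img.shape[0]                                # assumes square N x N, k divides N
    ext = np.empty((k, n), dtype=img.dtype)
    for c0 in range(0, n, k):
        edge = img[0:k, c0:c0 + k]                  # boundary block next to the gap
        best, best_rc = np.inf, (k, c0)
        for r in range(k, n - k + 1, k):            # interior blocks with a block above them
            for c in range(0, n - k + 1, k):
                d = mad(img[r:r + k, c:c + k], edge)
                if d < best:
                    best, best_rc = d, (r, c)
        r, c = best_rc
        ext[:, c0:c0 + k] = img[r - k:r, c:c + k]   # the block above the best match
    return ext
```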

3.3 A fast BEMD method

The ensemble method demands a huge number of trials to cancel the added white noise. Moreover, extracting each BIMF in a single BEMD trial requires repeated sifting iterations until an optimized fitting surface is found. This makes the decomposition process complex and excessively time-consuming.

A fast BEMD (FBEMD) method based on envelope estimation was proposed by S. M. A. Bhuiyan et al. [48, 49]. Surface estimation runs faster than the interpolation of traditional BEMD, and the method also greatly reduces the number of iterations per BIMF. Together these make FBEMD a fast and efficient algorithm.

In traditional BEMD, the standard deviation (SD) is employed as the fundamental stopping criterion, and the maximum number of allowable iterations (MNAI) is applied as an additional stopping criterion to prevent over-sifting [50, 51]. Moreover, fast BEMD with one iteration per BIMF can produce results similar to, or even better than, those of traditional BEMD with several iterations. Therefore we use only the MNAI criterion to stop the sifting process, limiting the number of iterations per BIMF to one or a few, which is considered sufficient.
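The sketch below shows the flavor of such a fast, single-sift decomposition: order-statistics (max/min) filters estimate the envelopes, one sifting iteration extracts each BIMF, and the window grows with scale. The fixed window-growth rule and all names are our assumptions; Refs. [48, 49] derive the window size from the distances between extrema.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter, uniform_filter

def fbemd(img, n_bimfs=4, win=7):
    """Rough sketch of fast BEMD with order-statistics envelope estimation
    and one sifting iteration per BIMF (MNAI = 1)."""
    residue = img.astype(float)
    bimfs, w = [], win
    for _ in range(n_bimfs):
        upper = uniform_filter(maximum_filter(residue, size=w), size=w)  # smoothed upper envelope
        lower = uniform_filter(minimum_filter(residue, size=w), size=w)  # smoothed lower envelope
        mean_env = 0.5 * (upper + lower)
        bimfs.append(residue - mean_env)   # single sift extracts the BIMF
        residue = mean_env                 # carry the coarse part to the next scale
        w = 2 * w + 1                      # grow the window (assumed rule)
    return bimfs, residue
```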

To increase the adaptivity of the FBEMD method, K. Patorski et al. proposed a simple modification in [52]: an adjustment step for the extrema-detector window width is added at the beginning of the calculation of each BIMF. The modification increases the computation time but is specifically tailored to make FBEMD adaptive.

3.4 Comparison between some traditional BEMDs and FBEEMD

Figure 2 shows the development of BEMD clearly. When traditional BEMD proved to suffer from two major drawbacks (mode mixing and heavy computation time), two branches of study appeared. The ensemble concept was introduced into BEMD by N. E. Huang et al. to overcome mode mixing, usually called E-BEMD or BEEMD. The fast approach was proposed by S. M. A. Bhuiyan et al. to overcome the computation time, called F-BEMD. In this study, we propose a decomposition method that combines the ensemble and fast BEMD methods, which our research team has named FBEEMD.

Fig. 2 The development process of BEMD methods.

A typical example is used to illustrate the problem of 2D mode mixing and the performance of FBEEMD in resolving it. FBEEMD is used to decompose the Lena image with 5 iterations, 100 trials, and the standard deviation of the added white noise set to one fifth of that of the original signal.
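Under these parameters, the ensemble step can be sketched as follows; it reuses the `fbemd` routine sketched in section 3.3, and the Gaussian noise model and all names are illustrative assumptions.

```python
import numpy as np

def fbeemd(img, n_bimfs=4, n_trials=100, noise_ratio=0.2, rng=None):
    """Sketch of FBEEMD: average the BIMFs of many noise-assisted fast
    BEMD trials (noise std = noise_ratio x signal std, as in the text)."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = noise_ratio * img.std()
    acc = [np.zeros(img.shape) for _ in range(n_bimfs)]
    res_acc = np.zeros(img.shape)
    for _ in range(n_trials):
        noisy = img + rng.normal(0.0, sigma, img.shape)   # add white noise
        bimfs, res = fbemd(noisy, n_bimfs)                # fast BEMD sketch from Sec. 3.3
        for a, b in zip(acc, bimfs):
            a += b
        res_acc += res
    # The different noise realizations cancel in the ensemble mean.
    return [a / n_trials for a in acc], res_acc / n_trials
```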

Examining each BIMF component in Fig. 3, one can clearly see several deeply colored zones, for example at the edges of the face, nose, arm, and hat. These are instances of 2D mode mixing, serious enough to obscure the fundamental components, especially in Figs. 3(c) and 3(d).

Fig. 3 The decomposition of Lena using BEMD. (a) Lena image; (b)–(e) BIMF components of orders 1–4, from fine to large scale; (f) the residual.

In Fig. 4 one can scarcely see any deeply colored zones; the fundamental components are uncontaminated by intermittence and much more visible than the corresponding ones in Fig. 3. The improvement in BIMF quality for the Lena image with FBEEMD is obvious: the decomposition reveals edges and other characteristic features at different scales more clearly than the BIMFs obtained by the original BEMD.

Fig. 4 The decomposition of Lena using FBEEMD. (a) Lena image; (b)–(e) BIMF components of orders 1–4, from fine to large scale; (f) the residual.

The orthogonality index (OI) was proposed in Ref. [50]; its extension to two dimensions is defined as follows:

$$OI=\sum_{x=1}^{M}\sum_{y=1}^{N}\left(\sum_{i=1}^{K+1}\sum_{\substack{j=1\\ j\neq i}}^{K+1}\frac{C_i(x,y)\,C_j(x,y)}{C^2(x,y)}\right)\tag{1}$$
where the 2D signal has size M × N, K is the total number of BIMFs excluding the residue, and the sum of all the components C_i(x,y) (the BIMFs plus the residue) equals the original signal C(x,y). A low OI value indicates a good decomposition in terms of local orthogonality among the BIMFs [51]; generally, OI values below 0.1 are acceptable. The OI values of the Lena image obtained by FBEEMD and other classical methods are shown in Table 1.
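For reference, a direct implementation of Eq. (1) might look as follows (illustrative names; the epsilon guard is our addition to avoid division by zero):

```python
import numpy as np

def orthogonality_index(bimfs, residue):
    """2D orthogonality index of Eq. (1); lower is better, and values
    below 0.1 are generally acceptable."""
    comps = [np.asarray(c, dtype=float) for c in bimfs] + [np.asarray(residue, dtype=float)]
    total = sum(comps)                        # reconstructed signal C(x, y)
    denom = total ** 2
    denom[denom == 0] = np.finfo(float).eps   # guard against division by zero
    oi = 0.0
    for i in range(len(comps)):
        for j in range(len(comps)):
            if i != j:
                oi += np.sum(comps[i] * comps[j] / denom)
    return oi
```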


Table 1. Comparison of the orthogonality index and the computation time for the Lena image

As Table 1 shows, FBEEMD reduces the OI effectively, improving on the traditional BEMD methods. Although its total computation time is slightly longer than that of the others, the average time per BEMD trial is only about 3.6 seconds. The experimental results therefore demonstrate that the FBEEMD algorithm not only yields a better image representation but also resolves the major drawback of excessive computation time.

FBEEMD is adopted to decompose face images under different illumination conditions into sets of BIMFs. Three of these BIMFs (BIMF2, BIMF3, and BIMF4) are shown in Fig. 5.

Fig. 5 The first column shows five face images under different illumination conditions; the second to fourth columns show BIMF2, BIMF3, and BIMF4, respectively.

4. Illumination-reflectance model

The theoretical foundation of most existing photometric normalization techniques can be traced to the Retinex theory developed by Land and McCann [53]. The theory seeks to explain the basic principles governing image formation and states that an image I(x,y) can be modeled as the product of a reflectance function R(x,y) and a luminance function L(x,y):

$$I(x,y)=R(x,y)\,L(x,y)\tag{2}$$

The nature of L(x,y) is determined by the lighting source, while R(x,y) is determined by the characteristics of the object's surface; therefore, R(x,y) can be regarded as an illumination-insensitive measure. It is commonly assumed that L(x,y) varies slowly, corresponding to skin, background, and other large-scale features, while R(x,y) changes abruptly, matching edges, corners, and other small-scale features of face images.

Taking the logarithm of Eq. (2), we obtain

$$\log I(x,y)=\log R(x,y)+\log L(x,y)\tag{3}$$

This transforms the product of the illumination-reflectance model into a sum of two components, one low-pass and one high-pass. FBEEMD is then performed to separate these components into a set of multi-scale BIMFs, and the effective BIMF components are fused together as the illumination-invariant facial feature. In the next section we propose two measurements that calculate weights quantifying the detail feature contained in these BIMFs.

5. Fusion of multi-scale detail features

5.1 A computing framework

Avoiding interference from illumination is difficult because high-frequency detail information and illumination variation are closely intertwined in a face image. FBEEMD can separate a face image into several BIMFs according to local features, but it is hard to determine which detail information in those BIMFs best depicts the face.

We therefore propose a computing framework to fuse the essential detail features extracted from the BIMFs. First, a set of BIMFs is obtained by FBEEMD. Then, the weight of each BIMF is calculated with certain measurements. Finally, the BIMFs with effective features are fused together to reconstruct an illumination-invariant face. The computing framework can be expressed as follows:

$$D=\lambda_1 d_1+\lambda_2 d_2+\cdots+\lambda_n d_n=\sum_{i=1}^{n}\lambda_i d_i\tag{4}$$
where D is the feature distance between two images, d_i is the feature distance between the two i-th BIMFs, and λ_i is a distance weight representing the contribution of the i-th BIMF to D. Although the first BIMF, which has the highest frequency, contains many detail and textural features, it carries little information about facial structure and contour. It is well known that structure and contour information has a much greater impact on face recognition; in other words, the feature distance between large-scale BIMFs contributes more to the global distance. Thus we require λ_1 < λ_2 < … < λ_n and Σ_{i=1}^{n} λ_i = 1.

Assume two images I and I′ whose feature distance is to be measured. Through FBEEMD, one obtains two sets of BIMFs, I_1, I_2, …, I_n and I′_1, I′_2, …, I′_n, and d_i is the feature distance between I_i and I′_i.

With a chosen measurement, a value quantifying the detail information of each BIMF can be obtained: MI_1, MI_2, …, MI_n and MI′_1, MI′_2, …, MI′_n, where M stands for an information-extraction operator. The weight λ_i is then computed as follows:

$$\lambda_i=\frac{\dfrac{1}{MI_i}+\dfrac{1}{MI'_i}}{\displaystyle\sum_{i=1}^{n}\frac{1}{MI_i}+\sum_{i=1}^{n}\frac{1}{MI'_i}}\tag{5}$$

5.2 The measurement for quantifying information of detail feature

Two measurements for quantifying detail information will be proposed in this section.

5.2.1 The number of extreme points in each BIMF

The BIMFs contain a large number of edge curves depicting contour features. The more such curves there are, the more finely the essential features are described. The edge curves consist of series of extreme points distributed continuously over local areas of the gray-scale image.

Fine-scale BIMFs contain many more extreme points than large-scale ones, so the total number of extreme points EP_k in a BIMF can serve as a criterion for quantifying detail information. The formula can be expressed as follows:

$$EP_k=\sum_{x}^{w}\sum_{y}^{h}\Big(\big|\{\,p_c \mid p_c>p_i\,\}\big|+\big|\{\,p_c \mid p_c<p_i\,\}\big|\Big),\qquad p_i\in A_c\tag{6}$$
where w and h are the width and height of the BIMF I_k; p_c is the pixel at coordinate (x,y); A_c is an n × n local area centered on p_c; and p_i is any pixel other than p_c within A_c, i = 1, 2, …, n × n − 1. Figure 6, plotted from 100 images, shows how the average EP_k changes with the decomposition order k for BIMFs of orders 1 to 6.
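A direct implementation of Eq. (6) reuses the strict-extrema test of section 3.1; the window size and names are illustrative:

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def ep_measure(bimf, win=3):
    """EP_k of Eq. (6): the number of strict local maxima plus strict
    local minima within a win x win neighborhood A_c."""
    fp = np.ones((win, win), dtype=bool)
    fp[win // 2, win // 2] = False            # exclude the center pixel p_c
    neigh_max = maximum_filter(bimf, footprint=fp, mode="nearest")
    neigh_min = minimum_filter(bimf, footprint=fp, mode="nearest")
    return int(np.sum(bimf > neigh_max) + np.sum(bimf < neigh_min))
```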

Fig. 6 The relationship between the average number of extreme points and the decomposition order.

5.2.2 The sum of contrast ratio in each BIMF

The contrast ratio can be defined as the difference between disparate pixels: the larger the difference, the easier it is for the human eye to distinguish, and the more information the image contains. Through FBEEMD one obtains a set of multi-scale BIMFs. Fine-scale BIMFs have a larger contrast ratio because they contain more detail information, while large-scale BIMFs have a smaller contrast ratio with less detail. Therefore, the sum of the local contrast ratio (CV_k) can serve as a criterion to quantify the detail information contained in a BIMF:

$$CV_k=\sum_{x=1}^{w}\sum_{y=1}^{h}(p_c-p_i)^2,\qquad p_i\in A_c\tag{7}$$
where w and h are the width and height of the BIMF I_k; p_c is the pixel at coordinate (x,y); A_c is an n × n local area centered on p_c; and p_i is any pixel other than p_c within A_c, i = 1, 2, …, n × n − 1. Figure 7, plotted from 100 images, shows how CV_k changes with the decomposition order k for BIMFs of orders 1 to 6.
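Eq. (7) can be implemented with shifted copies of the BIMF, summing the squared difference between each center pixel and every other pixel of its neighborhood (edge replication at the border is our assumption):

```python
import numpy as np

def cv_measure(bimf, win=3):
    """CV_k of Eq. (7): sum over all pixels of (p_c - p_i)^2 for every
    neighbor p_i in the win x win area A_c."""
    img = np.asarray(bimf, dtype=float)
    h, w = img.shape
    r = win // 2
    padded = np.pad(img, r, mode="edge")
    total = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue                      # skip p_i = p_c
            shifted = padded[r + dy:r + dy + h, r + dx:r + dx + w]
            total += np.sum((img - shifted) ** 2)
    return total
```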

Fig. 7 The relationship between the sum of contrast ratio and the decomposition order.

5.2.3 Computing the framework weights from the measurements

It can be observed in Figs. 6 and 7 that both values decrease monotonically as the decomposition order increases. These values are calculated adaptively from each BIMF's own features, without any external parameters. It cannot be proven that the two methods precisely quantify detail information, but a large number of experiments show that the values reflect the relative quantity of detail information between different images. Therefore EP_k and CV_k are adopted in the fusion computing framework.

Using EP_k and CV_k to quantify the detail information of the i-th BIMFs of I and I′, one obtains the values EP_i and EP′_i, and CV_i and CV′_i, respectively. Substituting these four values into Eq. (5) yields two schemes for computing the framework weights:

$$\lambda_{EP_i}=\frac{\dfrac{1}{EP_i}+\dfrac{1}{EP'_i}}{\displaystyle\sum_{i=1}^{n}\frac{1}{EP_i}+\sum_{i=1}^{n}\frac{1}{EP'_i}}\tag{8}$$
$$\lambda_{CV_i}=\frac{\dfrac{1}{CV_i}+\dfrac{1}{CV'_i}}{\displaystyle\sum_{i=1}^{n}\frac{1}{CV_i}+\sum_{i=1}^{n}\frac{1}{CV'_i}}\tag{9}$$
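Both schemes reduce to the same computation on the per-BIMF measures of the two images. The sketch below (hypothetical names, assuming the `ep_measure`/`cv_measure` sketches above) shows that the weights favor large-scale BIMFs and sum to one:

```python
import numpy as np

def fusion_weights(vals_a, vals_b):
    """Weights of Eqs. (8)/(9): BIMFs with small measures (large scales)
    receive large weights, and the weights sum to one."""
    inv_a = 1.0 / np.asarray(vals_a, dtype=float)
    inv_b = 1.0 / np.asarray(vals_b, dtype=float)
    return (inv_a + inv_b) / (inv_a.sum() + inv_b.sum())

# Hypothetical usage with the EP measure and the distance of Eq. (4):
# lam = fusion_weights([ep_measure(b) for b in bimfs_a],
#                      [ep_measure(b) for b in bimfs_b])
# D = sum(l * np.linalg.norm(a - b) for l, a, b in zip(lam, bimfs_a, bimfs_b))
```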

6. Experiments and analysis

Many methods have been proposed to eliminate interference from illumination variation. We compare three of them with our method: SQI [54], LOG-DCT [34], and BEMD [36]. The images are recognized by PCA after the illuminated image has been normalized by each of the four schemes. Finally, we arrange an experiment to test the performance of our de-illumination method with four different face-recognition algorithms.

All these experiments are evaluated on the Yale B [24], PIE [55], and FERET [56] databases, respectively. All face images are cropped to remove the background and hair, and only frontal face images are used as training and testing samples.

6.1 Reconstructing illumination-invariant face images

A typical example presents the performance of our scheme in eliminating the illumination variations of five face images captured under disparate lighting conditions.

In our scheme, we first take the logarithm transform to convert a face image into a sum of two components, one low-pass and one high-pass. Second, FBEEMD is performed to separate these components into a set of multi-scale BIMFs. Third, three effective BIMFs are fused together as the illumination-invariant facial feature, with any image under good illumination used as the reference image for measuring feature distance. Finally, an illumination-invariant face is reconstructed as the input to the recognition algorithms.
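Putting the pieces together, one plausible end-to-end sketch of these four steps is shown below. It assumes the `fbeemd`, `ep_measure`, and `fusion_weights` sketches from earlier sections; the use of the weights as fusion coefficients and the choice of BIMF2–BIMF4 follow our reading of the text (the paper defines the weights for the feature distance of Eq. (4)), so treat this as illustrative rather than the authors' exact procedure.

```python
import numpy as np

def deilluminate(face, ref_bimfs, n_bimfs=4, keep=(1, 2, 3)):
    """Sketch of the proposed scheme: log transform -> FBEEMD ->
    weight the detail BIMFs against a well-lit reference -> reconstruct.
    `keep` selects BIMF2-BIMF4 (0-based indices), as in Fig. 5."""
    log_face = np.log1p(np.asarray(face, dtype=float))   # log domain, Eq. (3)
    bimfs, _ = fbeemd(log_face, n_bimfs)                 # ensemble decomposition
    lam = fusion_weights([ep_measure(b) for b in bimfs],
                         [ep_measure(b) for b in ref_bimfs])
    detail = sum(lam[i] * bimfs[i] for i in keep)        # fused detail feature
    out = np.expm1(detail - detail.min())                # back from the log domain
    return out / (out.max() + 1e-12)                     # normalized to [0, 1]
```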

Figure 8 shows example images under different illumination conditions, with the original images in the upper row and the corresponding processed images in the lower row. The upper row exhibits large changes in illumination, which may affect recognition rates more than the differences among individuals. After de-illumination, the corresponding illumination-neutral faces shown in the lower row are obtained, and recognition rates are greatly enhanced with such processed face images.

Fig. 8 Example of normalized face images: the upper row shows the original images and the lower row shows the images processed by the proposed method.

6.2 Experiments based on Yale B face database

The Yale B database contains 10 individuals in 9 different poses, with 64 different illumination conditions for each pose. We use the 64 frontal facial images per individual to evaluate the four illumination compensation methods. The resulting 640 images are divided into five subsets according to illumination angle: subset 1 (0°–12°), subset 2 (13°–25°), subset 3 (26°–50°), subset 4 (51°–77°), and subset 5 (≥78°), containing 70, 120, 120, 140, and 190 images, respectively.

The first experiment uses subset 1 (70 images), with the best illumination conditions, as training samples and the remaining 570 facial images as test samples. The recognition rates of the four methods are compared in Fig. 9.

Fig. 9 The recognition rates in experiment #1 using four methods based on the Yale B database (%).

The second experiment uses subset 4 (140 images), with the worst illumination conditions, as training samples and the remaining 500 facial images as test samples. The recognition rates of the four methods are compared in Fig. 10.

Fig. 10 The recognition rates in experiment #2 using four methods based on the Yale B database (%).

The third experiment uses one facial image of each person from subset 1 (10 images) as training samples and the remaining 630 facial images as test samples. The recognition rates of the four methods are compared in Fig. 11.

Fig. 11 The recognition rates in experiment #3 using four methods based on the Yale B database (%).

6.3 Experiments based on PIE face database

PIE is another database often used for studies of illumination variation. It consists of 68 individuals (28 wearing glasses) at 3 poses (frontal, side, and profile) under illumination from 21 different directions as well as under ambient light only. We use only the illumination subset (C27) as experimental data; this subset has 21 facial images of each individual under different illumination conditions. The experiment uses one face image of each person (68 images) as training samples and the other 1380 facial images as test samples. The recognition rates of the four methods are compared in Fig. 12.

Fig. 12 The recognition rates using four methods based on the PIE database (%).

6.4 Experiments based on training sample of variable size

We implement a comparative experiment in which the training sample sets vary in size from 2 to 10. Each run involves training with the specified training set, recognition, and recording of the recognition rate. For PCA, nine runs are performed with the same four de-illumination methods: SQI [54], LOG-DCT [34], BEMD [36], and ours. Performance is quantified by the recognition rate over all runs; the recognition rates for the different methods are presented in Fig. 13.

Fig. 13 Recognition rate for SQI [54], LOG-DCT [34], BEMD [36] and ours.

6.5 Experiments using different recognition algorithms based on FERET face database

The FERET database is one of the most famous databases for the evaluation of face recognition algorithms. All frontal face images are divided into five categories: fa, fb, fc, dup1, and dup2. The fa and fc images were taken under different illumination conditions. As we are concerned only with the illumination problem, 1196 fa images are used as the gallery and 194 fc images are used as the test set.

We conduct a comparison experiment on the FERET database to present the performance of our compensation method with four different classical recognition algorithms (PCA [4, 5], LDA [6], LFA [9], and EBGM [10]). Through the proposed method, 1390 illumination-neutral face images are obtained as the input to the four algorithms. The recognition results are shown in Fig. 14.

Fig. 14 Recognition rate with or without our pre-processing scheme for PCA [4, 5], LDA [6], LFA [9] and EBGM [10].

6.6 Analysis

The above experiments show that the de-illumination method proposed in this paper outperforms the other three classical approaches. With our method, the recognition rate is improved markedly in small-training-sample experiments such as experiment #3 on Yale B, the PIE experiment, and the experiment with training samples of variable size. Moreover, the last experiment shows that our pre-processing scheme can be applied to other classical recognition algorithms to reduce recognition errors effectively.

7. Conclusion

High-frequency components corresponding to facial detail information can be treated as illumination-invariant features, since they are little affected by illumination changes. FBEEMD can efficiently decompose a facial image into multi-scale BIMFs, each containing detail features of different quantity and frequency. A fusion computation framework has been proposed to extract the effective BIMFs, together with two adaptive measurements, free of external parameters, that quantify detail information and compute the fusion weights. A number of experiments demonstrate that the proposed method is efficient. Moreover, the scheme can also be applied in many optical fields to eliminate illumination components or DC background.

Acknowledgements

The authors are grateful for the research support received from the National High Technology Research and Development Program of China (863 Program No. 2012AA040106) and the National Natural Science Foundation of China (NSFC No. 10972137).

References and links

1. R. Chellappa, C. L. Wilson, and S. Sirohey, “Human and machine recognition of faces: a survey,” Proc. IEEE 83(5), 705–741 (1995).

2. W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition: A literature survey,” ACM Comput. Surv. 35(4), 399–458 (2003).

3. M. Park, C.-W. Park, M. Park, and C.-H. Lee, “Algorithm for detecting human faces based on convex-hull,” Opt. Express 10(6), 274–279 (2002).

4. M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cogn. Neurosci. 3(1), 71–86 (1991).

5. L. Sirovich and M. Kirby, “Low-dimensional procedure for the characterization of human faces,” J. Opt. Soc. Am. A 4(3), 519–524 (1987).

6. P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: recognition using class specific linear projection,” IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997).

7. M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, “Face recognition by independent component analysis,” IEEE Trans. Neural Netw. 13(6), 1450–1464 (2002).

8. C. Liu and H. Wechsler, “Independent component analysis of Gabor features for face recognition,” IEEE Trans. Neural Netw. 14(4), 919–928 (2003).

9. P. S. Penev and J. J. Atick, “Local feature analysis: a general statistical theory for object representation,” Network: Comput. Neural Syst. 7(3), 477–500 (1996).

10. L. Wiskott, J. M. Fellous, N. Kuiger, and C. von der Malsburg, “Face recognition by elastic bunch graph matching,” IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 775–779 (1997).

11. M. J. Er, S. Wu, J. Lu, and H. L. Toh, “Face recognition with radial basis function (RBF) neural networks,” IEEE Trans. Neural Netw. 13(3), 697–710 (2002).

12. A. Lanitis, C. J. Taylor, and T. F. Cootes, “Automatic face identification system using flexible appearance models,” Image Vis. Comput. 13(5), 393–401 (1995).

13. V. Blanz and T. Vetter, “Face recognition based on fitting a 3D morphable model,” IEEE Trans. Pattern Anal. Mach. Intell. 25(9), 1063–1074 (2003).

14. A. V. Nefian and M. H. Hayes III, “Face recognition using an embedded HMM,” in Proceedings of IEEE Conference on Audio and Video-based Biometric Person Authentication (1999), pp. 19–24.

15. G. Guo, S. Z. Li, and K. L. Chan, “Support vector machines for face recognition,” Image Vis. Comput. 19(9-10), 631–638 (2001).

16. B. Guo, K.-M. Lam, K.-H. Lin, and W.-C. Siu, “Human face recognition based on spatially weighted Hausdorff distance,” Pattern Recognit. Lett. 24(1-3), 499–507 (2003).

17. Y. Gao and M. K. H. Leung, “Face recognition using line edge map,” IEEE Trans. Pattern Anal. Mach. Intell. 24(6), 764–779 (2002).

18. J. Yang, A. F. Frangi, J. Y. Yang, D. Zhang, and Z. Jin, “KPCA plus LDA: a complete kernel Fisher discriminant framework for feature extraction and recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 27(2), 230–244 (2005).

19. S. K. Zhou and R. Chellappa, “Image-based face recognition under illumination and pose variations,” J. Opt. Soc. Am. A 22(2), 217–229 (2005).

20. L. Cao, Q. He, C. Ouyang, Y. Liao, and G. Jin, “Improvement to human-face recognition in a volume holographic correlator by use of speckle modulation,” Appl. Opt. 44(4), 538–545 (2005).

21. A. Alfalou and C. Brosseau, “Robust and discriminating method for face recognition based on correlation technique and independent component analysis model,” Opt. Lett. 36(5), 645–647 (2011).

22. Y. Liao, Y. Guo, L. Cao, X. Ma, Q. He, and G. Jin, “Experiment on parallel correlated recognition of 2030 human faces based on speckle modulation,” Opt. Express 12(17), 4047–4052 (2004).

23. J. García, J. Valles, and C. Ferreira, “Detection of three-dimensional objects under arbitrary rotations based on range images,” Opt. Express 11(25), 3352–3358 (2003).

24. A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, “From few to many: illumination cone models for face recognition under variable lighting and pose,” IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 643–660 (2001).

25. A. Mian, “Illumination invariant recognition and 3D reconstruction of faces using desktop optics,” Opt. Express 19(8), 7491–7506 (2011).

26. H. Song, S. Lee, J. Kim, and K. Sohn, “Three-dimensional sensor-based face recognition,” Appl. Opt. 44(5), 677–687 (2005).

27. P. K. Baheti and M. A. Neifeld, “Adaptive feature-specific imaging: a face recognition example,” Appl. Opt. 47(10), B21–B31 (2008).

28. Y. Kim, J. Na, S. Yoon, and J. Yi, “Masked fake face detection using radiance measurements,” J. Opt. Soc. Am. A 26(4), 760–766 (2009).

29. A. Shashua and T. Riklin-Raviv, “The quotient image: class-based re-rendering and recognition with varying illuminations,” IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 129–139 (2001).

30. M. Savvides and B. V. K. V. Kumar, “Illumination normalization using logarithm transforms for face authentication,” in Lecture Notes in Computer Science (Springer-Verlag, Berlin, 2003), pp. 549–556.

31. S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J. B. Zimmerman, and K. Zuiderveld, “Adaptive histogram equalization and its variations,” Comput. Vis. Graph. Image Process. 39(3), 355–368 (1987).

32. X. Xie and K.-M. Lam, “Face recognition under varying illumination based on a 2D face shape model,” Pattern Recognit. 38, 221–230 (2005).

33. W. Zhao and R. Chellappa, “Illumination-insensitive face recognition using symmetric shape-from-shading,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (Hilton Head Island, South Carolina, 2000), pp. 286–293.

34. W. Chen, M. J. Er, and S. Wu, “Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain,” IEEE Trans. Syst. Man Cybern. Part B-Cybern. 36(2), 458–466 (2006).

35. S. L. Wijaya, M. Savvides, and B. V. Vijaya Kumar, “Illumination-tolerant face verification of low-bit-rate JPEG2000 wavelet images with advanced correlation filters for handheld devices,” Appl. Opt. 44(5), 655–665 (2005).

36. M. Shao, Y. Wang, and X. Ling, “A BEMD based normalization method for face recognition under variable illuminations,” in Proceedings of IEEE Conference on Acoustics, Speech and Signal Processing (ICASSP) (Dallas, Texas, 2010), pp. 1114–1117.

37. Z. Wu, N. E. Huang, and X. Chen, “The multi-dimensional ensemble empirical mode decomposition method,” Adv. Adapt. Data Anal. 1(3), 339–372 (2009).

38. J. C. Nunes, Y. Bouaoune, E. Delechelle, O. Niang, and P. Bunel, “Image analysis by bidimensional empirical mode decomposition,” Image Vis. Comput. 21(12), 1019–1026 (2003).

39. J. C. Nunes, S. Guyot, and E. Deléchelle, “Texture analysis based on local analysis of the Bidimensional Empirical Mode Decomposition,” Mach. Vis. Appl. 16, 177–188 (2005).

40. J. C. Nunes, O. Niang, Y. Bouaoune, E. Delechelle, and P. Bunel, “Bidimensional empirical mode decomposition modified for texture analysis,” in Proceedings of Image Analysis, J. Bigun and T. Gustavsson, eds. (Springer, Berlin, 2003), pp. 171–177.

41. X. Zhou, A. G. Podoleanu, Z. Yang, T. Yang, and H. Zhao, “Morphological operation-based bi-dimensional empirical mode decomposition for automatic background removal of fringe patterns,” Opt. Express 20(22), 24247–24262 (2012).

42. X. Zhou, T. Yang, H. Zou, and H. Zhao, “Multivariate empirical mode decomposition approach for adaptive denoising of fringe patterns,” Opt. Lett. 37(11), 1904–1906 (2012).

43. Z. Wu and N. E. Huang, “Ensemble empirical mode decomposition: a noise-assisted data analysis method,” Adv. Adapt. Data Anal. 1(1), 1–41 (2009).

44. X. Zhou, H. Zhao, and T. Jiang, “Adaptive analysis of optical fringe patterns using ensemble empirical mode decomposition algorithm,” Opt. Lett. 34(13), 2033–2035 (2009).

45. Y. Zhou and H. Li, “Adaptive noise reduction method for DSPI fringes based on bi-dimensional ensemble empirical mode decomposition,” Opt. Express 19(19), 18207–18215 (2011).

46. A. Linderhed, “Variable sampling of the empirical mode decomposition of two-dimensional signals,” Int. J. Wavelets Multi. 3(3), 435–452 (2005).

47. Z. Liu and S. Peng, “Boundary processing of bidimensional EMD using texture synthesis,” IEEE Signal Process. Lett. 12(1), 33–36 (2005).

48. S. M. A. Bhuiyan, R. R. Adhami, and J. F. Khan, “Fast and adaptive bidimensional empirical mode decomposition using order-statistics filter based envelope estimation,” EURASIP J. Adv. Signal Process. 2008(164), 725356 (2008).

49. S. M. A. Bhuiyan, R. R. Adhami, and J. F. Khan, “A novel approach of fast and adaptive bidimensional empirical mode decomposition,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (Institute of Electrical and Electronics Engineers, 2008), pp. 1313–1316.

50. N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung, and H. H. Liu, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis,” Proc. R. Soc. Lond. A 454(1971), 903–995 (1998).

51. N. E. Huang, M.-L. C. Wu, S. R. Long, S. S. P. Shen, W. Qu, P. Gloersen, and K. L. Fan, “A confidence limit for the empirical mode decomposition and Hilbert spectral analysis,” Proc. R. Soc. Lond. A 459(2037), 2317–2345 (2003).

52. K. Patorski, K. Pokorski, and M. Trusiak, “Fourier domain interpretation of real and pseudo-moiré phenomena,” Opt. Express 19(27), 26065–26078 (2011).

53. E. H. Land and J. J. McCann, “Lightness and Retinex theory,” J. Opt. Soc. Am. 61(1), 1–11 (1971).

54. H. Wang, S. Z. Li, and Y. Wang, “Face recognition under varying lighting conditions using self quotient image,” in Proceedings of IEEE Conference on Automatic Face and Gesture Recognition (Institute of Electrical and Electronics Engineers, 2004), pp. 819–824.

55. T. Sim, S. Baker, and M. Bsat, “The CMU Pose, Illumination, and Expression (PIE) database,” in Proceedings of IEEE Conference on Automatic Face and Gesture Recognition (Institute of Electrical and Electronics Engineers, 2002), pp. 46–51.

56. P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” Image Vis. Comput. 16, 295–306 (1998).
