Face Alignment with Part-Based Modeling


The problem

The problem tackled in this work is to find the positions of a set of landmark points in a novel face image. We assume access to a database of face images in which each image has these points marked. This database can be used for training.

The solution

Regression

We solve the problem via regression: we learn a mapping from a feature description of the face's appearance (PHOG in this work) to the coordinates of the landmark points. The parameters of this mapping are learnt from the training data.

We want to model this mapping as a linear function because of this model's simplicity and ability to generalize. However, the mapping from a patch corresponding to the whole face is neither simple nor linear.
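To make the idea of a linear mapping concrete, here is a minimal sketch of fitting one by regularized least squares. It is an illustration only, not the implementation used in this work: the function names, the ridge penalty `reg`, and the synthetic data standing in for PHOG descriptors are all assumptions.

```python
import numpy as np

def fit_linear_map(features, landmarks, reg=1e-3):
    """Fit W so that [features, 1] @ W approximates the landmark coordinates.

    features:  (n, d) array, one descriptor per training image (hypothetical stand-in for PHOG).
    landmarks: (n, 2 * k) array, flattened (x, y) coordinates of k landmarks.
    reg:       ridge penalty, an assumption of this sketch.
    """
    n, d = features.shape
    X = np.hstack([features, np.ones((n, 1))])  # append a bias column
    penalty = reg * np.eye(d + 1)
    penalty[-1, -1] = 0.0                       # do not penalize the bias term
    return np.linalg.solve(X.T @ X + penalty, X.T @ landmarks)

def predict_landmarks(W, features):
    """Apply the learnt linear mapping to new descriptors."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ W

# Synthetic data generated from a known linear map, used only to exercise the code.
rng = np.random.default_rng(0)
true_w = rng.normal(size=(16, 10))              # 16-d descriptor, 5 landmarks
train_x = rng.normal(size=(200, 16))
train_y = train_x @ true_w + 3.0
W = fit_linear_map(train_x, train_y, reg=1e-8)
test_x = rng.normal(size=(5, 16))
pred = predict_landmarks(W, test_x)
```

On data that really is linear, as in this toy example, the fit recovers the mapping almost exactly. The difficulty described above is that real whole-face appearance does not behave this way.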

Use multiple linear regression functions

Therefore we have to approximate the mapping with multiple linear mappings. We do this with the introduction of parts.

We divide the face into four parts and associate a subset of the landmark points with each part. (The parts are selected by hand, but they are chosen so that each contains enough structure, with minimal intra-class variation, to be detected in novel face images and used in the regression mapping.)

The mapping from each part to its landmark points is now modeled as a linear function. Thus the original mapping is now described by 4 linear mappings.
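The part-wise scheme can be sketched as one linear regressor per part, each predicting only the landmarks assigned to that part. This is a hedged illustration: the part names, the landmark assignment, the descriptor sizes, and the ridge penalty `reg` are hypothetical, and synthetic data replaces real PHOG descriptors.

```python
import numpy as np

def fit_part_regressors(part_features, landmarks, part_to_landmarks, reg=1e-3):
    """Fit one linear map per part.

    part_features:     dict, part name -> (n, d_p) descriptor array for that part.
    landmarks:         (n, L, 2) array of landmark coordinates.
    part_to_landmarks: dict, part name -> list of landmark indices owned by the part.
    """
    n = landmarks.shape[0]
    models = {}
    for name, idx in part_to_landmarks.items():
        X = np.hstack([part_features[name], np.ones((n, 1))])  # bias column
        Y = landmarks[:, idx, :].reshape(n, -1)                # this part's targets only
        A = X.T @ X + reg * np.eye(X.shape[1])
        models[name] = np.linalg.solve(A, X.T @ Y)
    return models

def predict_all(models, part_features, part_to_landmarks, num_landmarks):
    """Assemble the full landmark set from the per-part predictions."""
    n = next(iter(part_features.values())).shape[0]
    out = np.zeros((n, num_landmarks, 2))
    for name, idx in part_to_landmarks.items():
        X = np.hstack([part_features[name], np.ones((n, 1))])
        out[:, idx, :] = (X @ models[name]).reshape(n, len(idx), 2)
    return out

# Hypothetical assignment of 8 landmarks to 4 parts, with synthetic data.
rng = np.random.default_rng(1)
parts = {"part_a": [0, 1], "part_b": [2, 3], "part_c": [4, 5], "part_d": [6, 7]}
feats = {name: rng.normal(size=(100, 12)) for name in parts}
landmarks = np.zeros((100, 8, 2))
for name, idx in parts.items():
    w = rng.normal(size=(12, 2 * len(idx)))
    landmarks[:, idx, :] = (feats[name] @ w).reshape(100, len(idx), 2)

models = fit_part_regressors(feats, landmarks, parts, reg=1e-8)
pred = predict_all(models, feats, parts, num_landmarks=8)
```

Each regressor sees only the descriptor of its own part, so its weights are learnt independently of the other parts.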

Finding the parts

However, in our quest to approximate our mapping with linear models we have introduced a new problem. We now have to estimate the position and scale of the parts in our test image.

Fortunately, Pictorial Structures is a well-known and efficient part-based model that we can use to find the parts, and this is what we do.
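As a rough illustration of how a pictorial structure locates parts, the sketch below minimizes the sum of per-part appearance costs and quadratic deformation costs. It makes several assumptions that are not taken from this work: a star-shaped model with one part acting as the root, position only (no scale search), hand-made cost maps, and brute-force search in place of the distance-transform speed-up that makes the real method efficient.

```python
import numpy as np

def fit_star_model(root_cost, part_costs, offsets, weight=1.0):
    """Brute-force inference for a star-shaped pictorial structure.

    root_cost:  (H, W) appearance cost of placing the root at each pixel.
    part_costs: list of (H, W) appearance cost maps, one per child part.
    offsets:    list of ideal (dy, dx) displacements of each child from the root.
    weight:     strength of the quadratic deformation penalty (an assumption).
    Returns (root_yx, list_of_part_yx, total_cost).
    """
    H, W = root_cost.shape
    ys, xs = np.mgrid[0:H, 0:W]
    best_cost, best_root, best_parts = np.inf, None, None
    for ry in range(H):
        for rx in range(W):
            total = root_cost[ry, rx]
            placements = []
            for cost, (dy, dx) in zip(part_costs, offsets):
                # Penalize squared distance from the ideal position relative to the root.
                deform = weight * ((ys - (ry + dy)) ** 2 + (xs - (rx + dx)) ** 2)
                combined = cost + deform
                py, px = divmod(int(np.argmin(combined)), W)
                total += combined[py, px]
                placements.append((py, px))
            if total < best_cost:
                best_cost, best_root, best_parts = total, (ry, rx), placements
    return best_root, best_parts, best_cost

def cost_map(shape, yx):
    """Toy appearance cost: 0 at the true location, 1 everywhere else."""
    cost = np.ones(shape)
    cost[yx] = 0.0
    return cost

# Four parts in total: one root and three children at known toy positions.
shape = (12, 12)
offsets = [(-2, -2), (-2, 2), (2, 0)]
root_cost = cost_map(shape, (5, 5))
part_costs = [cost_map(shape, (5 + dy, 5 + dx)) for dy, dx in offsets]
root, found_parts, total = fit_star_model(root_cost, part_costs, offsets, weight=0.1)
```

Because the children are conditionally independent given the root, each child can be optimized separately for every root position, which is the property that tree-structured models exploit.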

Results

Here are some sample results from the IMM dataset, where the images of the person in the test image have been omitted from the training images. The red lines show the landmark positions predicted by our method, while the green lines show the ground truth.

This video shows the results of applying the model learnt from the IMM dataset to a video captured at a very different time and with a different set-up.



When the test images are too far from the training dataset, we can improve the accuracy by correcting the output of the regression model. In the following results, a series of 10-20 images from the test data were corrected after the segmentation process, and the model was then retrained. These are the results obtained after retraining our model:

More details of this work are available in this paper:

Face Alignment with Part-Based Modeling [Oral]
V. Kazemi and J. Sullivan
In Proc. IEEE British Machine Vision Conference (BMVC 2011), Dundee, Scotland, Sept 2011.