Face Alignment with Part-Based Modeling
The problem

The problem tackled in this work is to find the positions of a set of landmark points in a novel face image. We assume access to a database of face images in which these points have been marked; this database can be used for training.
The solution

Regression

We solve the problem via regression: we learn a mapping from a feature description of the face's appearance (PHOG in this work) to the coordinates of the landmark points. The parameters of this mapping are learnt from the training data.
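As a rough illustration of what learning such a mapping can look like, the sketch below fits a linear map from feature vectors to flattened landmark coordinates with ridge-regularised least squares. It is only a sketch under stated assumptions: the PHOG descriptors are taken as already computed (random vectors stand in for them here), and the function names, array sizes and the `ridge` parameter are hypothetical and not taken from the original work.

```python
import numpy as np


def fit_linear_map(features, landmarks, ridge=1e-3):
    """Fit W so that [features, 1] @ W approximates landmarks.

    features:  (n_samples, n_features) appearance descriptor per training face
    landmarks: (n_samples, 2 * n_points) flattened (x, y) landmark coordinates
    ridge:     hypothetical regularisation strength (an assumption of this sketch)
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ landmarks)


def predict_landmarks(W, features):
    """Apply the learnt linear map to new feature vectors."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ W


# Synthetic stand-in data: 200 training faces, 16-D features, 5 landmark points.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(200, 16))
true_W = rng.normal(size=(17, 10))
train_landmarks = np.hstack([train_features, np.ones((200, 1))]) @ true_W

W = fit_linear_map(train_features, train_landmarks)
test_features = rng.normal(size=(5, 16))
predicted = predict_landmarks(W, test_features)
expected = np.hstack([test_features, np.ones((5, 1))]) @ true_W
```

On this noiseless synthetic data the fitted map reproduces the generating map closely; real face data would of course not be exactly linear, which is the motivation for the part-based approach.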
We want to model this mapping as a linear function because such a model is simple and generalizes well. However, the mapping from a patch covering the whole face is neither simple nor linear.

Use multiple linear regression functions

Therefore we approximate the mapping with multiple linear mappings by introducing parts. We divide the face into four parts and associate a subset of the landmark points with each part. (The parts are selected by hand, chosen so that they contain enough structure but minimal intra-class variation, so that they can be detected in novel face images and used in the regression mapping.)
The mapping from each part to its landmark points is now modeled as a linear function. Thus the original mapping is now described by 4 linear mappings.
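The per-part structure can be sketched as one independent linear regressor per part, each predicting only its own subset of landmark points, with the outputs assembled into the full landmark set. The assignment of points to parts, the feature dimensions and all names below are hypothetical placeholders chosen for illustration; the source does not specify them.

```python
import numpy as np

# Hypothetical assignment of 8 landmark points to four parts (illustration only).
PART_POINTS = {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
N_POINTS = 8


def fit_part(features, targets):
    """Least-squares fit of one part's linear map (with a bias term)."""
    X = np.hstack([features, np.ones((len(features), 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W


def fit_all_parts(part_features, landmarks):
    """part_features: part id -> (n_samples, n_features) array.
    landmarks: (n_samples, N_POINTS, 2) array of (x, y) coordinates."""
    models = {}
    for part, idx in PART_POINTS.items():
        targets = landmarks[:, idx, :].reshape(len(landmarks), -1)
        models[part] = fit_part(part_features[part], targets)
    return models


def predict_all_parts(models, part_features):
    """part_features: part id -> (n_features,) vector for a single face."""
    out = np.zeros((N_POINTS, 2))
    for part, idx in PART_POINTS.items():
        flat = np.append(part_features[part], 1.0) @ models[part]
        out[idx] = flat.reshape(len(idx), 2)
    return out


# Synthetic stand-in data: each part's landmarks depend linearly on its features.
rng = np.random.default_rng(1)
n = 50
feats = {p: rng.normal(size=(n, 6)) for p in PART_POINTS}
landmarks = np.zeros((n, N_POINTS, 2))
for p, idx in PART_POINTS.items():
    M = rng.normal(size=(6, 2 * len(idx)))
    landmarks[:, idx, :] = (feats[p] @ M).reshape(n, len(idx), 2)

models = fit_all_parts(feats, landmarks)
pred = predict_all_parts(models, {p: feats[p][0] for p in PART_POINTS})
```

Keeping the regressors independent means each one only has to explain the appearance variation of a small region, which is what makes a linear model more plausible per part than for the whole face.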
Finding the parts

However, in our quest to approximate the mapping with linear models we have introduced a new problem: we now have to estimate the position and scale of the parts in the test image. Fortunately, Pictorial Structures is a well-known, efficient part-based model that we can use to find the parts, and this is what we do.

Results

Here are some sample results from the IMM dataset, where the images of the person in the test image have been omitted from the training images. The red lines show the landmark positions predicted by our method, while the green lines show the ground truth.
This video shows the results of applying the model learnt from the IMM dataset to a video captured at a very different time and with a different set-up.
When the test images are too far from the training dataset, we can improve the accuracy by correcting the output of the regression model. For the following results, a series of 10-20 images from the test data was corrected after the segmentation process and the model was retrained. The following results were obtained after retraining our model.
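One way to picture the retraining step is sketched below: a handful of corrected test samples is appended to the original training set and the linear map is refitted. This is an assumption about how retraining might be organised (the source does not say whether the corrected images are added to the training set or replace it), and the data are synthetic: the training features never vary along one dimension that does vary in the test data, mimicking test images that lie far from the training set.

```python
import numpy as np


def fit(features, targets):
    """Least-squares linear map with a bias term."""
    X = np.hstack([features, np.ones((len(features), 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W


def predict(W, features):
    return np.hstack([features, np.ones((len(features), 1))]) @ W


def retrain_with_corrections(train_x, train_y, corrected_x, corrected_y):
    """Refit on the original training data plus the corrected test samples
    (assumption of this sketch: the two sets are simply concatenated)."""
    x = np.vstack([train_x, corrected_x])
    y = np.vstack([train_y, corrected_y])
    return fit(x, y)


# Synthetic data: 4 features plus bias -> 3 landmark points (x, y).
rng = np.random.default_rng(2)
true_W = rng.normal(size=(5, 6))
train_x = rng.normal(size=(100, 4))
train_x[:, 3] = 0.0                      # training set never varies along feature 3
train_y = predict(true_W, train_x)
test_x = rng.normal(size=(40, 4))        # test data does vary along feature 3
test_y = predict(true_W, test_x)

W_before = fit(train_x, train_y)
# Treat the first 15 test samples as hand-corrected and retrain.
W_after = retrain_with_corrections(train_x, train_y, test_x[:15], test_y[:15])

# Evaluate on the remaining, uncorrected test samples.
err_before = np.abs(predict(W_before, test_x[15:]) - test_y[15:]).mean()
err_after = np.abs(predict(W_after, test_x[15:]) - test_y[15:]).mean()
```

In this toy setting a small number of corrected samples is enough to recover the missing direction of variation, so the error on the remaining test samples drops after retraining.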
More details of this work are available in this paper:
Face Alignment with Part-Based Modeling [Oral]