EmguCV Cut Face+Neck Skin Only And Save New Image
In my app, I input an image of a person and I want to get only the face and neck of that person as output in a separate image. Example: the image below as input (Source: http://www.fremantlepress.com.au), and this image as output: I want to perform the following algorithm:
1. Detect face
2. Select (face region * 2) area
3. Detect skin and neck
4. Cut the skin region of the selected image
5. Save that cut region into a new image
After going through the EmguCV wiki and other online resources, I am confident I can perform steps 1 and 2, but I am not sure how to accomplish steps 3 and 4. There are some functions/methods I am looking at (Canny Edge Detection, Contour etc.), but I am not sure how and where I should apply them. I am using EmguCV (C#) in a Windows Forms application. Please help me with steps 3 and 4. I would be glad if someone could elaborate on these two steps and provide some code as well.
Well, there are several ways you could approach this. Edge detection will only give you a binary image of edges, and you will have to perform some line tracing or Hough transforms to detect their location. Their accuracy will vary.
I will assume for now that you can detect the eyes and the relative location of the face. I would expect a statistical filter to provide a favourable outcome with better performance than a neural network, which is the best alternative. A good alternative is naturally colour segmentation if colour images are used (this is far easier to implement). I will also assume that the head position can change slightly, with the neck being more or less visible within an image.
So for a Statistical Filter:
(Note that the background of the individual is similar to the face data when dealing with a greyscale image so a colour image would be better to work with).
1. Take a blank copy of the original image. We will form a binary map of the face on this; while not necessary, it makes it easier to examine our success.
2. Find the face, eyes and mouth in the original image.
3. We can assume that any data from the eyes and mouth form part of the face, so mark these on the blank copy with "1"s.
4. Now we need a bit of maths. The face detection algorithm can only detect a face at a certain angle to the camera, so we know roughly where the facial features lie within the detected region. We use this to select statistical masks from certain parts of the image, say two or three 10x10 pixel patches from the cheek area, as this is the most likely area of the face within the image. From this data we calculate values such as the mean and standard deviation.
5. We now scan across the segmented part of the image where we detected the face. We won't scan the whole image, as this would take a long time. (Note: there is a border half the size of the mask that won't be examined.) We examine each pixel and its surrounding neighbours up to the size of the 10x10 mask. If the mean or standard deviation (whichever we are examining) is similar to that of our filter, say within 10%, we mark this pixel in our blank copy as a "1" and consider it to belong to the skin.
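The scan described above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than EmguCV, so that the logic is visible without depending on any particular library call. It assumes a greyscale image stored as a list of rows, uses only the mean as the statistic, and the names `patch_mean`, `statistical_skin_map` and `sample_corners` are made up for this example.

```python
from statistics import mean

def patch_mean(img, top, left, size):
    """Mean grey level of the size x size patch whose top-left corner is (top, left)."""
    return mean(img[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size))

def statistical_skin_map(img, sample_corners, size=10, tolerance=0.10):
    """Return a binary map with 1 where the local mean is close to the reference.

    img            -- greyscale image as a list of rows (assumed layout)
    sample_corners -- top-left corners of the sampled patches, e.g. on the cheeks
    tolerance      -- allowed relative difference (0.10 means within 10%)
    """
    rows, cols = len(img), len(img[0])
    half = size // 2
    # Reference statistic: average of the means of the sampled patches.
    reference = mean(patch_mean(img, t, l, size) for t, l in sample_corners)
    binary = [[0] * cols for _ in range(rows)]
    # A border of half the mask size is left unexamined.
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            local = patch_mean(img, r - half, c - half, size)
            if abs(local - reference) <= tolerance * reference:
                binary[r][c] = 1
    return binary
```

Recomputing the patch mean for every pixel is slow for large masks; it is written this way only to stay close to the description. In a real application you would restrict `img` to the detected face region first, as suggested above.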
As for Colour Segmentation:
(Note: you could also try this process on a greyscale image; however, it will be less successful due to the brickwork behind the individual.)
Repeat steps 1 and 2 of the statistical filter (take a blank copy, then find the face, eyes and mouth).
Again we select certain areas of the image that we can expect to contain face data (e.g. 10 pixels below the eye). In this case, however, we examine the data that forms the colour of this pixel. Don't forget that HSV images can give better results with this process, and a combination of colour spaces more so. We can then scan across the image, examining each pixel for a similar colour. If it matches, mark it on your binary map.
An alternative is subtracting or adding a calculated value from the R, G and B channels of the image so that only the face data will survive. You can convert this directly to a binary image by setting any value > 1 to 1.
This will only work for skin; for the hair we will need other filters. A few notes:
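As a rough sketch of the colour comparison just described, the following plain Python example samples a few pixels, converts to HSV with the standard `colorsys` module, and marks every pixel with a similar hue and saturation. The function names, the image layout (rows of `(R, G, B)` tuples with values 0 to 255) and the two tolerance values are assumptions chosen for illustration, not tuned thresholds.

```python
import colorsys

def to_hsv(pixel):
    """Convert an (R, G, B) pixel with 0-255 channels to HSV in the range 0-1."""
    r, g, b = pixel
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

def hue_distance(h1, h2):
    """Distance between two hues, allowing for the wrap-around at red."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)

def colour_skin_map(rgb_img, sample_points, hue_tol=0.05, sat_tol=0.25):
    """Return a binary map with 1 where a pixel's colour is close to the samples.

    sample_points -- (row, col) positions expected to be skin, e.g. below the eyes
    hue_tol, sat_tol -- illustrative tolerances, to be adjusted by trial and error
    """
    n = len(sample_points)
    # Average the samples in RGB first, then convert once to HSV.
    avg_rgb = tuple(sum(rgb_img[r][c][ch] for r, c in sample_points) / n
                    for ch in range(3))
    ref_h, ref_s, _ = to_hsv(avg_rgb)
    binary = []
    for row in rgb_img:
        out = []
        for pixel in row:
            h, s, _ = to_hsv(pixel)
            close = hue_distance(h, ref_h) <= hue_tol and abs(s - ref_s) <= sat_tol
            out.append(1 if close else 0)
        binary.append(out)
    return binary
```

The brightness component is deliberately ignored here so that shadows on the face and neck are less likely to be rejected; whether that helps depends on the image.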
A statistical filter working on a colour image is far more capable; however, it takes longer.
Use data from the image to form your statistical filter, as this allows other skin colours to be classified. A mathematically designed filter or colour segmentation will require a lot of work to achieve the same variability.
The size of the mask is important: the larger the mask, the less likely errors are to occur, but again processing time increases.
You can speed up the process by referencing the same area within the binary map copy: if the pixel you're examining is already a 1 (classified by eye/nose/mouth detection), there is no need to examine it again, so just skip it.
Multiple skin filters will provide better results; however, they may also introduce more noise, and remember that each filter must then be compared with a pixel, increasing processing time.
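The last two notes can be illustrated together with a small plain Python sketch. It assumes each filter is simply a function that takes a pixel value and returns True or False; `apply_filters` and the example ranges are hypothetical names and numbers, not part of any library.

```python
def apply_filters(img, binary, filters):
    """Mark pixels accepted by any filter; return how many pixels were tested.

    img     -- image as a list of rows of pixel values
    binary  -- binary map of the same size, updated in place
    filters -- list of functions mapping a pixel value to True/False (assumed form)
    """
    tested = 0
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if binary[r][c] == 1:
                # Already classified (e.g. by eye or mouth detection), so skip it.
                continue
            tested += 1
            # any() stops at the first filter that accepts the pixel.
            if any(f(value) for f in filters):
                binary[r][c] = 1
    return tested

# Example with two made-up grey-level ranges standing in for two skin filters.
filters = [lambda v: 95 <= v <= 105, lambda v: 190 <= v <= 210]
binary = [[1, 0, 0]]
tested = apply_filters([[100, 150, 200]], binary, filters)
```

Counting the tested pixels makes the trade-off visible: every extra filter adds a comparison for each pixel that is not skipped.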
To get an algorithm working accurately will require a bit of trial and error, but you should see comparable results fairly quickly using these methods.
I hope this helps you on your way. Sorry for not including any code, but hopefully others can help you where you get stuck, and writing it yourself will help you understand what is going on and allow you to cut down on processing time. Let me know if you require any additional advice. I'm doing my PhD in image analysis, just so you know that the advice is sound.
[EDIT] Some quick results:
Here is a 20x20 filter applied to detecting the hair. The program I've written only works on greyscale images at the moment, so the skin detection suffers interference from the stone (see later).
Colour Image of Face Region
Binary Map of Average Hair Filter 20x20 Mask 40% Error allowed
As can be observed, there is interference from the shirt in this case, as it matches the colour of the hair. This can be eliminated by simply examining only the top third or half of the detected facial region.
Binary Map of Average Skin Filter 20x20 Mask 40% Error allowed
In this image I use only one filter, formed from the chin area, as the stubble obviously changes the filter's behaviour. There is still noise from the stone behind the individual; however, using a colour image could eliminate this. The gaps in the face could be filled by an algorithm or another filter. Again there is noise from the edge of the shirt, but we could minimise this either by detecting the shirt and removing any data that forms it, or by simply looking only in certain areas.
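The answer does not say which algorithm to use for filling the gaps. One common choice, offered here only as a suggestion, is morphological closing (a dilation followed by an erosion) on the binary map. The plain Python sketch below uses a fixed 3x3 neighbourhood that is clipped at the image border; all function names are illustrative.

```python
def _neighbourhood(binary, r, c):
    """Values in the 3x3 neighbourhood of (r, c), clipped at the image border."""
    rows, cols = len(binary), len(binary[0])
    return [binary[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))]

def dilate(binary):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return [[max(_neighbourhood(binary, r, c)) for c in range(len(binary[0]))]
            for r in range(len(binary))]

def erode(binary):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return [[min(_neighbourhood(binary, r, c)) for c in range(len(binary[0]))]
            for r in range(len(binary))]

def close_gaps(binary):
    """Morphological closing: fills small holes without growing the region."""
    return erode(dilate(binary))

# A 5x5 skin region with a one-pixel hole in the middle.
holed = [[1] * 5 for _ in range(5)]
holed[2][2] = 0
filled = close_gaps(holed)
```

Larger gaps would need a larger neighbourhood or repeated passes, at the cost of also merging nearby noise into the skin region.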
Examples of the Regions to Inspect
To eliminate false classification, you could take the top two thirds of the segmented image to look for the face, and the width of the detected eyes down to the bottom of the facial region to look for neck data.
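One possible reading of these regions, expressed as rectangles, is sketched below in plain Python. The exact vertical start of the neck region is not stated, so this example assumes it begins where the face region ends; the function name and the `(x, y, width, height)` tuple layout are also assumptions.

```python
def regions_to_inspect(width, height, left_eye_x, right_eye_x):
    """Return (face_region, neck_region) as (x, y, width, height) rectangles.

    width, height           -- size of the segmented (face region * 2) image
    left_eye_x, right_eye_x -- x coordinates of the detected eyes
    """
    # Face: the top two thirds of the segmented image.
    face_bottom = (2 * height) // 3
    face_region = (0, 0, width, face_bottom)
    # Neck: the column span between the eyes, from the end of the face
    # region down to the bottom of the image (assumed starting row).
    neck_x = min(left_eye_x, right_eye_x)
    neck_width = abs(right_eye_x - left_eye_x)
    neck_region = (neck_x, face_bottom, neck_width, height - face_bottom)
    return face_region, neck_region

face_region, neck_region = regions_to_inspect(300, 450, 100, 200)
```

Running the skin filter only inside these two rectangles keeps the shirt edges and most of the background out of the binary map.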