Yet Another Face Recognition Demonstration on Images/Videos : Using Python and Tensorflow

Face verification and identification systems have become very popular in computer vision with advances in deep learning models such as Convolutional Neural Networks (CNNs). A few weeks ago, I decided to explore face recognition using deep learning based models. This blog-post demonstrates building a face recognition system from scratch.

Introduction

A face recognition system is a two-step process: face detection (locating a bounded face) in the image, followed by face identification (identifying the person) on the detected bounded face. The following two techniques are used for these respective tasks.

  1. Multi-Task Cascaded Convolutional Networks (MTCNN, 2016): It detects all the faces in an image and puts a bounding box around each of them.
  2. FaceNet CNN Model (FaceNet, 2015): It generates an embedding (a 512-dimensional feature vector in the pre-trained model used here) of the detected bounded face, which is then matched against the embeddings of the training faces of people in the database. This model is used to identify the person in the detected face.

Face Recognition System : Pipeline

Before moving ahead, let us understand the difference between the verification and identification tasks.

1. Face verification

It answers the question of whether the detected face belongs to a claimed person. For example, we may need to verify a person by matching the detected face against his/her stored historical facial images.

Verification is implemented using a threshold score (an empirical value): if the score is below the threshold, the match is considered positive; otherwise it is negative. The score is calculated as the Euclidean distance between the embedding vectors of the two faces in question. A low score means the detected face is close to the stored historical face of the person (and hence verified). Likewise, a high score means the two faces are different.
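The rule above can be sketched in a few lines of plain Python. This is only an illustration: the function names, the toy 3-dimensional vectors (standing in for real 512-dimensional embeddings) and the threshold of 1.0 are assumptions made for the example, not values taken from FaceNet.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_same_person(emb_a, emb_b, threshold):
    """Verified (True) when the distance falls below the empirical threshold."""
    return euclidean_distance(emb_a, emb_b) < threshold


# Toy vectors, hypothetical values chosen only for illustration.
stored = [0.1, 0.9, 0.4]
probe_close = [0.1, 0.8, 0.5]
probe_far = [0.9, 0.1, 0.2]

print(is_same_person(stored, probe_close, threshold=1.0))  # close to the stored face
print(is_same_person(stored, probe_far, threshold=1.0))    # far from the stored face
```

In practice the threshold has to be tuned on your own pairs of matching and non-matching faces.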

FaceNet achieved accuracies of 98.87% ± 0.15 and 99.63% ± 0.09 with two different settings on the LFW face verification task. The optimal threshold selected was 1.24. Labeled Faces in the Wild (LFW) is the de-facto academic test-set for face verification; it contains more than 13,000 labelled facial images collected from the web, covering 5,749 people, of whom 1,680 have two or more images.

2. Face Identification

It answers the question of who the person in the detected face is. For example, we may need to identify the person in the detected face against a database of 1000 people.

Identification can be implemented by training a simple multi-class classifier like K-NN or SVM over the face embeddings generated by FaceNet. As this post progresses, we will see how we can train a face classifier on our own data-set of people.
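To make the idea concrete, here is a minimal pure-Python K-NN sketch over embeddings. It is a stand-in for illustration only and is not what classifier.py does (that script trains an SVM); the names `knn_predict_proba` and `knn_predict`, the 2-dimensional toy embeddings and the labels "alice" and "bob" are all hypothetical. It needs Python 3.8+ for `math.dist`.

```python
import math
from collections import Counter


def knn_predict_proba(train_embeddings, train_labels, query, k=5):
    """Return {label: fraction of the k nearest training faces with that label}."""
    if k > len(train_embeddings):
        raise ValueError("k cannot exceed the number of training embeddings")
    # Rank the training faces by Euclidean distance to the query embedding.
    ranked = sorted(
        zip(train_embeddings, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return {label: count / k for label, count in votes.items()}


def knn_predict(train_embeddings, train_labels, query, k=5):
    """Return the label with the largest share of the k nearest neighbours."""
    proba = knn_predict_proba(train_embeddings, train_labels, query, k)
    return max(proba, key=proba.get)


# Toy 2-D "embeddings" for two hypothetical people.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
labels = ["alice", "alice", "alice", "bob", "bob", "bob"]

query = (0.05, 0.05)
print(knn_predict(train, labels, query, k=3))
print(knn_predict_proba(train, labels, query, k=5))
```

With k neighbours the vote fractions can only be multiples of 1/k, which is why a K-NN "probability" looks much coarser than the calibrated probabilities an SVM returns.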

You might be interested in reading this paper thoroughly to check identification and verification accuracy on 1 million faces (MegaFace benchmark). For both of the tasks above, a face detector has to run first in order to detect the bounded faces in the image.

Setting up Environment

Let us set up a virtual environment on a Linux-based (Ubuntu) system for this demonstration. It can easily be done on Windows as well. A virtual environment is a tool that keeps the dependencies required by different projects in separate places by creating isolated Python environments for them, so it is better to create one for a demonstration like this.

You can find the requirements.txt here. This file contains a list of packages that is sufficient for the face recognition experiments in this blog-post.

>>> virtualenv Face_ID
>>> cd Face_ID
>>> source bin/activate
>>> pip install -r requirements.txt

In this blog-post, the FaceNet and MTCNN code is taken from David Sandberg’s FaceNet implementation found here. With that in mind, we need to do the following steps.

  1. Download this GitHub repository. Keep the folders of this repository in the Face_ID directory created for the virtual environment.
  2. Download the pre-trained model from here. Keep the pre-trained model directory in Face_ID/facenet/src/.

Face Verification

Let us take four images and see how we can compare them in terms of the Euclidean distance between the embeddings generated by the FaceNet model. We will make use of Face_ID/facenet/src/compare.py to get the distances.

>>> python facenet/src/compare.py facenet/src/20180402-114759/ \
    facenet/dataset/test-images/bradley.jpeg \
    facenet/dataset/test-images/hritik.jpeg \
    facenet/dataset/test-images/mark1.jpeg \
    facenet/dataset/test-images/mark.jpeg \
    --image_size 160 --margin 32 --gpu_memory_fraction 0

Images:
0: facenet/dataset/test-images/bradley.jpeg
1: facenet/dataset/test-images/hritik.jpeg
2: facenet/dataset/test-images/mark1.jpeg
3: facenet/dataset/test-images/mark.jpeg

Distance matrix

        0       1       2       3
0  0.0000  1.1274  1.4113  1.4722
1  1.1274  0.0000  1.4643  1.4246
2  1.4113  1.4643  0.0000  0.6942
3  1.4722  1.4246  0.6942  0.0000

This simple example depicts how the Euclidean distance between embeddings is low for two faces of the same person (green; mark1 vs. mark, 0.6942), high for faces of totally different people (red; 1.41 to 1.47) and in between (neither high nor low) for similar faces (amber; bradley vs. hritik, 1.1274).

With a threshold of 1.1, we can accurately verify the above samples. Readers are encouraged to go through the Python code of compare.py. Note that a face detector (the MTCNN model here) first has to run to extract the bounded face from each of the above images. The distances are then calculated between the embeddings (generated by the FaceNet CNN model) extracted from the bounded faces.
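As a quick check of that threshold, the sketch below applies 1.1 to the distance matrix printed above and lists the pairs that would be verified as the same person. The helper name `matching_pairs` is made up for this example; the numbers are copied from the compare.py output.

```python
# Image names and pairwise distances copied from the compare.py output above.
names = ["bradley", "hritik", "mark1", "mark"]
distances = [
    [0.0000, 1.1274, 1.4113, 1.4722],
    [1.1274, 0.0000, 1.4643, 1.4246],
    [1.4113, 1.4643, 0.0000, 0.6942],
    [1.4722, 1.4246, 0.6942, 0.0000],
]
THRESHOLD = 1.1


def matching_pairs(names, distances, threshold):
    """Return every unordered pair of images whose distance is below the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if distances[i][j] < threshold:
                pairs.append((names[i], names[j]))
    return pairs


print(matching_pairs(names, distances, THRESHOLD))  # [('mark1', 'mark')]
```

Only the two images of Mark fall below 1.1; the bradley/hritik pair at 1.1274 is just above it, which shows how sensitive the decision is to the chosen threshold.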

Face Identification

Let us train a face recognition model on our own data-set. We will train a classifier (SVM) on the faces of 6 people and then run face recognition on images or videos. We will perform the following steps for the face identification experiment.

  1. Dataset Preparation
    Collect at least 10 images per person. Keep them in Face_ID/facenet/dataset/raw. As of now, you may see raw image folders of 6 people in the mentioned path. Some of the sample images are shown below.
  2. Face Detection
    Run the face detection and alignment algorithm, i.e. the MTCNN-based model, to extract only the bounded faces from all images and prepare an aligned directory. It will be saved in the Face_ID/facenet/dataset/aligned path.

    >>> python facenet/src/align/align_dataset_mtcnn.py \
        facenet/dataset/raw facenet/dataset/aligned \
        --image_size 160 --margin 32
    
    # Read about parameters in code align_dataset_mtcnn.py

    Output bounded faces from MTCNN are shown below:

  3. Training Faces
    First, it generates a 512-dimensional embedding vector for each of the 10 faces of every individual. Then, it trains a multi-class support vector machine (SVM) classifier on the generated vectors.

    >>> python facenet/src/classifier.py TRAIN \
        facenet/dataset/aligned facenet/src/20180402-114759/ \
        facenet/src/20180402-114759/my_classifier.pkl \
        --batch_size 1000 --min_nrof_images_per_class 10 \
        --nrof_train_images_per_class 10 --use_split_dataset
    
    # Read about parameters in code classifier.py
  4. Face Recognition
    We are ready to run face recognition on test images. The test images can be found in the path Face_ID/facenet/dataset/test-images/

    >>> python facenet/src/face_recognition_image.py facenet/dataset/test-images/1.jpg
    >>> python facenet/src/face_recognition_image.py facenet/dataset/test-images/2.jpg
    >>> python facenet/src/face_recognition_image.py facenet/dataset/test-images/test1.jpg
    >>> python facenet/src/face_recognition_image.py facenet/dataset/test-images/test2.jpg
    >>> python facenet/src/face_recognition_image.py facenet/dataset/test-images/test3.jpg
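Going back to step 2, it helps to see what a margin of 32 could mean for the cropped face. The sketch below assumes the margin is split evenly between the two sides of the detected box and clipped to the image borders. That is my reading of the --margin flag, not something stated in this post, so check align_dataset_mtcnn.py before relying on it. The function name `expand_box` and the sample coordinates are hypothetical.

```python
def expand_box(box, margin, image_width, image_height):
    """Grow a (x1, y1, x2, y2) box by margin/2 on every side, clipped to the image.

    Assumption: the margin is shared equally between opposite sides.
    """
    x1, y1, x2, y2 = box
    half = margin // 2
    return (
        max(x1 - half, 0),
        max(y1 - half, 0),
        min(x2 + half, image_width),
        min(y2 + half, image_height),
    )


# A face well inside a 200x180 image: every side grows by 16 pixels.
print(expand_box((50, 40, 150, 160), margin=32, image_width=200, image_height=180))
# A face close to the borders: the box is clipped to the image size.
print(expand_box((5, 10, 190, 170), margin=32, image_width=200, image_height=180))
```

The expanded crop would then be resized to the requested --image_size (160 here) before being passed to FaceNet.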
    

Beginners are encouraged to play with the Python code in order to get a much better understanding of the algorithms. Having executed the commands, you will find the following results (1.jpg and 2.jpg).

Here, we can see that all the faces are detected with bounding boxes, but none of them is recognized from the database of 6 people we trained on. A threshold of 0.43 is set on the predicted probabilities (from the SVM's predict_proba() method) for each bounded face. Likewise, we can see that the bounded faces have been recognized in the examples below (test1.jpg, test2.jpg, test3.jpg).
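The thresholding step can be sketched as follows. This is an illustration rather than the code in face_recognition_image.py: the function name `label_face`, the "Unknown" label, the class names other than 'Bug' and the choice of a strict "greater than" comparison are assumptions.

```python
def label_face(probabilities, class_names, threshold=0.43):
    """Pick the most probable class, or 'Unknown' if it does not clear the threshold.

    probabilities: one value per class, e.g. a single row of predict_proba() output.
    """
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    best_prob = probabilities[best_index]
    if best_prob > threshold:
        return class_names[best_index], best_prob
    return "Unknown", best_prob


# Hypothetical class names and probabilities for two detected faces.
class_names = ["Bug", "Amy", "Raj"]
print(label_face([0.71, 0.18, 0.11], class_names))  # confident enough to name
print(label_face([0.38, 0.33, 0.29], class_names))  # falls back to 'Unknown'
```

A higher threshold gives fewer wrong names but more faces left unlabelled, so the value is worth tuning on your own images.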

test1.jpg

With a threshold of 0.43, one of the faces in test1.jpg could not be identified as ‘Bug’ (the person's nickname :-p). There are two possible reasons. First, the face has not been detected well (the right eye is not covered by the bounding box). Second, it is a low-resolution image; recognition accuracy will be better with a higher-resolution image. Likewise, I ran face recognition on a short recorded video of my friends.

>>> python facenet/src/face_recognition_video.py \
    facenet/dataset/friends_6.mp4

Concluding Remarks

I hope it was easy to go through the tutorial, as I have tried to keep it simple and reproducible. Beginners who are interested in image analytics/computer vision can start with this application.

You might be wondering about the mathematics behind models like MTCNN and FaceNet. You are encouraged to study these models from the references section. Apart from that, there are many further experiments that can be done.

  1. Experiment on a larger scale with the pre-trained model. The example in this blog-post has 6 people in the database; one can set up an experiment with 100 people in the data-set. If you are interested in doing it, do reach out to me.
  2. Similarly, experiments can be done with more images per person, as 10 images would not be sufficient for good accuracy when there are a large number of people in the database. Also, SVM may not perform well with lots of classes; other multi-class classifiers like K-NN can be tried out.
  3. A fair study of accuracy against the resolution of test images can be done. A high-resolution image performs better than a low-resolution one, since the number of pixels captured in the bounded face affects recognition.
  4. For face recognition on video, people implement tracking for each bounded face in order to smooth the results and filter out the wrong identifications that appear in a few abrupt frames. Face tracking has to be implemented for this.
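For the last point, one simple way to smooth results is a majority vote over the last few frames of a single tracked face. The sketch below assumes tracking has already told us which per-frame labels belong to the same face; the function name `smooth_labels`, the window size of 5 and the name "Amy" are hypothetical.

```python
from collections import Counter, deque


def smooth_labels(frame_labels, window=5):
    """Replace each frame's label with the most common label in a sliding window.

    frame_labels: per-frame predictions for ONE tracked face, in frame order.
    """
    recent = deque(maxlen=window)  # holds at most the last `window` labels
    smoothed = []
    for label in frame_labels:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# Two abrupt wrong frames ('Unknown' and 'Amy') inside a run of 'Bug'.
raw = ["Bug", "Bug", "Unknown", "Bug", "Bug", "Amy", "Bug"]
print(smooth_labels(raw))
```

A larger window filters more noise but reacts more slowly when the person in front of the camera actually changes.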

I will probably discuss the architecture of these CNN models and their Python implementations in another blog-post some other time. You can get the full Python and TensorFlow code for this experiment from the GitHub link here.

References

All the concepts, explorations and demonstration code have been drawn from these sources.

[1] https://github.com/davidsandberg/facenet
[2] https://github.com/AISangam/Facenet-Real-time-face-recognition-using-deep-learning-Tensorflow
[3] https://arxiv.org/pdf/1503.03832.pdf
[4] https://arxiv.org/pdf/1604.02878.pdf

If you liked the post, follow this blog to get updates about upcoming articles. Also, share it so that it can reach out to the readers who can actually gain from this. Please feel free to discuss anything regarding the post. I would love to hear feedback from you.

Happy deep learning 🙂

29 thoughts on “Yet Another Face Recognition Demonstration on Images/Videos : Using Python and Tensorflow”

    • Hey hi,
      It’s simple to follow this tutorial and implement a system on 100 people.

      This blog-post implements the same with 6 people. You follow the same procedure with 100 people.
      You may like to change the classifier from SVM to KNN classifier in the file classifier.py (check Part 3. Training Faces )

      Thanks.

      • HI Abhijeet

        how can we change the classifier from SVM to KNN classifier in the file classifier.py .please tell me.

  1. Thanks for a very lucidly written tutorial.

    I’m having problems running it though. The first command, compare.py, runs fine. But when I run align_dataset_mtcnn.py, I get a strange error “Import Error: no module named facenet”, although the import worked fine for compare.py.

    Can you think of any situation why this should happen? Thanks in advance!Si

    • Also, I’m running Ubuntu 18.04.

      For testing purposes, I created a file test.py with a single line: “import facenet” and ran the file from two places as follows:

      $ python facenet/src/test.py (no error)
      $ python facenet/src/align/test.py (“ImportError: No module named facenet”)

      Why does this happen? Thanks for any inputs! s1b

      • I got through the import hiccup by hard-coding the import path:


        import imp

        imp.load_source('facenet', '/home/s1b/Face_ID/facenet/src/facenet.py')
        import facenet

        imp.load_source('detect_face', '/home/s1b/Face_ID/facenet/src/align/detect_face.py')
        import detect_face

        I also changed “align.detect_face” to “detect_face” in two places.

        However, I see the following message (after a list of images in four directories):

        Total number of images: 83
        Number of successfully aligned images: 0

        Finally, I have also the following message indicating a missing file in the pretrained model (download as a .tgz as per instructions on the page):

        No such file or directory: ‘facenet/src/20180402-114759/my_classifier.pkl’

        Thanks for any inputs.

        • “””No such file or directory: ‘facenet/src/20180402-114759/my_classifier.pkl”””
          You will have to train the KNN model following steps 1, 2,3 4

          Regarding import issues,
          It is true. We should insert the imported python file in code like this.
          sys.path.insert(1, os.path.join(sys.path[0], '..'))

          • As I have checked the above model with 6 no of category then it is giving 90% accuracy but while i increased the number of categories from 6 to 18 then it is giving the accuracy 50%.
            But you suggested in your article that SVM may not perform well with lots of classes. Multi-class classifiers like K-NN can be tried out.

            It will be a great help if you provide the modified K-NN code base.

  2. Hello, it’s a great blog and easy to get started. I have done all the steps correctly. However, I’m getting an issue while testing with images. ie. the images are getting detected and identified. But it’s getting closed automatically. This is the error I’m getting at the console;

    ——————-
    Error: QObject::moveToThread: Current thread (0x14ac9360) is not the object’s thread (0x13a1fa30).
    Cannot move to target thread (0x14ac9360)
    ——————-
    Do you have any insight on this? Thanks in advance!!

  4. Its really great work , i am also building same for small size input images, like 40X40 as input training image, but accuracy is very poor. Can you suggest something ?

    • If the size of face (training input image) is only 40*40, it will be really difficult to get good accuracy. From what I have read, for a good identification accuracy, you should have at least 96*96 pixels face.
      It is difficult for neural network to learn identity from less number of pixels.

  5. I can’t get outputs, I get this error>

    Start Recognition!
    Traceback (most recent call last):
    File "facenet/src/face_recognition_image.py", line 71, in <module>
    if frame.ndim == 2:
    AttributeError: 'NoneType' object has no attribute 'ndim'

    Command is> python facenet/src/face_recognition_image.py facenet/dataset/test-images/cicenia.jpeg

  6. I put the pre-trained model into Face_ID/facenet/src/ but when I run the program, it showed “There should not be more than one meta file in the model directory”. How come?

  7. sir how to add speech recognition to emotion recognition in while true loop i just want to get reply in speech based on my emotion my taking the input phrase it has to respond for that is there any way to do that

  8. Hello,

    I have tried KNN for recognizing multiple classes. Can you provide intuition of values present in the result matrix? The values are between 0 and 1.

    Thanks and in advance.

  9. Hello Abhijeet,

    I am using FaceNet for classifying approximately 1000 people. So, it is a good idea to use KNN instead of SVM Classifier. I use K=5 and trained a classifier. In SVM where we get the probability of each class for the test image. Can you explain the intuition behind the values for test image while using KNN? Most of the values are zero and only a few are 0.2, 0.4, 0.6, 0.8 and 1.

    I changed the code in classifier.py to use KNN. I have not done any change in face_recognition_image.py.

    In addition to this, can you give any suggestion to get a distance matrix for test image?

    • Hi,
      kindly find the replies to your queries.
      “””In SVM where we get the probability of each class for the test image”””

      The predict_proba() method gives you the probabilities. SVM is a discrete classifier, but scikit-learn applies a probability calibration technique called Platt scaling. Check https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC.predict_proba

      “””Can you explain the intuition behind the values for test image while using KNN? Most of the values are zero and only a few are 0.2, 0.4, 0.6, 0.8 and 1.”””

      hey, its simple, kindly read how KNN works, if you are taking k=5, it will give probabilities seeing the 5 neighbours. For example, If 3 neighbours are of a particular class, it gives probabilities 0.6. if you take k=10, prob values would be 0.1,0.2,0.3…1.0.

      “””In addition to this, can you give any suggestion to get a distance matrix for test image?”””
      I think you need to run “compare.py” file. Kindly follow the blog post.

      Do share the results on 1000 people for the readers.

      Thanks,
      Abhijeet

  10. Hello, is this code still valid? Im getting an unpickling error in both face_recognition_image.py and face_recognition_video.py scripts. Any clues?

    File "/Users/xxxxx/Desktop/tensorflow/Face_ID-master/facenet/src/face_recognition_image.py", line 52, in <module>
    (model, class_names) = pickle.load(infile)
    _pickle.UnpicklingError: invalid load key, '\x0a'.

    • Yes, Codes are valid.
      You may like to go through whole training process step by step. Also, check this classifier_filename = ‘facenet/src/20180402-114759/my_classifier.pkl’

      Either you have not trained my_classifier.pkl or path is incorrect. May be some linux vs Windows path issue.
      Hope you will figure it out.
      Thanks.

  11. I am getting the error that Face is very close when I test this on image of 2 persons. Though our faces are different.
