update README with mtcnn section and instructions about mtcnn usage

justadudewhohacks · justadudewhohacks · commit b7c3c3b1bbfd · 2018-07-14T16:50:19.000+02:00
diff --git a/README.md b/README.md
@@ -8,14 +8,16 @@ Check out my article **[face-api.js — JavaScript API for Face Recognition
 
 * **[Running the Examples](#running-the-examples)**
 * **[About the Package](#about-the-package)**
-  * **[Face Detection](#about-face-detection)**
+  * **[Face Detection - SSD Mobilenet v1](#about-face-detection-ssd)**
+  * **[Face Detection & 5 Point Face Landmarks - MTCNN](#about-face-detection-mtcnn)**
   * **[Face Recognition](#about-face-recognition)**
-  * **[Face Landmark Detection](#about-face-landmark-detection)**
+  * **[68 Point Face Landmark Detection](#about-face-landmark-detection)**
 * **[Usage](#usage)**
   * **[Loading the Models](#usage-load-models)**
-  * **[Face Detection](#usage-face-detection)**
-  * **[Face Recognition](#usage-face-recognition)**
-  * **[Face Landmark Detection](#usage-face-landmark-detection)**
+  * **[Face Detection - SSD Mobilenet v1](#usage-face-detection-ssd)**
+  * **[Face Detection & 5 Point Face Landmarks - MTCNN](#usage-face-detection-mtcnn)**
+  * **[Face Recognition - FaceNet](#usage-face-recognition)**
+  * **[68 Point Face Landmark Detection](#usage-face-landmark-detection)**
   * **[Full Face Detection and Recognition Pipeline](#usage-full-face-detection-and-recognition-pipeline)**
 
 ## Examples
@@ -38,8 +40,14 @@ Check out my article **[face-api.js — JavaScript API for Face Recognition
 
 ### Live Video Face Detection
 
+**SSD Mobilenet v1**
+
 ![preview_video-facedetection](https://user-images.githubusercontent.com/31125521/41238649-bbf10046-6d96-11e8-9041-1de46c6adccd.jpg)
 
+**MTCNN**
+
+![preview_mtcnn_video](https://user-images.githubusercontent.com/31125521/42725487-857adfd4-8784-11e8-8de2-4faae81e7ea2.jpg)
+
 ### Face Alignment
 
 ![preview_face_alignment](https://user-images.githubusercontent.com/31125521/41526994-1a690818-72e6-11e8-8f3c-d2cf31fe517b.jpg)
@@ -60,25 +68,33 @@ Browse to http://localhost:3000/.
 
 ## About the Package
 
-<a name="about-face-detection"></a>
+<a name="about-face-detection-ssd"></a>
 
-### Face Detection
+### Face Detection - SSD Mobilenet v1
 
-For face detection, this project implements a SSD (Single Shot Multibox Detector) based on MobileNetV1. The neural net will compute the locations of each face in an image and will return the bounding boxes together with it's probability for each face.
+For face detection, this project implements an SSD (Single Shot Multibox Detector) based on MobileNetV1. The neural net will compute the locations of each face in an image and will return the bounding boxes together with its probability for each face. This face detector aims for high accuracy in detecting face bounding boxes rather than low inference time.
 
 The face detection model has been trained on the [WIDERFACE dataset](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/) and the weights are provided by [yeephycho](https://github.com/yeephycho) in [this](https://github.com/yeephycho/tensorflow-face-detection) repo.
 
+<a name="about-face-detection-mtcnn"></a>
+
+### Face Detection & 5 Point Face Landmarks - MTCNN
+
+MTCNN (Multi-task Cascaded Convolutional Neural Networks) represents an alternative to SSD Mobilenet v1, which offers much more room for configuration and is able to achieve much lower processing times. MTCNN is a 3-stage cascaded CNN, which simultaneously returns 5 face landmark points along with the bounding boxes and scores for each face. By limiting the minimum size of faces expected in an image, MTCNN allows you to process frames from your webcam in realtime. Additionally, at 2MB, the weights file is only a third of the size of the quantized SSD Mobilenet v1 model (~6MB).
+
+MTCNN was presented in the paper [Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks](https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf) by Zhang et al., and the model weights are provided in the official [repo](https://github.com/kpzhang93/MTCNN_face_detection_alignment) of the MTCNN implementation.
+
 <a name="about-face-recognition"></a>
 
-### Face Recognition
+### Face Recognition - FaceNet
 
 For face recognition, a ResNet-34 like architecture is implemented to compute a face descriptor (a feature vector with 128 values) from any given face image, which is used to describe the characteristics of a person's face. The model is **not** limited to the set of faces used for training, meaning you can use it for face recognition of any person, for example yourself. You can determine the similarity of two arbitrary faces by comparing their face descriptors, for example by computing the euclidean distance or using any other classifier of your choice.
 
 The neural net is equivalent to the **FaceRecognizerNet** used in [face-recognition.js](https://github.com/justadudewhohacks/face-recognition.js) and the net used in the [dlib](https://github.com/davisking/dlib/blob/master/examples/dnn_face_recognition_ex.cpp) face recognition example. The weights have been trained by [davisking](https://github.com/davisking) and the model achieves a prediction accuracy of 99.38% on the LFW (Labeled Faces in the Wild) benchmark for face recognition.
 
 <a name="about-face-landmark-detection"></a>
 
-### Face Landmark Detection
+### 68 Point Face Landmark Detection
 
 This package implements a CNN to detect the 68 point face landmarks for a given face image.
 
@@ -113,6 +129,7 @@ await faceapi.loadFaceDetectionModel('/models')
 // accordingly for the other models:
 // await faceapi.loadFaceLandmarkModel('/models')
 // await faceapi.loadFaceRecognitionModel('/models')
+// await faceapi.loadMtcnnModel('/models')
 ```
 
 As an alternative, you can also create instances of the neural nets:
@@ -122,12 +139,14 @@ const net = new faceapi.FaceDetectionNet()
 // accordingly for the other models:
 // const net = new faceapi.FaceLandmarkNet()
 // const net = new faceapi.FaceRecognitionNet()
+// const net = new faceapi.Mtcnn()
 
 await net.load('/models/face_detection_model-weights_manifest.json')
 // await net.load('/models/face_landmark_68_model-weights_manifest.json')
 // await net.load('/models/face_recognition_model-weights_manifest.json')
+// await net.load('/models/mtcnn_model-weights_manifest.json')
 
-// or simply
+// or simply load all models
 await net.load('/models')
 ```
 
@@ -145,9 +164,9 @@ const weights = new Float32Array(res.data)
 net.load(weights)
 ```
 
-<a name="usage-face-detection"></a>
+<a name="usage-face-detection-ssd"></a>
 
-### Face Detection
+### Face Detection - SSD Mobilenet v1
 
 Detect faces and get the bounding boxes and scores:
 
@@ -178,6 +197,62 @@ You can also obtain the tensors of the unfiltered bounding boxes and scores for
 const { boxes, scores } = await net.forward('myImg')
 ```
 
+<a name="usage-face-detection-mtcnn"></a>
+
+### Face Detection & 5 Point Face Landmarks - MTCNN
+
+Detect faces and get the bounding boxes, scores and 5 point face landmarks:
+
+``` javascript
+// default parameters shown:
+const forwardParams = {
+  // number of scaled versions of the input image passed through the CNN
+  // of the first stage, lower numbers will result in lower inference time,
+  // but will also be less accurate
+  maxNumScales: 10,
+  // scale factor used to calculate the scale steps of the image
+  // pyramid used in stage 1
+  scaleFactor: 0.709,
+  // the score threshold values used to filter the bounding
+  // boxes of stage 1, 2 and 3
+  scoreThresholds: [0.6, 0.7, 0.7],
+  // minimum face size to expect, the higher the value, the faster processing will be,
+  // but smaller faces won't be detected
+  minFaceSize: 20
+}
+
+const results = await faceapi.mtcnn(document.getElementById('myImg'), forwardParams)
+```
+
+Alternatively, you can specify the scale steps manually:
+
+``` javascript
+const forwardParams = {
+  scaleSteps: [0.4, 0.2, 0.1, 0.05]
+}
+
+const results = await faceapi.mtcnn(document.getElementById('myImg'), forwardParams)
+```
+
+Finally, you can draw the returned bounding boxes and 5 point face landmarks onto a canvas:
+
+``` javascript
+const minConfidence = 0.9
+
+if (results) {
+  results.forEach(({ faceDetection, faceLandmarks }) => {
+
+    // ignore results with low confidence score
+    if (faceDetection.score < minConfidence) {
+      return
+    }
+
+    faceapi.drawDetection('overlay', faceDetection)
+    faceapi.drawLandmarks('overlay', faceLandmarks)
+  })
+}
+```
+
 <a name="usage-face-recognition"></a>
 
 ### Face Recognition
@@ -265,7 +340,7 @@ const fullFaceDescriptions = await faceapi.allFaces(input, minConfidence)
 const fullFaceDescription0 = fullFaceDescriptions[0]
 console.log(fullFaceDescription0.detection) // bounding box & score
 console.log(fullFaceDescription0.landmarks) // 68 point face landmarks
-console.log(fullFaceDescription0.descriptor) // face descriptors
+console.log(fullFaceDescription0.descriptor) // face descriptor
 
 ```
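
Note on the MTCNN parameters added in this diff: the README says `minFaceSize`, `scaleFactor` and `maxNumScales` control the image pyramid of stage 1, but does not show how they interact. The sketch below follows the pyramid construction commonly used in MTCNN implementations (a 12 px stage 1 window, starting scale `12 / minFaceSize`, repeated multiplication by `scaleFactor`). The function name `computeScaleSteps`, the constant `CELL_SIZE`, and the way `maxNumScales` truncates the list are assumptions for illustration and are not taken from face-api.js itself.

```javascript
// Hypothetical helper, not part of face-api.js: estimates the scale steps
// that an MTCNN-style image pyramid would use for a given input size.
const CELL_SIZE = 12 // assumed input window size of the stage 1 network

function computeScaleSteps(minFaceSize, scaleFactor, imgHeight, imgWidth, maxNumScales = 10) {
  // a face of minFaceSize pixels should map onto one 12 px window
  const initialScale = CELL_SIZE / minFaceSize
  const scaleSteps = []

  // size of the shorter image side after applying the current scale
  let minLayer = Math.min(imgHeight, imgWidth) * initialScale
  let exponent = 0

  // keep shrinking until the scaled image is smaller than one window
  // (assumption: maxNumScales simply caps the number of steps)
  while (minLayer >= CELL_SIZE && scaleSteps.length < maxNumScales) {
    scaleSteps.push(initialScale * Math.pow(scaleFactor, exponent))
    minLayer *= scaleFactor
    exponent += 1
  }

  return scaleSteps
}

// scale steps for a 640x480 frame with the default parameters from the README
const defaultSteps = computeScaleSteps(20, 0.709, 480, 640)
// a larger minFaceSize yields fewer steps, which is why it speeds up stage 1
const fastSteps = computeScaleSteps(200, 0.709, 480, 640)
console.log(defaultSteps.length, fastSteps.length)
```

Under these assumptions, raising `minFaceSize` shortens the list of scale steps, which matches the README's statement that larger values make processing faster at the cost of missing small faces. A list computed this way has the same shape as the manual `scaleSteps` array shown in the diff.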