Introduction To Face Processing With Computer Vision
• Practice
• Rapid Prototyping
• Scaling
Theory
Face Detection
Haar-Like Features
• Summarize image regions with simple light/dark intensity patterns
• Hand-designed feature extractors (kernels)
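A minimal sketch of one Haar-like feature, assuming a grayscale image stored as a list of rows. The function names (integral_image, rect_sum, haar_two_rect_horizontal) are illustrative and not taken from any library. The summed-area table is the usual trick that makes each rectangle sum cost four lookups.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half: a large magnitude suggests a vertical edge."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy 4x4 image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
edge_response = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
```

A detector would evaluate many such features at many positions and scales; this sketch shows only a single evaluation.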
Facial Recognition
• Facial recognition actually corresponds to a group of
different tasks.
• Verification vs. Identification vs. Grouping vs. …
• Closed-Set vs. Open-Set
Closed-Set Recognition
• Every identity appears in training set
• Example: recognizing celebrities
• Effectively a classification problem
• Model aims to learn separable features
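A minimal sketch of closed-set identification as plain classification, assuming a model has already produced one score (logit) per known identity. The labels and scores below are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["identity_0", "identity_1", "identity_2"]  # hypothetical training identities
logits = [0.2, 3.1, -1.0]  # hypothetical model output for one face
predicted, confidence = classify(logits, labels)
```

Note that this kind of classifier always answers with one of the training identities, even when shown a person it has never seen.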
Closed-Set Identification
[Figure: example face images grouped by identity label: Label 0, Label 1, …]
Images: Wikimedia
Closed-Set Verification
Images: Wikimedia
Open-Set Recognition
• Not every identity appears in training set
• Example: Facebook Photos
• Effectively a metric learning problem
• Model aims to learn large-margin features (embeddings)
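One common way to train large-margin embeddings is a triplet loss. The slides do not name a specific loss, so treat this as an illustrative assumption; the margin value and the toy vectors are hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive by `margin`."""
    return max(0.0, squared_l2(anchor, positive) - squared_l2(anchor, negative) + margin)


anchor = [0.0, 1.0]    # embedding of a face
positive = [0.1, 0.9]  # same person, different photo
negative = [1.0, 0.0]  # different person

satisfied = triplet_loss(anchor, positive, negative)  # margin already met
violated = triplet_loss(anchor, negative, positive)   # roles swapped: large loss
```

The loss only pushes on triplets that violate the margin, which is what makes the learned distances usable for identities never seen in training.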
Embeddings
• Map each sample to a vector (coordinate system)
• Used for words, graphs, faces, etc.
• Embeddings preserve similarity
• Similar samples close to each other
• Dissimilar samples far from each other
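A small sketch of what "preserving similarity" means in practice: ranking samples by distance to a query. The 2-D vectors and names are made up for illustration; real face embeddings have many more dimensions.

```python
import math

# Hypothetical embeddings: two photos of one person, one photo of another.
embeddings = {
    "alice_photo_1": [0.9, 0.1],
    "alice_photo_2": [0.8, 0.2],
    "bob_photo_1": [0.1, 0.9],
}


def rank_by_distance(query_name, table):
    """Return the other sample names, nearest first."""
    query = table[query_name]
    others = [
        (math.dist(query, vec), name)
        for name, vec in table.items()
        if name != query_name
    ]
    return [name for _, name in sorted(others)]


ranking = rank_by_distance("alice_photo_1", embeddings)
```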
Images: Wikimedia
Embeddings
• “Similar” depends on the training data
• Same person, physical characteristic, etc.
• Embeddings represent latent information
• High-dimensional embeddings trained on large datasets
learn to represent latent information about the person (e.g.
physical characteristics)
Open-Set Identification
Images: Wikimedia
Open-Set Verification
Images: Wikimedia
Metric Learning
99.85%+ accuracy
Cross-Age
Security
• How do we deal with adversarial users?
• Real face goes undetected or misclassified
• Fake face gets recognized
• Private data is extracted from model
•…
Security
Ref: Singh et al. (2010); Ross & Jain (2004); Ross & Govindarajan (2005)
Ref: Apple
Privacy
• How do we deal with…
• Models that can predict gender, race, …?
• Models that leak the data?
• Predictions without sharing the raw data?
•…
Alignment & Pose Estimation
Classification
Neutral
Happy
Happy
3D Reconstruction
Dozens of Tools
[Figure: tools placed along an accuracy vs. simplicity trade-off. Labels: RetinaFace, InsightFace, MTCNN, FaceNet, OpenCV, face_recognition, …]
APIs
• There are dozens of APIs providing low-cost face
processing at scale
• Most services charge less than $1 per 1000 images
• Depending on the use case, might be cheaper than provisioning GPUs
and deploying your own models (esp. if considering developer time)
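A back-of-the-envelope sketch of the API vs. self-hosting comparison. Every number here is a hypothetical placeholder: the $1 per 1000 images is the upper bound mentioned above, and the monthly self-hosting cost is invented for illustration. Substitute your own figures.

```python
def api_cost(images_per_month, price_per_1000=1.00):
    """Monthly API bill in dollars at a flat per-1000-images price."""
    return images_per_month / 1000 * price_per_1000


def break_even_images(self_hosted_monthly_cost, price_per_1000=1.00):
    """Monthly volume above which self-hosting becomes cheaper than the API."""
    return self_hosted_monthly_cost / price_per_1000 * 1000


monthly_api_bill = api_cost(200_000)   # hypothetical volume: 200k images/month
threshold = break_even_images(500.0)   # hypothetical: $500/month for GPUs + upkeep
```

With these placeholder numbers the API stays cheaper until roughly half a million images per month, before counting developer time.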
APIs – Example: Azure
• Detection
• Classification
• Gender, age, emotion, hair, smile, eyes, glasses, makeup, …
• Landmarks
• Pose Estimation
• Recognition
• Verification, identification, grouping, similarity search, …
Embeddings
• Face embeddings are typically used for open-set
recognition systems
• They can be leveraged to quickly train models for
downstream tasks (e.g. classification)
• Tools
• face_recognition (GitHub): extremely fast, reliable for frontal faces
• FaceNet: based on deep learning, strong across the board
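A minimal sketch of training a downstream classifier on top of precomputed embeddings, here with a nearest-centroid rule. The 2-D vectors and the "smiling"/"neutral" labels are toy assumptions; in practice the inputs would be real face embeddings (128-dimensional in the case of face_recognition) and any standard classifier could be swapped in.

```python
import math
from collections import defaultdict


def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to `vec`."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Toy training data standing in for precomputed face embeddings.
train_x = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.8], [0.2, 0.9]]
train_y = ["smiling", "smiling", "neutral", "neutral"]

centroids = fit_centroids(train_x, train_y)
prediction = predict(centroids, [0.7, 0.2])
```

Because the embedding model is reused as-is, fitting a classifier like this needs only a handful of labeled examples and no GPU.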
Example – Facebook Photos
• Task: open-set face identification
• Strategy:
1. Detect faces and compute embeddings for known photos
of users; store for future use.
2. Whenever a photo is uploaded, do the same and compare
against known set.
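The two-step strategy above can be sketched as a small gallery that stores known embeddings and matches new ones. FaceGallery, its method names, and the 0.6 threshold are assumptions made for illustration; in a real system the embeddings would come from a face model rather than the toy vectors used here.

```python
import math


class FaceGallery:
    """Stores (user_id, embedding) pairs and matches new embeddings against them."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed cut-off; tune for your embedding model
        self.entries = []

    def enroll(self, user_id, embedding):
        """Step 1: remember an embedding computed from a known photo."""
        self.entries.append((user_id, list(embedding)))

    def identify(self, embedding):
        """Step 2: return the closest enrolled user, or None if nobody is close enough."""
        if not self.entries:
            return None
        distance, user_id = min(
            (math.dist(stored, embedding), uid) for uid, stored in self.entries
        )
        return user_id if distance <= self.threshold else None


gallery = FaceGallery()
gallery.enroll("user_a", [0.1, 0.2, 0.3])
gallery.enroll("user_b", [0.9, 0.8, 0.7])

known = gallery.identify([0.12, 0.22, 0.28])  # near user_a
stranger = gallery.identify([5.0, 5.0, 5.0])  # far from everyone
```

Returning None for distant faces is what makes this open-set: unknown people are rejected instead of being forced onto the nearest known identity. A linear scan is fine for a sketch, but a large gallery would need an approximate nearest-neighbor index.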
Example – Detection
import face_recognition as fr
image = fr.load_image_file("file.jpg")  # loads the file as an RGB NumPy array
# One (top, right, bottom, left) box per detected face
face_locations = fr.face_locations(image)
Ref: github.com/ageitgey/face_recognition
Example – Embedding
image = fr.load_image_file("file.jpg")
# One 128-dimensional vector per detected face; [0] assumes at least one face was found
face_embedding = fr.face_encodings(image)[0]
Ref: github.com/ageitgey/face_recognition
Example – L2 Distance
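A minimal, library-free sketch of the L2 (Euclidean) distance between two embeddings and a threshold-based verification check. The helper names are illustrative. The 0.6 tolerance mirrors the default of face_recognition's compare_faces, and face_recognition also offers face_distance for the distance itself; the right threshold for another embedding model is an assumption you would need to validate.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_person(a, b, tolerance=0.6):
    """Verification: accept the pair if the embeddings are close enough."""
    return l2_distance(a, b) <= tolerance


d = l2_distance([0.0, 3.0], [4.0, 0.0])  # classic 3-4-5 triangle
```

A lower tolerance makes verification stricter (fewer false matches, more false rejections).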
Example – Face Landmarks
face_landmarks = fr.face_landmarks(image)[0]
print(face_landmarks.keys())
# left_eyebrow, right_eyebrow, bottom_lip, top_lip, …
Ref: github.com/ageitgey/face_recognition
Example – Snapchat Filters
• Task: face manipulation
• Strategy:
1. Detect face and localize landmarks in image
2. Add objects, reshape image, etc. based on landmarks
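A sketch of step 2 for one kind of filter: working out where to place a pair of glasses from the eye landmarks. It assumes landmarks shaped like the face_recognition output (a dict mapping feature names to lists of (x, y) points). The function names and the width_factor value are hypothetical.

```python
import math


def centroid(points):
    """Mean (x, y) of a list of landmark points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def glasses_placement(landmarks, width_factor=2.0):
    """Return center, width and rotation for an overlay spanning both eyes."""
    lx, ly = centroid(landmarks["left_eye"])
    rx, ry = centroid(landmarks["right_eye"])
    eye_distance = math.hypot(rx - lx, ry - ly)
    return {
        "center": ((lx + rx) / 2, (ly + ry) / 2),
        "width": eye_distance * width_factor,  # overlay is wider than the eye gap
        "angle_degrees": math.degrees(math.atan2(ry - ly, rx - lx)),  # head tilt
    }


# Hypothetical landmarks for a level, front-facing face.
landmarks = {
    "left_eye": [(100, 100), (110, 96), (120, 100), (110, 104)],
    "right_eye": [(160, 100), (170, 96), (180, 100), (170, 104)],
}
placement = glasses_placement(landmarks)
```

The resulting center, width and angle can then be used to scale, rotate and paste an overlay image onto the photo.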
Example – Snapchat Filters
from PIL import Image, ImageDraw
# … (image and face_landmarks as in the previous examples)
pil_image = Image.fromarray(image)
d = ImageDraw.Draw(pil_image, 'RGBA')  # RGBA mode so fills can be semi-transparent
lip_fill = (150, 0, 0, 128)  # shade of red, 50% alpha
d.polygon(face_landmarks['top_lip'], fill=lip_fill)
d.polygon(face_landmarks['bottom_lip'], fill=lip_fill)
Scaling
Bias
• People & Demographics
• Is your training set… Coworkers? Single location?
• Environment
• Does it cover… Day and night? Seasons? Lighting
conditions? Backgrounds?
• Sensors
• Did you consider… Diverse hardware? Calibration?
Viewpoint (angle)? Resolution? Occlusion?
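One simple way to surface these biases is to break evaluation accuracy down by condition instead of reporting a single number. This is a sketch under assumed inputs: the record format (group, expected, predicted) and the example groups are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy separately for each group (e.g. lighting, camera, location)."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, expected, predicted in records:
        total[group] += 1
        correct[group] += int(expected == predicted)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical evaluation records: (condition, true identity, predicted identity).
records = [
    ("daylight", "alice", "alice"),
    ("daylight", "bob", "bob"),
    ("night", "alice", "bob"),
    ("night", "bob", "bob"),
]
report = accuracy_by_group(records)
```

A large gap between groups suggests the training set under-represents one of them.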
Optimizations
• It is often easier to simplify the real-world task than to
drastically improve ML models.
Optimizations
[Figure: chart with horizontal axis "Time (weeks)"]
Risks
• What happens when your model makes a mistake?
• How can you deal with adversarial users?
• What is your threat model?
Other Considerations
• How do you handle…
• Model getting stale over time?
• Growing search space?
• Large amounts of real-time data?
• Detecting or tracking people vs. faces?
• Speed vs. cost vs. performance trade-offs?
Thank you.
gabriel@scalarresearch.com