CenterTrack-Tracking Objects As Points


Tracking Objects as Points

Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl


UT Austin & Intel Labs
Early trackers

https://www.mathworks.com/matlabcentral/fileexchange/48745-lucas-kanade-tutorial-example-2
Current frameworks: Tracking-after-detection
Frame t-1

Frame t

Tang et al. 2017: Re-identification features, pose features


Xu et al. 2019: Spatial-temporal trajectories
Simultaneous detection and tracking
Frame t-1

Frame t

Bergmann et al. 2019: Tracking without bells and whistles


[Figure: CenterTrack overview. Inputs: Frame t, Frame t-1, and Tracks t-1. A deep network outputs Detections t and Offsets t → t-1.]
Advantages
• Simplified tracking-conditioned detection.
Conditioned detection
• Ours: implicit prior heatmap.
• Tracktor [Bergmann et al. 2019]: explicit region proposal.
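To make the "implicit prior heatmap" idea concrete, here is a minimal sketch of how track centers from frame t-1 could be rendered into a single-channel heatmap that is fed to the network alongside the two frames. The function name, the fixed `sigma`, and the max-merge of overlapping peaks are assumptions for illustration, not details taken from the slides.

```python
import math

def render_prior_heatmap(centers, height, width, sigma=2.0):
    """Render one heatmap with a Gaussian peak at each prior track center.

    centers: list of (x, y) track centers from frame t-1, in heatmap pixels.
    sigma: hypothetical fixed peak width; a real system might scale it
    with object size.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(height):
            for x in range(width):
                sq_dist = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-sq_dist / (2 * sigma ** 2))
                # Where peaks overlap, keep the element-wise maximum.
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap

# Two hypothetical tracks from the previous frame.
prior = render_prior_heatmap([(3, 2), (10, 5)], height=8, width=16)
```

Because the prior is just an extra input channel, the network is free to use or ignore it, which is what makes it implicit in contrast to an explicit region proposal.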


• Simplified matching.
Point-based matching
• Ours:
  • Greedy matching by point distance.
• Prior works:
  • Hungarian algorithm.
  • Separate motion model.
  • Additional association features.
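Greedy matching by point distance can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `greedy_match`, the dictionary fields, the `max_dist` threshold, the confidence-ordered processing, and the sign convention (current center plus offset estimates the center in frame t-1) are all hypothetical choices, not the authors' implementation.

```python
import math

def greedy_match(detections, prev_tracks, max_dist, next_id):
    """Greedily associate detections with previous tracks by point distance.

    detections: list of dicts with 'center' (x, y), 'offset' (dx, dy), 'score'.
    prev_tracks: list of dicts with 'center' (x, y) and 'id'.
    Assumed convention: center + offset estimates the center in frame t-1.
    Returns (tracks for frame t, next unused track id).
    """
    used = set()
    tracks = []
    # Most confident detections pick their match first.
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        px = det["center"][0] + det["offset"][0]
        py = det["center"][1] + det["offset"][1]
        best, best_dist = None, max_dist
        for i, trk in enumerate(prev_tracks):
            if i in used:
                continue
            dist = math.hypot(px - trk["center"][0], py - trk["center"][1])
            if dist < best_dist:
                best, best_dist = i, dist
        if best is None:
            # No unmatched track within max_dist: start a new track.
            track_id = next_id
            next_id += 1
        else:
            used.add(best)
            track_id = prev_tracks[best]["id"]
        tracks.append({"center": det["center"], "id": track_id})
    return tracks, next_id

prev = [{"center": (10.0, 10.0), "id": 1}, {"center": (50.0, 20.0), "id": 2}]
dets = [
    {"center": (13.0, 10.0), "offset": (-3.0, 0.0), "score": 0.9},
    {"center": (52.0, 21.0), "offset": (-1.5, -1.0), "score": 0.8},
    {"center": (90.0, 40.0), "offset": (0.0, 0.0), "score": 0.7},
]
tracks, next_id = greedy_match(dets, prev, max_dist=5.0, next_id=3)
```

In this example the first two detections inherit ids 1 and 2, and the third, which has no previous track within `max_dist`, starts a new track. No separate motion model or appearance feature is involved; the predicted offset does that work.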


• Simplified training on videos.


Frame t-1
Frame t
Results
Results - KITTI
Extend to monocular 3D tracking
Results - monocular 3D tracking on nuScenes
Ablation studies
[Figure: bar charts on MOT17 (30 FPS), KITTI (10 FPS), and nuScenes (2 FPS) comparing four variants: detection only, w/o offset, w/o heatmap, and Ours. Successive builds of the slide highlight the without vs. with heatmap and the without vs. with offset comparisons.]
Ablation studies - motion models
Trained on image data only
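One way a tracker of this kind could be trained on image data only is to simulate a previous frame from a static image with a random scale and shift, so that the ground-truth offsets are known by construction. The sketch below only transforms the annotated centers; the same transform would also be applied to the pixels. The function name, parameters, and ranges are hypothetical and chosen for illustration.

```python
import random

def simulate_previous_frame(centers, width, height,
                            max_shift=0.05, max_scale=0.05, rng=random):
    """Draw a random scale-and-shift and apply it to annotated centers.

    centers: list of (x, y) object centers in the static image.
    max_shift, max_scale: hypothetical ranges, as fractions of image size.
    Returns (simulated previous centers, offsets from current to previous).
    """
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    shift_x = rng.uniform(-max_shift, max_shift) * width
    shift_y = rng.uniform(-max_shift, max_shift) * height
    cx, cy = width / 2.0, height / 2.0
    prev_centers, offsets = [], []
    for x, y in centers:
        # Scale about the image center, then translate.
        px = (x - cx) * scale + cx + shift_x
        py = (y - cy) * scale + cy + shift_y
        prev_centers.append((px, py))
        offsets.append((px - x, py - y))
    return prev_centers, offsets

rng = random.Random(0)  # seeded only to make the example repeatable
current = [(100.0, 80.0), (300.0, 200.0)]
prev_centers, offsets = simulate_previous_frame(current, 640, 480, rng=rng)
```

Each returned offset points from the object's center in the static image to its center in the simulated previous frame, which is the quantity the offset head is supervised with under this assumption.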
Code is available!

https://github.com/xingyizhou/CenterTrack
