Abstract
Detecting actions of the same category as a short query video in a target video is an important research topic. We propose a fast action detection method motivated by the Hough transform. First, we extract HOG features at the corner points of the query video; these corner points are referred to as interest points. Video clips are then formed by sliding a window over the query video. For the T frames of each clip, the interest points in all frames are matched against the interest points in the first frame in the displacement Hough space. We count the matched pairs that fall into each cell of the Hough space to form a 2D displacement histogram, so the query video is represented by a sequence of 2D displacement histograms. Next, we divide the portions of the target video that contain motion into video cubes, which are likewise represented by displacement histogram sequences. Matrix cosine similarity is used to compute the similarity between the query video and each video cube; this process is referred to as action matching. Finally, using the action matching results, we precisely localize the action from the locations of the matched interest points. Our key contribution is a simple and fast algorithm that represents actions as displacement histogram sequences. Experiments on challenging datasets with both simple and realistic backgrounds confirm the effectiveness and efficiency of our method.
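The two core operations named in the abstract, building a 2D displacement histogram from matched interest points and comparing histograms with matrix cosine similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bin count `n_bins`, the displacement range `max_disp`, and the assumption that matched point pairs are already available (the HOG-based matching step is omitted) are all hypothetical choices made for illustration.

```python
import numpy as np


def displacement_histogram(ref_points, matched_points, max_disp=16.0, n_bins=8):
    """Count matched pairs per cell of a 2D displacement (dx, dy) grid.

    ref_points:     (N, 2) interest points in the first frame of a clip.
    matched_points: (N, 2) their matches in later frames of the same clip.
    max_disp, n_bins are illustrative assumptions, not values from the paper.
    """
    ref = np.asarray(ref_points, dtype=float)
    cur = np.asarray(matched_points, dtype=float)
    disp = cur - ref  # one displacement vector per matched pair
    edges = np.linspace(-max_disp, max_disp, n_bins + 1)
    # Pairs whose displacement falls outside [-max_disp, max_disp] are ignored.
    hist, _, _ = np.histogram2d(disp[:, 0], disp[:, 1], bins=(edges, edges))
    return hist


def matrix_cosine_similarity(a, b, eps=1e-12):
    """Frobenius inner product of a and b divided by their Frobenius norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:  # an empty histogram carries no motion information
        return 0.0
    return float(np.sum(a * b) / denom)


# Two interest points that both move 2 pixels to the right.
query_hist = displacement_histogram([[0, 0], [10, 10]], [[2, 0], [12, 10]])
# Two interest points that both move 6 pixels to the left.
other_hist = displacement_histogram([[0, 0], [10, 10]], [[-6, 0], [4, 10]])

same_score = matrix_cosine_similarity(query_hist, query_hist)
diff_score = matrix_cosine_similarity(query_hist, other_hist)
```

Because both functions accept arrays of any shape, a stack of per-clip histograms can be passed to `matrix_cosine_similarity` directly to compare whole histogram sequences under the same assumptions.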
Acknowledgements
This work is supported in part by the 973 National Basic Research Program of China (2010CB732501), the Foundation of Sichuan Excellent Young Talents (09ZQ026-035), and the Fundamental Research Funds for the Central Universities.
Cite this article
Pei, L., Ye, M., Xu, P. et al. One example based action detection in hough space. Multimed Tools Appl 72, 1751–1772 (2014). https://doi.org/10.1007/s11042-013-1478-9
DOI: https://doi.org/10.1007/s11042-013-1478-9