A new iterative synthetic data generation method for CNN based stroke gesture recognition

Li, Jiajun; Tao, Jianguo; Ding, Liang; Gao, Haibo; Deng, Zongquan; Luo, Yang; Li, Zhandong

doi:10.1007/s11042-017-5285-6

Published: 30 October 2017

Multimedia Tools and Applications, Volume 77, pages 17181–17205 (2018)

Jiajun Li¹ (ORCID: orcid.org/0000-0003-4848-1715), Jianguo Tao¹, Liang Ding¹, Haibo Gao¹, Zongquan Deng¹, Yang Luo¹ & Zhandong Li¹

Abstract

Training a stroke gesture classifier with a state-of-the-art Convolutional Neural Network (CNN) requires a large number of samples to achieve good performance. This becomes a serious problem when users want to add new gestures to the system, because collecting that many samples is time-consuming and expensive. In this paper, we propose an iterative synthetic data generation method to solve this problem. The method takes a single user-input template gesture, models it with a Bézier curve, and generates thousands of training samples by perturbing the control points. We propose two modeling approaches so that the method can be applied to both mono-stroke and multi-stroke gestures. Generation is carried out iteratively so that the variability across different categories of stroke gestures can be balanced; the variability is measured with dynamic time warping (DTW). The proposed method is tested on our own dataset and two published datasets. It outperforms methods with a fixed generation process and reaches high recognition accuracy.
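The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian noise model, the sample count, the example template, and all function names (bezier_point, sample_curve, perturb, dtw_distance) are assumptions chosen for illustration. The abstract does not specify how the iterative balancing across gesture categories is carried out, so that step is left out.

```python
import math
import random


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at t in [0, 1] using de Casteljau's algorithm."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def sample_curve(ctrl, n=32):
    """Sample the curve at n evenly spaced parameter values."""
    return [bezier_point(ctrl, i / (n - 1)) for i in range(n)]


def perturb(ctrl, sigma, rng):
    """Add independent Gaussian noise to every control point (assumed noise model)."""
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in ctrl]


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with Euclidean point cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)   # row 0 of the cumulative cost table
    for p in a:
        cur = [inf]                 # column 0 is unreachable after row 0
        for j, q in enumerate(b, start=1):
            cost = math.dist(p, q)
            cur.append(cost + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


# Hypothetical template gesture: one cubic Bezier segment (4 control points).
rng = random.Random(0)
template = [(0.0, 0.0), (0.3, 1.0), (0.7, -1.0), (1.0, 0.0)]
reference = sample_curve(template)

# Generate synthetic variants by perturbing the control points.
samples = [sample_curve(perturb(template, 0.05, rng)) for _ in range(100)]

# Mean DTW distance to the template as a simple variability score for this class.
variability = sum(dtw_distance(reference, s) for s in samples) / len(samples)
print(f"mean DTW distance to template: {variability:.3f}")
```

A score of this kind could in principle be compared across gesture categories and the perturbation strength adjusted between rounds, but the actual update rule is described only in the full article.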

Author information

Authors and Affiliations

State Key Laboratory of Robotics and System, Harbin Institute of Technology, No. 92 Xidazhi Street, Harbin, Heilongjiang, 150001, China
Jiajun Li, Jianguo Tao, Liang Ding, Haibo Gao, Zongquan Deng, Yang Luo & Zhandong Li

Corresponding author

Correspondence to Jiajun Li.

About this article

Cite this article

Li, J., Tao, J., Ding, L. et al. A new iterative synthetic data generation method for CNN based stroke gesture recognition. Multimed Tools Appl 77, 17181–17205 (2018). https://doi.org/10.1007/s11042-017-5285-6

Received: 06 March 2017
Revised: 22 August 2017
Accepted: 05 October 2017
Published: 30 October 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s11042-017-5285-6
