A new iterative synthetic data generation method for CNN based stroke gesture recognition

Multimedia Tools and Applications

Abstract

Training a stroke gesture classifier with a state-of-the-art Convolutional Neural Network (CNN) requires a large number of samples to achieve good performance. This becomes a serious problem when users want to add new gestures to the system, because collecting that many samples is time-consuming and expensive. In this paper, we propose an iterative synthetic data generation method to solve this problem. The method takes a single user-input template gesture, models it with Bézier curves, and generates thousands of training samples by perturbing the control points. We propose two modeling approaches so that the method applies to both mono-stroke and multi-stroke gestures. Generation is carried out iteratively so that the variability across different categories of stroke gestures can be balanced; variability is measured with dynamic time warping (DTW). The proposed method is tested on our own dataset and two published datasets. It outperforms methods that use a fixed generation process and reaches high recognition accuracy.
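The abstract names three ingredients: a Bézier model of the template, perturbation of its control points, and a DTW-based variability measure that drives an iterative loop. The sketch below shows one plausible way these pieces could fit together for a mono-stroke gesture. It is not the authors' implementation: the Gaussian perturbation, the use of mean DTW distance to the template as the variability measure, the stopping rule, and all names and parameters (`generate`, `sigma`, `growth`, `target`, `max_rounds`) are assumptions made for illustration.

```python
import math
import random


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at parameter t using De Casteljau's algorithm."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def sample_curve(ctrl, n=32):
    """Sample n points along the curve at evenly spaced parameter values."""
    return [bezier_point(ctrl, i / (n - 1)) for i in range(n)]


def perturb(ctrl, sigma, rng):
    """Assumed perturbation model: independent Gaussian noise per control point."""
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in ctrl]


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with Euclidean point cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def generate(ctrl, target, count=20, sigma=0.02, growth=1.5, max_rounds=10, seed=0):
    """Hypothetical iterative loop: widen the perturbation until the mean DTW
    distance to the template reaches `target`, or `max_rounds` is exhausted."""
    rng = random.Random(seed)
    template = sample_curve(ctrl)
    for _ in range(max_rounds):
        samples = [sample_curve(perturb(ctrl, sigma, rng)) for _ in range(count)]
        variability = sum(dtw_distance(template, s) for s in samples) / count
        if variability >= target:
            break
        sigma *= growth
    return samples, variability, sigma


if __name__ == "__main__":
    # Hypothetical template: a single cubic Bezier arch in unit coordinates.
    template_ctrl = [(0.0, 0.0), (0.3, 1.0), (0.7, 1.0), (1.0, 0.0)]
    samples, variability, sigma = generate(template_ctrl, target=1.0)
    print(len(samples), round(variability, 3), round(sigma, 3))
```

Using the same `target` for every gesture category is one way such a loop could balance variability across categories: a simple template would end up with a larger `sigma` than a complex one if it needs more perturbation to reach the same DTW spread.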

Author information

Corresponding author

Correspondence to Jiajun Li.

About this article

Cite this article

Li, J., Tao, J., Ding, L. et al. A new iterative synthetic data generation method for CNN based stroke gesture recognition. Multimed Tools Appl 77, 17181–17205 (2018). https://doi.org/10.1007/s11042-017-5285-6

  • DOI: https://doi.org/10.1007/s11042-017-5285-6
