Hybrid graph transformer networks for multivariate time series anomaly detection

The Journal of Supercomputing


Anomaly detection for multivariate time series is widely used in industry and has become an active research problem in data mining. However, existing anomaly detection methods still have the following limitations: (1) The topological relationships between sensors are complex and nonlinear, which makes it difficult to model inter-sensor dependencies effectively. (2) Most anomaly detection models tend to ignore critical temporal information at individual time steps when capturing temporal dependencies. To address these problems, we propose a hybrid graph transformer network for multivariate time series anomaly detection (HGTMAD), which combines the transformer with graph convolution to detect anomalies through multivariate time series prediction. In the time domain, we design a new sparse metric transformer network to capture temporal dependence, in which the Wasserstein distance is used to select the significant dot-product pairs and thereby learn a better time series representation. In the spatial domain, we design a new channel convolution transformer network fused with a graph convolution network to accurately learn the complex dependencies of multivariate time series along the spatial and feature dimensions, combining local and global approaches. Extensive experiments on publicly available datasets demonstrate that the proposed HGTMAD significantly outperforms mainstream state-of-the-art methods.
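As a rough illustration of the idea of using the Wasserstein distance to select significant dot-product pairs, the sketch below scores each query by how far its attention distribution lies from a uniform distribution and gives full attention only to the highest-scoring queries. This is not the formulation from the paper, which is not reproduced in the abstract. It is an assumed, Informer-style selection rule with a 1-D Wasserstein distance swapped in as the sparsity measure, and the names `wasserstein_to_uniform`, `sparse_attention`, and `top_u` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def wasserstein_to_uniform(p):
    """1-D Wasserstein-1 distance between each row of p and the uniform
    distribution over the same key positions 0..L-1 (unit spacing).

    For distributions on a shared ordered grid, W1 equals the sum of
    absolute differences between the two cumulative distributions.
    """
    length = p.shape[-1]
    uniform_cdf = np.arange(1, length + 1) / length
    return np.abs(np.cumsum(p, axis=-1) - uniform_cdf).sum(axis=-1)


def sparse_attention(Q, K, V, top_u):
    """Assumed selection rule (not the paper's): keep full attention for
    the top_u queries whose attention rows are farthest from uniform;
    every other query falls back to the mean of V.

    Note: this sketch still builds the full attention matrix, so it only
    illustrates the selection step, not any computational saving.
    """
    d = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d))      # shape (L_q, L_k)
    score = wasserstein_to_uniform(attn)      # shape (L_q,)
    active = np.argsort(score)[-top_u:]       # indices of "significant" queries
    out = np.tile(V.mean(axis=0), (Q.shape[0], 1))
    out[active] = attn[active] @ V
    return out, active
```

Calling `sparse_attention(Q, K, V, top_u=3)` on arrays of shape `(L, d)` returns an output of the same shape as `V` together with the indices of the selected queries. A query whose attention row is nearly uniform carries little information about which time steps matter, which is why it receives only the averaged value in this sketch.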

Data availability

All datasets used in this work are publicly available and can be downloaded from their official websites.




The work described in this paper was supported by the Open Foundation of the State Key Laboratory for Novel Software Technology at Nanjing University of P.R. China (No. KFKT2021B12). It was also supported in part by the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-54), the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (17KJB520028), and Tongda College of Nanjing University of Posts and Telecommunications (XK203XZ21001).

Author information

Authors and Affiliations



RG: Conceptualization, Methodology, Software. WH: Data curation, Writing-Original draft preparation. LY: Supervision, Writing. DL: Supervision, Writing—review & editing. YY: Supervision, Writing—review & editing. ZY: Review, Editing.

Corresponding author

Correspondence to Lingyu Yan.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gao, R., He, W., Yan, L. et al. Hybrid graph transformer networks for multivariate time series anomaly detection. J Supercomput 80, 642–669 (2024).


