Enhanced end-to-end regression algorithm for autonomous road damage detection

Xing, Hongjia; Yang, Feng; Qiao, Xu; Li, Fanruo; Huang, Xinxin

doi:10.1007/s11227-024-06871-7

Published: 08 January 2025

Volume 81, article number 380, (2025)

The Journal of Supercomputing

Abstract

To address challenges such as variations in lighting, weather, and the size and shape of cracks and potholes, we propose an enhanced end-to-end regression algorithm for autonomous road damage detection. This method balances computational efficiency and accuracy by incorporating feature extraction structures to improve performance in scenarios involving multiple damage types, shadows, and fine-grained feature variations. The proposed model integrates a down-sampling structure for dimensionality reduction and feature extraction, an inverted residual mobile block for feature fusion, and an attention mechanism with multi-scale features for multi-scale detail extraction. Additionally, the integration of a Decoupled Head structure enhances bounding box localization. Experimental results show that the proposed method outperforms YOLOv5s (You Only Look Once version 5 small), achieving a 2.9% improvement in the F1 score and a 4% improvement in the mean average precision. Further validation through visualization experiments in seven challenging road scenarios, including varying lighting and environmental conditions, highlights the model’s superior detection accuracy, completeness, and robustness.
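The F1 score reported above is conventionally computed from true positives, false positives, and false negatives obtained by matching predicted boxes to ground-truth boxes. The following is a minimal sketch of that computation, not the authors' evaluation code: the IoU threshold of 0.5, the greedy one-to-one matching, the helper names (`iou`, `match_counts`, `f1_score`), and the example boxes are all assumptions made for illustration.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_counts(preds: List[Box], truths: List[Box],
                 thr: float = 0.5) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (TP, FP, FN).

    `preds` is assumed to be sorted by descending confidence, and
    `thr` is a hypothetical IoU threshold, not taken from the paper.
    """
    unmatched = list(range(len(truths)))
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j in unmatched:
            v = iou(p, truths[j])
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            # Each ground-truth box can be claimed by one prediction only.
            unmatched.remove(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(unmatched)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical boxes, for illustration only.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
    tp, fp, fn = match_counts(preds, truths)
    print(tp, fp, fn, f1_score(tp, fp, fn))
```

Mean average precision extends the same matching step by sweeping the confidence threshold and averaging the area under the precision-recall curve over damage classes; that part is omitted here because the abstract does not state the protocol used.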

Data availability

The data used to support this study's findings are available from the corresponding author.

Acknowledgements

This work is supported by the National Key Research and Development Program of China [Grant Numbers 2021YFC3090303 and 2021YFC3090304] and the Fundamental Research Funds for the Central Universities (Ph.D. Top Innovative Talents Fund of CUMTB) [BBJ2024069].

Author information

Authors and Affiliations

School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing, 100083, China
Hongjia Xing, Feng Yang, Xu Qiao, Fanruo Li & Xinxin Huang

Authors

Hongjia Xing
Feng Yang
Xu Qiao
Fanruo Li
Xinxin Huang

Contributions

HX contributed to conceptualization, data curation, investigation, methodology, validation, visualization, project administration, writing (original draft), and writing (review and editing). FY contributed to conceptualization, data curation, investigation, and writing (review and editing). XQ contributed to investigation, data curation, and project administration. FL contributed to funding acquisition, supervision, and project administration. XH contributed to data curation and supervision.

Corresponding author

Correspondence to Hongjia Xing.

Ethics declarations

Conflict of interest

The authors declare that they have no known financial or non-financial competing interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xing, H., Yang, F., Qiao, X. et al. Enhanced end-to-end regression algorithm for autonomous road damage detection. J Supercomput 81, 380 (2025). https://doi.org/10.1007/s11227-024-06871-7

Accepted: 19 December 2024
Published: 08 January 2025
DOI: https://doi.org/10.1007/s11227-024-06871-7
