Abstract
In this paper, we propose a fine-grained classification and recognition method for automobile front face modeling images based on Gestalt psychology, exploring a more objective approach to the fine-grained classification of automobile front face images. The method uses a convolutional neural network to group pixels into features of visual regions, divides the front face image into parts, and performs fine-grained classification based on the overall modeling of those parts. First, unclassified input car front face images are passed through part detection, part segmentation, and regularization, in combination with the image classification training sets of car front face shapes. Second, to facilitate weakly supervised learning of each part, we establish recognition models that use a simple U-shaped distribution prior for the individual parts of car images, and we train the network with image-level object labels on the ResNet-101 framework. An attention mechanism is then reused to aggregate features and output classification vectors. Finally, the method reaches a recognition accuracy of 89.9% on the Comprehensive Cars (CompCars) dataset. Compared with other CNN methods, the results confirm that combining the U-shaped distribution with part exploration in the image yields a higher recognition rate. Moreover, dividing images into parts and identifying the contribution of each part to the classification makes the model interpretable.
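The attention-based aggregation and the per-part contributions mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the part names, feature vectors, and attention scores below are hypothetical toy values, and softmax-weighted pooling is assumed as one common form of attention aggregation. In the actual model, the features and scores would come from the ResNet-101 backbone and the part segmentation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_parts(part_features, part_scores):
    """Attention-weighted sum of per-part feature vectors.

    part_features: dict mapping part name -> list of floats (equal lengths)
    part_scores:   dict mapping part name -> raw attention score (float)
    Returns (pooled_vector, weights), where weights maps each part name
    to its normalized attention weight.
    """
    names = list(part_features)
    weights = softmax([part_scores[name] for name in names])
    dim = len(part_features[names[0]])
    pooled = [0.0] * dim
    for name, weight in zip(names, weights):
        for i, value in enumerate(part_features[name]):
            pooled[i] += weight * value
    return pooled, dict(zip(names, weights))


# Hypothetical parts and toy numbers, for illustration only.
features = {
    "grille": [1.0, 0.0],
    "headlights": [0.0, 1.0],
    "bumper": [1.0, 1.0],
}
scores = {"grille": 2.0, "headlights": 1.0, "bumper": 0.0}

pooled, contributions = aggregate_parts(features, scores)

# Print parts from highest to lowest attention weight.
for name, weight in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {weight:.3f}")
```

Because the weights are normalized to sum to one, each weight can be read as the relative contribution of a part to the pooled representation, which is the kind of per-part interpretability the abstract refers to.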

Acknowledgements
This work was supported by the Natural Science Foundation of Hebei Province (Grant Number: G2021202008) and the Social Science Foundation of Hebei Province (Grant Number: HB20YS046).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Pei, H., Guo, R., Tan, Z. et al. Fine-grained classification of automobile front face modeling based on Gestalt psychology*. Vis Comput 39, 2981–2998 (2023). https://doi.org/10.1007/s00371-022-02506-1
DOI: https://doi.org/10.1007/s00371-022-02506-1