Abstract
Local feature detection and description play a crucial role in many computer vision tasks, including image matching. Variations in illumination significantly affect the accuracy of these applications, yet existing methods address this issue inadequately. In this paper, a novel algorithm based on an illumination auxiliary learning module (IALM) is introduced. First, a new local feature extractor named illumination auxiliary SuperPoint (IA-SuperPoint) is established by integrating IALM with SuperPoint. Second, illumination-aware auxiliary training captures the effects of illumination variations during feature extraction through tailored loss functions and a joint learning mechanism. Finally, a metric for evaluating the illumination robustness of local features is proposed, based on simulating various illumination disturbances. Experiments on the HPatches and RDNIM datasets demonstrate that our method greatly improves local feature extraction: compared with the baseline, it improves both mean matching accuracy and homography estimation.
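The abstract does not spell out how the proposed robustness metric is computed, but the underlying idea — apply synthetic illumination disturbances and measure how stably keypoints are re-detected — can be sketched as follows. This is a minimal illustration only: the toy gradient-magnitude detector and all function names here are hypothetical stand-ins, not the paper's actual IA-SuperPoint pipeline.

```python
import numpy as np

def perturb_illumination(img, gamma):
    """Simulate an illumination change via gamma correction (img values in [0, 1])."""
    return np.clip(img, 0.0, 1.0) ** gamma

def detect_keypoints(img, k=50):
    """Toy detector: top-k local maxima of gradient magnitude (stand-in for a learned detector)."""
    gy, gx = np.gradient(img)
    response = np.hypot(gx, gy)
    flat = np.argsort(response, axis=None)[-k:]
    ys, xs = np.unravel_index(flat, response.shape)
    return np.stack([ys, xs], axis=1).astype(float)

def illumination_repeatability(img, gammas=(0.5, 0.8, 1.25, 2.0), tol=2.0):
    """Fraction of reference keypoints re-detected within tol pixels under each
    simulated illumination disturbance, averaged over all disturbances."""
    ref = detect_keypoints(img)
    scores = []
    for g in gammas:
        kp = detect_keypoints(perturb_illumination(img, g))
        # pairwise distances between reference and perturbed-image keypoints
        dists = np.linalg.norm(ref[:, None, :] - kp[None, :, :], axis=2)
        scores.append(np.mean(dists.min(axis=1) <= tol))
    return float(np.mean(scores))
```

A score of 1.0 means every reference keypoint survives every simulated disturbance; lower values indicate illumination sensitivity. A real evaluation would use the paper's detector and a richer disturbance family (brightness offsets, contrast changes, local shadows) rather than gamma alone.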
Data availability
The datasets used or analysed during the current study are available from the corresponding author on reasonable request.
References
Wang, C., Xu, D., Zhu, Y., Martín-Martín, R., Lu, C., Fei-Fei, L., Savarese, S.: DenseFusion: 6D object pose estimation by iterative dense fusion. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3338–3347 (2019). https://doi.org/10.1109/CVPR.2019.00346
Shen, Y., Wang, R., Zuo, W., Zheng, N.: TCL: tightly coupled learning strategy for weakly supervised hierarchical place recognition. IEEE Robot. Autom. Lett. 7(2), 2684–2691 (2022). https://doi.org/10.1109/LRA.2022.3141663
Ma, J., Jiang, X., Fan, A., Jiang, J., Yan, J.: Image matching from handcrafted to deep features: a survey. Int. J. Comput. Vis. (2021). https://doi.org/10.1007/s11263-020-01359-2
Zhou, H., Sattler, T., Jacobs, D.W.: Evaluating local features for day-night matching. In: Hua, G., Jegou, H. (Eds.) Computer Vision—ECCV 2016 Workshops, PT III, vol. 9915, pp. 724–736 (2016). https://doi.org/10.1007/978-3-319-49409-8_60
Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: LoFTR: detector-free local feature matching with transformers. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8918–8927 (2021). https://doi.org/10.1109/CVPR46437.2021.00881
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperPoint: self-supervised interest point detection and description. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 337–33712 (2018). https://doi.org/10.1109/CVPRW.2018.00060
Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., Sattler, T.: D2-Net: a trainable CNN for joint description and detection of local features. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8084–8093 (2019). https://doi.org/10.1109/CVPR.2019.00828
Zhou, Q., Sattler, T., Leal-Taixé, L.: Patch2Pix: epipolar-guided pixel-level correspondences. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4667–4676 (2021). https://doi.org/10.1109/CVPR46437.2021.00464
Ke, Y., Sukthankar, R.: PCA-SIFT: a more distinctive representation for local image descriptors. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004 (CVPR 2004), vol. 2 (2004). https://doi.org/10.1109/CVPR.2004.1315206
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: Computer Vision—ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7–13, 2006. Proceedings, Part I 9, pp. 404–417. Springer (2006)
Wang, Z., Fan, B., Wu, F.: Local intensity order pattern for feature description. In: 2011 International Conference on Computer Vision, pp. 603–610 (2011). https://doi.org/10.1109/ICCV.2011.6126294
Tang, F., Lim, S.H., Chang, N.L., Tao, H.: A novel feature descriptor invariant to complex brightness changes. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2631–2638 (2009). https://doi.org/10.1109/CVPR.2009.5206550
Verdie, Y., Yi, K.M., Fua, P., Lepetit, V.: TILDE: a temporally invariant learned detector. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5279–5288 (2015). https://doi.org/10.1109/CVPR.2015.7299165
Sarlin, P.-E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4937–4946 (2020). https://doi.org/10.1109/CVPR42600.2020.00499
Efe, U., Ince, K.G., Aydin Alatan, A.: DFM: a performance baseline for deep feature matching. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 4279–4288 (2021). https://doi.org/10.1109/CVPRW53098.2021.00484
Quan, D., Wei, H., Wang, S., Lei, R., Duan, B., Li, Y., Hou, B., Jiao, L.: Self-distillation feature learning network for optical and SAR image registration. IEEE Trans. Geosci. Remote Sens. 60, 1–18 (2022). https://doi.org/10.1109/TGRS.2022.3173476
Balntas, V., Lenc, K., Vedaldi, A., Mikolajczyk, K.: HPatches: a benchmark and evaluation of handcrafted and learned local descriptors. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3852–3861 (2017). https://doi.org/10.1109/CVPR.2017.410
Pautrat, R., Larsson, V., Oswald, M.R., Pollefeys, M.: Online invariance selection for local feature descriptors. Lecture Notes in Computer Science, pp. 707–724 (2020)
Harris, C., Stephens, M.: A combined corner and edge detector. In: Alvey Vision Conference, vol. 15, pp. 10–5244 (1988). Citeseer
Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In: Computer Vision—ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7–13, 2006. Proceedings, Part I 9, pp. 430–443. Springer (2006)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571 (2011). https://doi.org/10.1109/ICCV.2011.6126544
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)
Barroso-Laguna, A., Mikolajczyk, K.: Key.Net: keypoint detection by handcrafted and learned CNN filters revisited. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 698–711 (2023). https://doi.org/10.1109/TPAMI.2022.3145820
Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K.M., Trulls, E.: Image matching across wide baselines: from paper to practice. Int. J. Comput. Vis. 129(2), 517–547 (2021). https://doi.org/10.1007/s11263-020-01385-0
Li, C., Guo, C., Han, L., Jiang, J., Cheng, M.-M., Gu, J., Loy, C.C.: Low-light image and video enhancement using deep learning: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 44(12), 9396–9416 (2022). https://doi.org/10.1109/TPAMI.2021.3126387
Reinhard, E., Stark, M., Shirley, P., Ferwerda, J.: Photographic tone reproduction for digital images. In: Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, pp. 267–276 (2002)
Wang, R., Zhang, Q., Fu, C.-W., Shen, X., Zheng, W.-S., Jia, J.: Underexposed photo enhancement using deep illumination estimation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6842–6850 (2019). https://doi.org/10.1109/CVPR.2019.00701
Land, E.H.: Recent advances in retinex theory and some implications for cortical computations: color vision and the natural image. Proc. Natl. Acad. Sci. 80(16), 5163–5169 (1983)
Wei, C., Wang, W., Yang, W., Liu, J.: Deep Retinex decomposition for low-light enhancement. arXiv preprint arXiv:1808.04560 (2018)
Zhang, Y., Yang, Q.: An overview of multi-task learning. Natl. Sci. Rev. 5(1), 30–43 (2018). https://doi.org/10.1093/nsr/nwx105
Vandenhende, S., Georgoulis, S., Proesmans, M., Dai, D., Van Gool, L.: Revisiting multi-task learning in the deep learning era. arXiv preprint arXiv:2004.13379 (2020)
Zhang, A., Gao, Y., Niu, Y., Liu, W., Zhou, Y.: Coarse-to-fine person re-identification with auxiliary-domain classification and second-order information bottleneck. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021). IEEE Conference on Computer Vision and Pattern Recognition, pp. 598–607 (2021). https://doi.org/10.1109/CVPR46437.2021.00066
He, C., Zeng, H., Huang, J., Hua, X.-S., Zhang, L.: Structure aware single-stage 3D object detection from point cloud. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11870–11879 (2020). https://doi.org/10.1109/CVPR42600.2020.01189
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollar, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (Eds.) Computer Vision—ECCV 2014, PT V. Lecture Notes in Computer Science, vol. 8693, pp. 740–755 (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Jiang, W., Trulls, E., Hosang, J., Tagliasacchi, A., Yi, K.M.: COTR: correspondence transformer for matching across images. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6187–6197 (2021). https://doi.org/10.1109/ICCV48922.2021.00615
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
Funding
National Natural Science Foundation of China (62073024).
Author information
Authors and Affiliations
Contributions
H.B., S.F., and H.Z. wrote the main manuscript text. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors have no conflict of interest as defined by Springer, nor any other interests that might be perceived to influence the results and/or discussion reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bian, H., Fan, S., Zhang, H. et al. Towards stronger illumination robustness of local feature detection and description based on auxiliary learning. SIViP 18 (Suppl 1), 575–584 (2024). https://doi.org/10.1007/s11760-024-03175-4