Abstract
Image classification is a well-studied problem. However, challenges remain for some special categories of images. This paper proposes a new deep convolutional neural network to improve image classification using extra light-field angular information. The proposed network model employs transfer learning by replacing the fully connected layer of a VGG network with a set of interleaved spatial-angular filters. The resulting model takes advantage of both the spatial and angular information of light-field images (LFIs), thus providing more accurate classification than traditional models. To evaluate the proposed network model, we established a light-field image dataset, currently consisting of 560 captured LFIs, which have been divided into 11 labeled categories. Based on this dataset, our experimental results show that the proposed LFI model yields an average classification accuracy of 92%, as opposed to 84% for the model using traditional 2D images and 85% for the model using stereo pair images. In particular, when classifying challenging objects such as the “screen” images, the proposed LFI model demonstrated significant improvements of 16% and 12% over the 2D image model and the stereo image model, respectively.
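
As a rough illustration of what interleaving spatial and angular filters can look like, the sketch below treats a light field as a 4D array indexed by view (u, v) and pixel (y, x). A spatial filter convolves each view over (y, x); an angular filter convolves each pixel position over (u, v), which is done here by swapping the two index pairs and reusing the same routine. All function names, the fixed kernels, and the tiny 2x2-view input are illustrative assumptions, not the paper's implementation; the actual model applies learned, multi-channel filters to VGG features.

```python
def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation on nested lists (output size = input size)."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    off_r, off_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - off_r, c + j - off_c
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def swap_spatial_angular(lf):
    """Re-index lf[a][b][c][d] as out[c][d][a][b]."""
    A, B, C, D = len(lf), len(lf[0]), len(lf[0][0]), len(lf[0][0][0])
    return [[[[lf[a][b][c][d] for b in range(B)] for a in range(A)]
             for d in range(D)] for c in range(C)]


def spatial_filter(lf, kernel):
    """Filter every view lf[u][v] over its pixel axes (y, x)."""
    return [[conv2d_same(view, kernel) for view in row] for row in lf]


def angular_filter(lf, kernel):
    """Filter every pixel position over the view axes (u, v)."""
    swapped = swap_spatial_angular(lf)           # now indexed [y][x][u][v]
    filtered = spatial_filter(swapped, kernel)   # convolves over (u, v)
    return swap_spatial_angular(filtered)        # back to [u][v][y][x]


def interleaved_block(lf, spatial_kernel, angular_kernel):
    """One spatial pass followed by one angular pass (hypothetical block)."""
    return angular_filter(spatial_filter(lf, spatial_kernel), angular_kernel)


# Toy light field: 2x2 views of 3x3 pixels; view (u, v) is constant at 2*u + v + 1.
lf = [[[[float(2 * u + v + 1)] * 3 for _ in range(3)] for v in range(2)]
      for u in range(2)]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # leaves each view unchanged
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]       # sums neighbouring views

out = interleaved_block(lf, identity, ones)
print(out[0][0][1][1])
```

With only 2x2 views, the 3x3 all-ones angular kernel covers every view from any angular position, so each output pixel is the sum of that pixel across all four views. This is the kind of cross-view aggregation that a purely 2D model has no access to.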









Funding
This work was supported in part by the National Key Research and Development Program of China under Grant No. 2016YFC0801001, the National Program on Key Basic Research Projects (973 Program) under Grant 2015CB351803, and NSFC under Grants 61571413, 61632001, and 61390514.
About this article
Cite this article
Lu, Z., Yeung, H.W.F., Qu, Q. et al. Improved image classification with 4D light-field and interleaved convolutional neural network. Multimed Tools Appl 78, 29211–29227 (2019). https://doi.org/10.1007/s11042-018-6597-x
DOI: https://doi.org/10.1007/s11042-018-6597-x