Abstract
The problem of attacks on neural networks through input modification (i.e., adversarial examples) has attracted much attention recently. Being relatively easy to generate and hard to detect, these attacks pose a security threat that many suggested defenses try to mitigate. However, the evaluation of the effect of attacks and defenses commonly relies on traditional classification metrics, without adequate adaptation to adversarial scenarios. Most of these metrics are accuracy-based and therefore may have a limited scope and low distinctive power. Other metrics do not consider the unique characteristics of neural network functionality, or measure the effectiveness of the attacks only indirectly (e.g., through the complexity of their generation). In this paper, we present two metrics that are specifically designed to measure the effect of attacks, or the recovery effect of defenses, on the output of neural networks in multiclass classification tasks. Inspired by the normalized discounted cumulative gain (NDCG) and reciprocal rank (RR) metrics used in the information retrieval literature, we treat the neural network predictions as ranked lists of results. Using additional information about the probability of each rank, we define novel metrics that are suited to the task at hand. We evaluate our metrics using various attacks and defenses on a pre-trained VGG19 model and the ImageNet dataset. Compared to the common classification metrics, our proposed metrics demonstrate superior informativeness and distinctiveness.
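To make the ranked-list view of classifier output concrete, the following is a minimal sketch, assuming the standard binary-relevance NDCG and reciprocal rank definitions: the softmax scores are sorted into a ranked list, the true class is treated as the single relevant item, and its rank after an attack (or defense) determines the score. The probability-weighted variants proposed in the paper refine these basic definitions; the function names and example scores below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def reciprocal_rank(probs: np.ndarray, true_label: int) -> float:
    """1 / rank of the true class in the descending-probability ordering."""
    order = np.argsort(-probs)                       # class indices, best score first
    rank = int(np.where(order == true_label)[0][0]) + 1
    return 1.0 / rank

def ndcg_at_k(probs: np.ndarray, true_label: int, k: int = 5) -> float:
    """Binary-relevance NDCG@k: relevance 1 for the true class, 0 otherwise."""
    order = np.argsort(-probs)[:k]
    dcg = sum(1.0 / np.log2(pos + 2) for pos, cls in enumerate(order) if cls == true_label)
    idcg = 1.0 / np.log2(2)                          # ideal list ranks the true class first
    return dcg / idcg

# Example: softmax scores for a 5-class problem before and after a hypothetical attack.
clean = np.array([0.70, 0.10, 0.08, 0.07, 0.05])     # true class (index 0) ranked 1st
attacked = np.array([0.20, 0.45, 0.15, 0.12, 0.08])  # true class pushed to 2nd place
for name, p in [("clean", clean), ("attacked", attacked)]:
    print(name, "RR =", reciprocal_rank(p, 0), "NDCG@5 =", round(ndcg_at_k(p, 0), 3))
```

On the clean scores both metrics equal 1.0; after the attack the true class drops to rank 2, giving RR = 0.5 and NDCG@5 ≈ 0.631, which illustrates how rank-based metrics capture partial degradation that top-1 accuracy would record only as a binary error.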
Data availability
Data will be made available after publication.
Acknowledgements
This work was supported by the Ariel Cyber Innovation Center in conjunction with the Israel National Cyber Directorate in the Prime Minister’s Office.
Funding
The authors have no relevant financial or non-financial interests to disclose, and no affiliations with or involvement in any organization or entity with a financial or non-financial interest in the subject matter or materials discussed in this manuscript.
Author information
Contributions
All authors contributed to the paper equally.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Human and animal rights
The research does not involve human participants or animals.
Informed consent
Authors provide their informed consent.