Survey · Open Access

A Survey of Methods for Explaining Black Box Models

Published: 22 August 2018

Abstract

In recent years, many accurate decision support systems have been constructed as black boxes, that is, as systems that hide their internal logic from the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. Black box decision systems are used in a wide variety of applications, and each approach is typically developed to solve a specific problem; as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help researchers find the proposals most useful for their own work. The proposed classification of approaches to opening black box models should also be useful for putting the many open research questions in perspective.
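To make the setting concrete, below is a minimal sketch (not taken from the survey itself) of one family of approaches it covers: approximating a black box classifier with a globally interpretable surrogate model. It assumes scikit-learn is available; the dataset, model choices, and the reported scores are illustrative only.

# Minimal sketch: explain a black-box classifier with an interpretable surrogate.
# Assumptions: scikit-learn is installed; dataset and hyperparameters are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=0)

# The "black box": an ensemble whose internal logic is hard to inspect directly.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# The surrogate: a shallow decision tree trained to mimic the black box's predictions,
# not the true labels. Its rules serve as an approximate global explanation.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how closely the surrogate reproduces the black box on unseen data.
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"black-box accuracy: {black_box.score(X_test, y_test):.3f}")
print(f"surrogate accuracy: {surrogate.score(X_test, y_test):.3f}")
print(f"surrogate fidelity: {fidelity:.3f}")
print(export_text(surrogate, feature_names=list(data.feature_names)))

Approaches of this kind are typically judged not only on their own accuracy but also on how faithfully the surrogate mimics the black box (its fidelity), since the surrogate's rules are the explanation being offered.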

Supplemental Material

ZIP File - a93-guidotti-suppl.pdf
Supplemental movie, appendix, image and software files for A Survey of Methods for Explaining Black Box Models

    Published In

    ACM Computing Surveys, Volume 51, Issue 5
    September 2019
    791 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/3271482
    • Editor: Sartaj Sahni
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 August 2018
    Accepted: 01 June 2018
    Revised: 01 June 2018
    Received: 01 January 2018
    Published in CSUR Volume 51, Issue 5

    Author Tags

    1. Open the black box
    2. explanations
    3. interpretability
    4. transparent models

    Qualifiers

    • Survey
    • Research
    • Refereed

    Data Availability

    a93-guidotti-suppl.pdf: Supplemental movie, appendix, image and software files for A Survey of Methods for Explaining Black Box Models https://dl.acm.org/doi/10.1145/3236009#guidotti.zip

    Funding Sources

    • European Community’s H2020 Program under the funding scheme “INFRAIA-1-2014-2015: Research Infrastructures”
