Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding

Mengnan Du; Subhabrata Mukherjee; Yu Cheng; Milad Shokouhi; Xia Hu; Ahmed Hassan Awadallah

doi:10.18653/v1/2023.eacl-main.129

Abstract

Recent work on compressing pre-trained language models (PLMs) such as BERT has focused mainly on improving in-distribution performance on downstream tasks. However, very few of these studies have analyzed the impact of compression on the generalizability and robustness of compressed models on out-of-distribution (OOD) data. To this end, we study two popular model compression techniques, knowledge distillation and pruning, and show that the compressed models are significantly less robust than their PLM counterparts on OOD test sets, although they obtain similar performance on in-distribution development sets for a task. Further analysis indicates that the compressed models overfit on the shortcut samples and generalize poorly on the hard ones. We leverage this observation to develop a regularization strategy for robust model compression based on sample uncertainty.

Anthology ID: 2023.eacl-main.129
Volume: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editors: Andreas Vlachos, Isabelle Augenstein
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1766–1778
URL: https://aclanthology.org/2023.eacl-main.129/
DOI: 10.18653/v1/2023.eacl-main.129
Cite (ACL): Mengnan Du, Subhabrata Mukherjee, Yu Cheng, Milad Shokouhi, Xia Hu, and Ahmed Hassan Awadallah. 2023. Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 1766–1778, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding (Du et al., EACL 2023)
PDF: https://aclanthology.org/2023.eacl-main.129.pdf
Video: https://aclanthology.org/2023.eacl-main.129.mp4
