DOI: 10.1145/3219819.3220021
Research Article

Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System

Published: 19 July 2018

Abstract

We propose a novel way to train ranking models, such as recommender systems, that are both effective and efficient. Knowledge distillation (KD) has been shown to achieve both effectiveness and efficiency in image recognition. We propose a KD technique for learning-to-rank problems, called ranking distillation (RD). Specifically, we train a smaller student model to learn to rank documents/items from both the training data and the supervision of a larger teacher model. The student model achieves a ranking performance similar to that of the large teacher model, but its smaller size makes online inference more efficient. RD is flexible because it is orthogonal to the choice of ranking models for the teacher and the student. We address the challenges of applying RD to ranking problems. Experiments on public data sets and state-of-the-art recommendation models show that RD achieves its design purposes: the student model learnt with RD is less than half the size of the teacher model, yet it achieves a ranking performance similar to that of the teacher model and much better than that of the same student model learnt without RD.
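
To make the training objective concrete, the sketch below shows one plausible form of a ranking-distillation loss: a pointwise ranking loss on the ground-truth positive items plus a distillation term that treats the teacher's top-K ranked items as weighted extra positives. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function name `ranking_distillation_loss`, the sigmoid log-loss, and the simple 1/rank position weights are choices made for the example.

```python
import torch

def ranking_distillation_loss(student_scores, pos_items, teacher_top_k, lam=0.5):
    """Illustrative ranking-distillation loss for one user (assumed form).

    student_scores : (num_items,) raw student scores over the item catalog
    pos_items      : LongTensor of ground-truth positive item ids
    teacher_top_k  : LongTensor of the teacher model's top-K ranked item ids
    lam            : hyperparameter balancing the two loss terms
    """
    probs = torch.sigmoid(student_scores)

    # Ranking term: push the student's scores on observed positives toward 1.
    ranking_loss = -torch.log(probs[pos_items] + 1e-10).mean()

    # Distillation term: treat the teacher's top-K items as extra positives,
    # weighted so that items the teacher ranks higher contribute more.
    # A simple 1/rank decay is assumed here for illustration.
    ranks = torch.arange(1, len(teacher_top_k) + 1, dtype=torch.float32)
    weights = (1.0 / ranks) / (1.0 / ranks).sum()
    distill_loss = -(weights * torch.log(probs[teacher_top_k] + 1e-10)).sum()

    return ranking_loss + lam * distill_loss
```

At serving time only the compact student is queried; the teacher provides a training-time signal only, which is why the approach is orthogonal to the choice of teacher and student architectures.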

Supplementary Material

MP4 File (tang_compact_ranking_models.mp4)

      Published In

      KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
      July 2018
      2925 pages
      ISBN:9781450355520
      DOI:10.1145/3219819
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. knowledge transfer
      2. learning to rank
      3. model compression
      4. recommender system

      Funding Sources

• Natural Sciences and Engineering Research Council of Canada

      Conference

      KDD '18

      Acceptance Rates

KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

      Cited By

• (2025) Invariant debiasing learning for recommendation via biased imputation. Information Processing & Management 62(3):104028. DOI: 10.1016/j.ipm.2024.104028. Online publication date: May 2025.
• (2024) Post-Training Attribute Unlearning in Recommender Systems. ACM Transactions on Information Systems 43(1):1-28. DOI: 10.1145/3701987. Online publication date: 6 Nov 2024.
• (2024) FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning. ACM Transactions on Knowledge Discovery from Data 18(9):1-22. DOI: 10.1145/3695256. Online publication date: 1 Nov 2024.
• (2024) A Survey on Recommender Systems Using Graph Neural Network. ACM Transactions on Information Systems 43(1):1-49. DOI: 10.1145/3694784. Online publication date: 26 Nov 2024.
• (2024) Distributed Recommendation Systems: Survey and Research Directions. ACM Transactions on Information Systems 43(1):1-38. DOI: 10.1145/3694783. Online publication date: 6 Sep 2024.
• (2024) A Self-Distilled Learning to Rank Model for Ad Hoc Retrieval. ACM Transactions on Information Systems 42(6):1-28. DOI: 10.1145/3681784. Online publication date: 25 Jul 2024.
• (2024) Mitigating Sample Selection Bias with Robust Domain Adaption in Multimedia Recommendation. In Proceedings of the 32nd ACM International Conference on Multimedia, 7581-7590. DOI: 10.1145/3664647.3680615. Online publication date: 28 Oct 2024.
• (2024) Distillation vs. Sampling for Efficient Training of Learning to Rank Models. In Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 51-60. DOI: 10.1145/3664190.3672527. Online publication date: 2 Aug 2024.
• (2024) Unbiased, Effective, and Efficient Distillation from Heterogeneous Models for Recommender Systems. ACM Transactions on Recommender Systems. DOI: 10.1145/3649443. Online publication date: 23 Feb 2024.
• (2024) Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Models. In Proceedings of the 18th ACM Conference on Recommender Systems, 507-517. DOI: 10.1145/3640457.3688118. Online publication date: 8 Oct 2024.