CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling

Abstract

Multi-label text classification (MLTC) aims to assign one or more labels to each document. Previous studies mainly use the label co-occurrence matrix computed from the training set to establish correlations between labels, but this approach ignores the noise in label co-occurrence statistics and carries label co-occurrence relationships that do not generalize into testing and validation. In addition, modeling label co-occurrence globally pays no attention to individual documents, so local label co-occurrence relationships are lost. To address these issues, we present CoocNet, a new multi-label text classification model that adopts a two-step approach to modeling label co-occurrence relations. The model first captures the global co-occurrence relationships between labels using the label co-occurrence matrix and suppresses label noise through a label denoising attention mechanism; it then uses a contrastive learning strategy to capture the local label co-occurrence relationships among specific documents. In particular, we cast co-occurrence modeling as an auxiliary training task that runs in parallel with the multi-label classification task. This auxiliary task supervises the learning of sentence representations for documents by leveraging the modeled label co-occurrence relationships, enhancing the model's generalization ability. A further novelty is that the auxiliary task is active only during training, which prevents label co-occurrence relationships from interfering with the model's predictions outside the training phase. Experimental results on three benchmark datasets (Reuters-21578, AAPD, and RCV1) demonstrate that our model outperforms existing state-of-the-art methods.
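The abstract describes two mechanisms: a global label co-occurrence matrix refined by a denoising attention step, and a contrastive auxiliary loss over documents that is used only during training. The paper's implementation is not reproduced on this page, so the following PyTorch sketch merely illustrates that general recipe; every module name, tensor shape, and the auxiliary loss weight are assumptions made for this example, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

def cooccurrence_matrix(train_labels):
    # Row-normalised label co-occurrence from a (num_docs, num_labels) multi-hot matrix.
    y = train_labels.float()
    counts = y.T @ y                                  # (L, L) raw co-occurrence counts
    counts.fill_diagonal_(0)                          # drop self co-occurrence
    return counts / counts.sum(dim=1, keepdim=True).clamp(min=1)

class CoocClassifier(nn.Module):
    # Classifier head with a global co-occurrence prior and a training-only auxiliary loss.

    def __init__(self, hidden, num_labels, cooc, tau=0.1):
        super().__init__()
        self.register_buffer("cooc", cooc)            # global co-occurrence prior, (L, L)
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden) * 0.02)
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.cls = nn.Linear(hidden, num_labels)
        self.tau = tau

    def forward(self, doc_repr, targets=None):
        # doc_repr: (B, hidden) sentence representations from any encoder, e.g. BERT [CLS].
        # Smooth each label embedding with its co-occurring neighbours; a crude
        # stand-in for the label denoising attention named in the abstract.
        labels = self.label_emb + self.cooc @ self.label_emb
        kv = labels.unsqueeze(0).expand(doc_repr.size(0), -1, -1)     # (B, L, hidden)
        ctx, _ = self.attn(doc_repr.unsqueeze(1), kv, kv)             # attend doc -> labels
        logits = self.cls(doc_repr + ctx.squeeze(1))
        if targets is None:                           # inference: auxiliary task inactive
            return logits, None
        # Auxiliary contrastive task (training only): documents sharing at least one
        # label are positives, in the spirit of supervised contrastive learning.
        z = F.normalize(doc_repr, dim=1)
        sim = z @ z.T / self.tau                      # (B, B) scaled cosine similarity
        self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        log_prob = sim - torch.logsumexp(
            sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
        pos = (targets.float() @ targets.float().T > 0).float()
        pos.masked_fill_(self_mask, 0)
        aux = -(pos * log_prob).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
        main = F.binary_cross_entropy_with_logits(logits, targets.float())
        return logits, main + 0.1 * aux.mean()        # 0.1 is an assumed weighting

At evaluation time the model would be called without targets, so the co-occurrence-driven auxiliary loss never enters prediction, mirroring the training-only design described in the abstract.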

Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities (Grant D5000220192), the National Natural Science Foundation of China under Grant 61603233, Shaanxi Natural Science Basic Research Program under Grant 2022JM-206, and Xi’an Science and Technology planning project under Grant 21RGZN0008.

Author information

Contributions

Yi Li: Conceptualization of this study, Methodology, Experiments, Writing - original draft preparation. Junge Shen: Conceptualization of this study, Funding acquisition, Manuscript revision. Zhaoyong Mao: Funding acquisition, Manuscript revision.

Corresponding author

Correspondence to Junge Shen.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Ethical and informed consent for data used

This study did not involve the collection or use of human subject data. All data used in this study were derived from publicly available literature, statistical sources, or simulated experimental results. Therefore, ethical approval and informed consent were not required for this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Shen, J. & Mao, Z. CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling. Appl Intell 54, 8702–8718 (2024). https://doi.org/10.1007/s10489-024-05379-0
