Abstract
Multi-label text classification (MLTC) aims to assign one or more labels to each document. Previous studies mainly use a label co-occurrence matrix computed from the training set to model correlations between labels, but this approach ignores the noise in label co-occurrence statistics and carries the resulting, poorly generalizing co-occurrence relationships into validation and testing. Moreover, modeling label co-occurrence only globally pays no attention to individual documents, so local label co-occurrence relationships are lost. To address these issues, we introduce CoocNet, a new multi-label text classification model that adopts a two-step label detection scheme to model label co-occurrence relations effectively. The model first captures global label co-occurrence relationships with the label co-occurrence matrix and suppresses label noise through a label-denoising attention mechanism; it then applies a contrastive learning strategy to capture local label co-occurrence relationships between specific documents. In particular, we cast co-occurrence modeling as an auxiliary task trained in parallel with the multi-label classification task: the auxiliary task uses the modeled label co-occurrence relationships to supervise the learning of document sentence representations, enhancing the model's generalization ability. A further novelty is that the auxiliary task is active only during training, which prevents label co-occurrence relationships from interfering with the model's predictions outside the training phase. Experimental results on three benchmark datasets (Reuters-21578, AAPD, and RCV1) demonstrate that our model outperforms existing state-of-the-art methods.
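To make the described mechanism concrete, the following is a minimal PyTorch sketch, not the authors' released code: it builds a global label co-occurrence matrix from multi-hot training labels and gates a hypothetical auxiliary co-occurrence loss on the module's training flag, so that co-occurrence information shapes learning but never touches inference. All names here (cooccurrence_matrix, CoocHead, aux_weight) and the soft-target formulation of the auxiliary loss are illustrative assumptions, not the paper's exact design.

import torch
import torch.nn as nn
import torch.nn.functional as F

def cooccurrence_matrix(labels: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # labels: (num_docs, num_labels) multi-hot float tensor from the training set.
    # Returns C where C[i, j] approximates P(label j present | label i present).
    counts = labels.T @ labels                  # pairwise co-occurrence counts
    freq = counts.diagonal().clamp(min=eps)     # per-label frequencies on the diagonal
    return counts / freq.unsqueeze(1)           # row-normalize into conditional rates

class CoocHead(nn.Module):
    # Classification head with a hypothetical training-only auxiliary loss.
    def __init__(self, hidden_dim: int, num_labels: int, aux_weight: float = 0.1):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_labels)
        self.aux_weight = aux_weight

    def forward(self, doc_repr, targets=None, cooc=None):
        logits = self.classifier(doc_repr)
        loss = None
        if self.training and targets is not None:   # auxiliary branch runs only in training
            loss = F.binary_cross_entropy_with_logits(logits, targets)
            if cooc is not None:
                # Soft targets: each document's labels expanded by their global
                # co-occurrence profiles, clipped back to the valid [0, 1] range.
                soft = (targets @ cooc).clamp(max=1.0)
                loss = loss + self.aux_weight * F.binary_cross_entropy_with_logits(
                    logits, soft)
        return logits, loss  # at eval time, only the plain logits are produced

Under these assumptions, calling model.eval() before validation or testing disables the auxiliary branch entirely, mirroring the paper's claim that label co-occurrence relationships influence training but not predictions outside the training phase.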




Data Availability
The datasets generated or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the Fundamental Research Funds for the Central Universities (Grant D5000220192), the National Natural Science Foundation of China under Grant 61603233, the Shaanxi Natural Science Basic Research Program under Grant 2022JM-206, and the Xi'an Science and Technology Planning Project under Grant 21RGZN0008.
Author information
Contributions
Yi Li: conceptualization of this study, methodology, experiments, writing (original draft preparation). Junge Shen: conceptualization of this study, funding acquisition, manuscript revision. Zhaoyong Mao: funding acquisition, manuscript revision.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical and informed consent for data used
This study did not involve the collection or use of human subject data. All data used in this study were derived from publicly available literature, statistical sources, or simulated experimental results. Therefore, ethical approval and informed consent were not required for this study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Y., Shen, J. & Mao, Z. CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling. Appl Intell 54, 8702–8718 (2024). https://doi.org/10.1007/s10489-024-05379-0