CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling

Abstract

Multi-label text classification (MLTC) aims to assign one or more labels to each document. Previous studies mainly use the label co-occurrence matrix computed from the training set to establish correlations between labels, but this approach ignores the noise in label co-occurrence statistics and carries label co-occurrence relationships that do not generalize into testing and validation. In addition, modeling label co-occurrence globally pays no attention to individual documents, so local label co-occurrence relationships are lost. To address these issues, we present CoocNet, a new multi-label text classification model that adopts a two-step approach to modeling label co-occurrence relations. The model first captures the global co-occurrence relationships between labels using the label co-occurrence matrix and suppresses label noise through a label denoising attention mechanism; it then uses a contrastive learning strategy to capture the local label co-occurrence relationships among specific documents. In particular, we cast co-occurrence modeling as an auxiliary training task that runs in parallel with the multi-label classification task. This auxiliary task supervises the learning of sentence representations for documents by leveraging the modeled label co-occurrence relationships, enhancing the model's generalization ability. A further novelty is that the auxiliary task is active only during training, which prevents label co-occurrence relationships from interfering with the model's predictions outside the training phase. Experimental results on three benchmark datasets (Reuters-21578, AAPD, and RCV1) demonstrate that our model outperforms existing state-of-the-art methods.
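The abstract describes two mechanisms: a global label co-occurrence matrix refined by a denoising attention step, and a contrastive auxiliary loss over documents that is used only during training. The paper's implementation is not reproduced on this page, so the following PyTorch sketch merely illustrates that general recipe; every module name, tensor shape, and the auxiliary loss weight are assumptions made for this example, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

def cooccurrence_matrix(train_labels):
    # Row-normalised label co-occurrence from a (num_docs, num_labels) multi-hot matrix.
    y = train_labels.float()
    counts = y.T @ y                                  # (L, L) raw co-occurrence counts
    counts.fill_diagonal_(0)                          # drop self co-occurrence
    return counts / counts.sum(dim=1, keepdim=True).clamp(min=1)

class CoocClassifier(nn.Module):
    # Classifier head with a global co-occurrence prior and a training-only auxiliary loss.

    def __init__(self, hidden, num_labels, cooc, tau=0.1):
        super().__init__()
        self.register_buffer("cooc", cooc)            # global co-occurrence prior, (L, L)
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden) * 0.02)
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.cls = nn.Linear(hidden, num_labels)
        self.tau = tau

    def forward(self, doc_repr, targets=None):
        # doc_repr: (B, hidden) sentence representations from any encoder, e.g. BERT [CLS].
        # Smooth each label embedding with its co-occurring neighbours; a crude
        # stand-in for the label denoising attention named in the abstract.
        labels = self.label_emb + self.cooc @ self.label_emb
        kv = labels.unsqueeze(0).expand(doc_repr.size(0), -1, -1)     # (B, L, hidden)
        ctx, _ = self.attn(doc_repr.unsqueeze(1), kv, kv)             # attend doc -> labels
        logits = self.cls(doc_repr + ctx.squeeze(1))
        if targets is None:                           # inference: auxiliary task inactive
            return logits, None
        # Auxiliary contrastive task (training only): documents sharing at least one
        # label are positives, in the spirit of supervised contrastive learning.
        z = F.normalize(doc_repr, dim=1)
        sim = z @ z.T / self.tau                      # (B, B) scaled cosine similarity
        self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        log_prob = sim - torch.logsumexp(
            sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
        pos = (targets.float() @ targets.float().T > 0).float()
        pos.masked_fill_(self_mask, 0)
        aux = -(pos * log_prob).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
        main = F.binary_cross_entropy_with_logits(logits, targets.float())
        return logits, main + 0.1 * aux.mean()        # 0.1 is an assumed weighting

At evaluation time the model would be called without targets, so the co-occurrence-driven auxiliary loss never enters prediction, mirroring the training-only design described in the abstract.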

Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities (Grant D5000220192), the National Natural Science Foundation of China under Grant 61603233, Shaanxi Natural Science Basic Research Program under Grant 2022JM-206, and Xi’an Science and Technology planning project under Grant 21RGZN0008.

Author information

Contributions

Yi Li: Conceptualization of this study, Methodology, Experiments, Writing - original draft preparation. Junge Shen: Conceptualization of this study, Funding acquisition, Manuscript revision. Zhaoyong Mao: Funding acquisition, Manuscript revision.

Corresponding author

Correspondence to Junge Shen.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Ethical and informed consent for data used

This study did not involve the collection or use of human subject data. All data used in this study were derived from publicly available literature, statistical sources, or simulated experimental results. Therefore, ethical approval and informed consent were not required for this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Shen, J. & Mao, Z. CoocNet: a novel approach to multi-label text classification with improved label co-occurrence modeling. Appl Intell 54, 8702–8718 (2024). https://doi.org/10.1007/s10489-024-05379-0
