research-article

Open access

Topic-aware Neural Linguistic Steganography Based on Knowledge Graphs

Authors:

Zhongliang Yang,

Ru Zhang

ACM/IMS Transactions on Data Science, Volume 2, Issue 2

Article No.: 10, Pages 1 - 13

https://doi.org/10.1145/3418598

Published: 08 April 2021

Abstract

The core challenge of steganography has always been how to improve both hiding capacity and concealment. Most current generation-based linguistic steganography methods consider only the probability distribution over text characters, leaving the emotion and topic of the generated steganographic text uncontrollable. For long texts in particular, generating several sentences that relate to one topic and show overall coherence and discourse-relatedness can ensure better concealment. In this article, we address the problem of generating coherent multi-sentence texts for better concealment and present a topic-aware neural linguistic steganography method that can generate a steganographic paragraph on a specific topic. We achieve topic-controllable generation of long steganographic text by encoding the related entities and their relationships from Knowledge Graphs. Experimental results show that the proposed method can guarantee both the quality of the generated steganographic text and its relevance to a specific topic. The proposed model can be widely used in covert communication, privacy protection, and many other areas of information security.
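To make the idea of generation-based linguistic steganography concrete, the following is a minimal sketch of one common embedding scheme, fixed-length coding over the top-ranked candidate words. It is an illustration only and not the method of this article, which additionally conditions generation on knowledge-graph entities. The vocabulary `VOCAB`, the ranking function `ranked_candidates` (a deterministic stand-in for a neural language model), and the functions `embed` and `extract` are all hypothetical names introduced here.

```python
from typing import List

# Hypothetical toy vocabulary; a real system would use a model's full vocabulary.
VOCAB = ["the", "graph", "encodes", "topic", "entities", "and", "relations", "text"]


def ranked_candidates(prev: str) -> List[str]:
    """Stand-in for a language model: return the vocabulary ranked for `prev`.

    The ranking is just a rotation of VOCAB, chosen so that sender and
    receiver can reproduce it exactly. This is an assumption for illustration.
    """
    shift = VOCAB.index(prev) + 1 if prev in VOCAB else 0
    return VOCAB[shift:] + VOCAB[:shift]


def embed(bits: str, k: int = 2, start: str = "<s>") -> List[str]:
    """Hide `bits` by picking, at each step, one of the top 2**k candidates."""
    if len(bits) % k != 0:
        raise ValueError("bit string length must be a multiple of k")
    if 2 ** k > len(VOCAB):
        raise ValueError("k is too large for the toy vocabulary")
    words: List[str] = []
    prev = start
    for i in range(0, len(bits), k):
        pool = ranked_candidates(prev)[: 2 ** k]
        word = pool[int(bits[i:i + k], 2)]  # the k-bit chunk selects the word
        words.append(word)
        prev = word
    return words


def extract(words: List[str], k: int = 2, start: str = "<s>") -> str:
    """Recover the bits by replaying the same ranking on the receiver side."""
    chunks: List[str] = []
    prev = start
    for word in words:
        pool = ranked_candidates(prev)[: 2 ** k]
        chunks.append(format(pool.index(word), f"0{k}b"))
        prev = word
    return "".join(chunks)


if __name__ == "__main__":
    secret = "10110001"
    stego = embed(secret, k=2)
    print(stego)
    print(extract(stego, k=2) == secret)
```

Each generated word carries `k` bits, so a larger `k` raises capacity but forces the sender to pick less probable words. Because the choice depends only on local word probabilities, nothing in this scheme constrains the topic of the output, which is the gap the abstract describes.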

Index Terms

1. Security and privacy
  1. Cryptography
  2. Human and societal aspects of security and privacy
    1. Privacy protections

Published In

ACM/IMS Transactions on Data Science Volume 2, Issue 2

May 2021

149 pages

ISSN:2691-1922

DOI:10.1145/3454114

Editor:
Beng Chin Ooi
National University of Singapore, Singapore

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 April 2021

Accepted: 01 August 2020

Revised: 01 June 2020

Received: 01 February 2020

Published in TDS Volume 2, Issue 2

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Natural Science Foundation of Hubei Province
