An improved speech steganalysis based on VoIP using deep learning approach


Authors

1 PhD student, Malek Ashtar University of Technology, Tehran, Iran

2 Associate Professor, Malek Ashtar University of Technology, Tehran, Iran

Abstract

Today, Voice over Internet Protocol (VoIP) is widely used in real-time communication and social networks, and it has become a suitable carrier for steganography. To counter this threat, many steganalysis methods have been proposed; among them, combinations of signal processing and machine learning have achieved high detection accuracy. This paper combines speech signal processing methods with artificial intelligence algorithms. First, speech compressed with the G.729 codec is pre-processed to extract intra-frame features and inter-frame correlations with good resolution. The extracted features are then fed to a deep learning network that is trained to distinguish cover data from stego data. The results show improvements in both detection accuracy and computation time. The method was evaluated on two important steganography families, QIM and PMS, at different embedding rates. The method was also tested in real time: for 1000-millisecond files, the response time was less than 5 milliseconds, which shows the high speed of the proposed model in the execution phase.
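As background for the QIM family named in the abstract, the following is a minimal sketch of scalar quantization index modulation in the sense of Chen and Wornell. It is an illustration under stated assumptions, not the embedding scheme evaluated in the paper: the function names `qim_embed` and `qim_extract` and the step size `delta` are hypothetical, and QIM steganography in low-bit-rate codecs such as G.729 is typically applied to partitioned vector-quantization codebooks rather than to scalar samples.

```python
import numpy as np

def qim_embed(x, bits, delta=1.0):
    """Embed one bit per sample by quantizing onto one of two interleaved lattices."""
    x = np.asarray(x, dtype=float)
    bits = np.asarray(bits, dtype=int)
    # Bit 0 uses the lattice {k*delta}; bit 1 uses {k*delta + delta/2}.
    offset = bits * (delta / 2.0)
    return np.round((x - offset) / delta) * delta + offset

def qim_extract(y, delta=1.0):
    """Recover each bit by picking the lattice closest to the received sample."""
    y = np.asarray(y, dtype=float)
    half = delta / 2.0
    dist0 = np.abs(y - np.round(y / delta) * delta)
    dist1 = np.abs(y - (np.round((y - half) / delta) * delta + half))
    return (dist1 < dist0).astype(int)

# Illustrative round trip on made-up values (not data from the paper).
cover = np.array([0.12, 0.90, -0.37, 2.41])
message = np.array([0, 1, 1, 0])
stego = qim_embed(cover, message, delta=0.5)
recovered = qim_extract(stego, delta=0.5)
```

In this sketch each sample moves by at most `delta / 2`, and extraction tolerates perturbations smaller than `delta / 4`. The slight statistical shift that such quantization introduces in the codec parameters is the kind of trace a steganalysis model attempts to detect.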

