COMPARATIVE ANALYSIS OF VOICE ACTIVITY DETECTION METHODS IN SPEECH SIGNAL PROCESSING

Authors

  • Kamoliddin Shukurov
  • Umidjon Khasanov
  • Shokhrukhmirzo Kholdorov

Keywords:

speech signal, VAD, energy-based approach, spectral features, statistical modeling, deep learning, filtering.

Abstract

This paper analyzes Voice Activity Detection (VAD) methods, which play a crucial role in speech signal processing. The main objective of a VAD algorithm is to distinguish speech segments from silence and background noise. The paper discusses four families of approaches: energy-based methods, spectral feature-based algorithms, statistical modeling techniques, and modern machine learning models, particularly deep neural networks. The advantages and limitations of each approach are addressed, with special emphasis on practical applicability. In our experiments, signal filtering and noise suppression were applied before VAD; under these conditions, energy-based approaches proved effective and reliable.
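To illustrate the energy-based family mentioned in the abstract, the following is a minimal sketch of frame-level thresholding on short-time energy. It is not the authors' implementation. The function names, the 160-sample frame length, the 8 kHz sampling rate, and the fixed -30 dB threshold are all assumptions chosen for illustration, and the input is assumed to have already been filtered and denoised.

```python
import math

def frame_energies_db(samples, frame_len):
    """Return the short-time energy (in dB) of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        # A small constant avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(mean_sq + 1e-12))
    return energies

def energy_vad(samples, frame_len=160, threshold_db=-30.0):
    """Mark a frame as speech when its energy exceeds a fixed threshold.

    frame_len and threshold_db are illustrative assumptions, not values
    taken from the paper.
    """
    return [e > threshold_db for e in frame_energies_db(samples, frame_len)]

# Synthetic input at an assumed 8 kHz rate: low-level hum, a louder tone
# standing in for speech, then low-level hum again (3 frames each).
sr = 8000
quiet = [0.001 * math.sin(2 * math.pi * 50 * n / sr) for n in range(480)]
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(480)]
signal = quiet + tone + quiet

decisions = energy_vad(signal)
print(decisions)
```

A fixed threshold works here only because the background level is low and stable, which mirrors the condition stated in the abstract (filtering and noise suppression before VAD). With unfiltered noisy input, an adaptive threshold or one of the other approaches would usually be needed.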

Published

2025-09-15