A Review of Speech Signal Enhancement Techniques
Human beings interact mainly through vocal communication, i.e. voice. This motivates research in the domain of digital speech signal processing, which is a subdomain of digital signal processing. Every signal associated with speech communication contains some noise. The purpose of speech enhancement is to improve the intelligibility and comprehensibility of the speech signal [1]. For a speech-enabled system to perform well, it needs speech signals that are free of noise, of high quality, and clear. It is very difficult to obtain speech signals without any background noise [2]. In a natural environment there is always some amount of echo. Acoustically echo-free rooms are generally used for capturing echoless speech [3]. During a study it was observed that the signals are affected by background noise, which reduces the accuracy of the system. To increase the accuracy of the system, the background noise must be filtered from the acquired speech signal. The aim of speech signal enhancement techniques is reducing background noise.

Speech enhancement has a great impact in digital speech signal processing. Many techniques perform speech signal enhancement with the help of mathematical approaches and simulation [4].

This paper presents an overview of speech enhancement algorithms used for the enhancement of digital speech signals. The paper is organized as follows: section 2 explains what is meant by speech enhancement; section 3 describes the types of noise because of which the speech signal can be degraded and

Fig 1: Basic steps of speech enhancement system [5]

3. TYPES OF NOISE AND ITS REMOVAL TECHNIQUES
This section reviews different types of noise and their removal techniques. The speech signal can be degraded by noise such as periodic noise, wide band noise, and interfering speech.

A. Periodic Noise and its Removal Techniques
Stationary filters, adaptive filters, or transform domain filters are used for removing periodic noise. The first approach is stationary: a bank of notch filters, such as twin-T filters, can be used as a comb filter for removal of periodic noise. The second is adaptive filters, in which a forward
International Journal of Computer Applications (0975 – 8887)
Volume 139 – No.14, April 2016
prediction error filter can be used as an inverse filter which will filter out periodic noise. The third is the transform domain, in which the periodic noise spectrum can be observed and manipulated. The periodic components can be identified by inspection of the spectrum.

B. Wide Band Noise and its Removal Techniques
The spectral subtraction (SS) method and adaptive cancellation are used for removal of wide band noise. In the spectral subtraction method, an estimated noise spectrum is subtracted from the spectrum of the noisy speech. With the help of adaptive cancellation, the noise correlated with the signal can be removed. The correlated signal may be obtained as the estimated channel in the absence of signal. An adaptive filter, whose impulse response must be such that the filtered channel noise matches the signal noise, may be tuned to remove noise. The coefficients are updated until the output reaches a minimum.

C. Interfering Speech and its Removal Techniques
When two speech signals interfere, speech enhancement techniques are not useful. If different pitches can be identified, the voices of different speakers can be isolated. For pitch separation to work, voiced segments must be tracked. A comb filter can be used to recover the desired speaker's harmonics, provided the pitch values are already known. A transform domain technique can also be used to isolate the voices of different speakers. Assuming that the pitch values of the speakers are known, we may find the Discrete Fourier Transform (DFT) of the mixed signal and track the harmonics of the fundamental frequencies of the two speakers. If the DFT outputs can be isolated, we simply take the IDFT of the isolated DFTs to get the individual speakers' voices [6].

4. SPEECH ENHANCEMENT METHODS
Many different methods are used for speech enhancement; some of them are described below. They can be divided into two basic categories: Single Channel Enhancement Techniques and Multi-Channel Enhancement Techniques.

a) Single Channel Enhancement Techniques
This technique is common in real-time applications such as mobile communication and hearing aids, as generally there is no second channel present. It gives limited performance, as it improves the quality of the noisy signal at the cost of some intelligibility. Compared to a multichannel system, it is easier and more cost effective. Generally this system uses different statistics of speech and unwanted noise [7].

1. Spectral Subtraction Method
It is one of the basic methods used for speech enhancement. In spectral subtraction it is assumed that a signal is formed by two additive components. The noisy speech can be expressed as

$$y(n) = s(n) + d(n) \tag{1}$$

where $n$ is time, $s(n)$ is the uncorrupted speech signal, $d(n)$ is the additive noise signal and $y(n)$ is the corrupted speech signal available for processing. The observed signal is split into overlapping frames by applying a window function, and the processing is carried out in the short-time Fourier transform (STFT) magnitude domain. In the frequency domain this can be represented as

$$Y(\omega) = S(\omega) + D(\omega) \tag{2}$$

The power spectrum of the noisy speech can be estimated as:

$$|Y(\omega)|^2 \approx |S(\omega)|^2 + \overline{|D(\omega)|^2} \tag{3}$$

where $\overline{|D(\omega)|^2}$ is the statistical average value of $|D(\omega)|^2$ during non-speech periods, so eqs. (4)-(5) give the enhanced speech signal amplitude.

$$|\hat{S}(\omega)|^2 = |Y(\omega)|^2 - \overline{|D(\omega)|^2} \tag{4}$$

$$|\hat{S}(\omega)| = \sqrt{\max\left(|Y(\omega)|^2 - \overline{|D(\omega)|^2},\ 0\right)} \tag{5}$$

This amplitude is combined with the phase of the noisy signal, $\phi_Y(\omega)$, to synthesize the signal again:

$$\hat{S}(\omega) = |\hat{S}(\omega)|\, e^{j\phi_Y(\omega)} \tag{6}$$

The inverse short-time Fourier transform is then performed to transform the signal into the time domain. Traditional spectral subtraction estimates the noise energy during non-speech periods; however, it cannot update the noise estimate during speech periods. Additionally, the method requires a voice activity detector (VAD), which may not work well at low SNR.

2. Spectral Subtraction with Over-subtraction Model (SSOM)
The SSOM procedure was introduced to reduce the musical noise effect; it lowers the perception of musical noise. This method subtracts an overestimate of the noise power spectrum and prevents the resultant spectral components from going below a preset minimum spectral floor value.

3. Non-Linear Spectral Subtraction (NSS)
This method is based on a combination of two ideas: first, the use of an extended noise and an over-subtraction model; and second, a non-linear implementation of the subtraction process, considering that the subtraction must depend on the SNR of the frame, so that less subtraction is applied at high SNRs and vice versa [8].

b) Multi-Channel Enhancement Techniques
Systems of this kind are more complex than single channel systems. They take advantage of the multiple signal inputs available to the system and use a noise reference in an adaptive noise cancellation device. By considering the spatial properties of the noise source and the signal, these systems can handle non-stationary noise better than single channel systems and can also address limitations inherent to single channel systems [9].

1. Adaptive Noise Cancellation
This is one of the most powerful speech enhancement techniques. It relies on the availability of an auxiliary channel, known as the reference path, where a correlated sample or reference of the contaminating noise is present. This reference input is filtered following an adaptive algorithm, and the output of this filtering process is subtracted from the main path, where noisy speech is present. Adaptive noise cancellation (ANC) cancels the primary unwanted noise $r(n)$ by introducing a cancelling anti-noise of equal amplitude but opposite phase, using a reference signal. The reference signal is derived from one or more sensors located near the noise and interference sources, at points where the signal of interest is weak or undetectable [10].
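As an illustration of the spectral subtraction method of this section and its over-subtraction variant, the following is a minimal single-frame sketch in Python using only the standard library. It is not taken from the reviewed papers: the function names (`dft`, `idft`, `spectral_subtract`) and the parameters `alpha` (over-subtraction factor) and `beta` (spectral floor, taken here as a fraction of the noise power) are hypothetical, and windowing, overlap-add and voice activity detection are deliberately left out.

```python
import cmath
import math
import random


def dft(frame):
    """Naive discrete Fourier transform, O(N^2); adequate for short frames."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


def spectral_subtract(noisy_frame, noise_power, alpha=1.0, beta=0.0):
    """Power spectral subtraction on one frame (illustrative sketch).

    noise_power: per-bin average noise power measured in non-speech frames.
    alpha: over-subtraction factor (assumed name); 1.0 is plain subtraction.
    beta: spectral floor as a fraction of the noise power (assumed name);
          0.0 clamps negative results to zero.
    """
    enhanced = []
    for y_k, d_pow in zip(dft(noisy_frame), noise_power):
        y_pow = abs(y_k) ** 2
        # Subtract the (possibly overestimated) noise power, then apply the floor.
        s_pow = max(y_pow - alpha * d_pow, beta * d_pow)
        # Reuse the phase of the noisy signal for resynthesis.
        enhanced.append(cmath.rect(math.sqrt(s_pow), cmath.phase(y_k)))
    return idft(enhanced)


# Usage: a sine "speech" frame plus noise, with the noise power taken
# from a separate noise-only frame (standing in for a non-speech period).
rng = random.Random(1)
size = 32
clean = [math.sin(2 * math.pi * 4 * t / size) for t in range(size)]
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]
noise_only = [rng.gauss(0.0, 0.1) for _ in range(size)]
noise_power = [abs(v) ** 2 for v in dft(noise_only)]
enhanced = spectral_subtract(noisy, noise_power, alpha=2.0, beta=0.01)
```

With `alpha=1.0` and `beta=0.0` the function reduces to plain power spectral subtraction with negative values clamped to zero; larger `alpha` with a small positive `beta` corresponds to the over-subtraction idea with a spectral floor.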
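The adaptive noise cancellation scheme described above can be sketched with a short FIR filter whose coefficients are adapted sample by sample. The reviewed text does not name a specific adaptive algorithm, so the least mean squares (LMS) update used here is an assumption, as are the function name `lms_cancel` and the parameters `num_taps` and `step`.

```python
def lms_cancel(primary, reference, num_taps=4, step=0.05):
    """Adaptive noise cancellation with an LMS-updated FIR filter (sketch).

    primary: samples from the main path (speech plus noise).
    reference: samples from the reference path (correlated with the noise).
    Returns the error signal, which serves as the enhanced output.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        # Filter the reference to estimate the noise present in the main path.
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        # Subtract the estimate; what remains is the enhanced sample.
        error = d - noise_estimate
        # LMS update: move the weights along the negative error gradient.
        weights = [w + 2 * step * error * h for w, h in zip(weights, history)]
        output.append(error)
    return output
```

When the main path contains only filtered reference noise, the error should shrink toward zero as the weights converge. When speech is present, the error is the speech plus a residual whose size depends on `step`, so a smaller step trades slower convergence for less residual noise.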