Tuesday, October 16, 2007
Arabic Speech Recognition based on Combined Classifier
I will be publishing a paper on the recognition of isolated Arabic words using a combined classifier. The combined classifier is built from a number of Back-Propagation/LVQ neural networks with different parameters. Several feature extraction methods are examined. The results are promising and can compete with traditional HMM-based speech recognition approaches.
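The post does not say how the individual networks are combined. As a rough illustration only, and assuming the simplest possible rule (a majority vote over the labels each network outputs), the idea could be sketched like this. The function name and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often.

    Ties go to the label that appears first in the list.
    This voting rule is an assumption, not the paper's method.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical outputs of five networks for one spoken word
votes = ["wahid", "wahid", "ithnan", "wahid", "thalatha"]
word = majority_vote(votes)
```

Other combination rules, such as averaging the network output scores, would fit the same description equally well.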
Tuesday, August 07, 2007
Neural Network Fixed size input vectors
When I tried to use neural networks such as LVQ or Backpropagation, I ran into this problem: the network needs input vectors of a fixed size, but spoken words vary in length. I spent almost a month trying to find a good and easy solution.
One method is time normalization using Dynamic Time Warping (DTW). It stretches or compresses the duration of the input speech, but it may distort the signal.
You can see an example written in MATLAB here: http://www.ee.columbia.edu/~dpwe/resources/matlab/dtw/
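To give a feel for how DTW can bring an utterance to a fixed length, here is a minimal sketch in Python. It is not the MATLAB code linked above. It assumes each frame is a single number and uses the absolute difference as the local distance; a real system would use feature vectors per frame. The names `dtw` and `warp_to_reference` are made up for this example.

```python
def dtw(a, b):
    """Return (total cost, warping path) aligning sequences a and b.

    Both sequences must be non-empty lists of numbers.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    # Walk back from the end to recover which frames were matched
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

def warp_to_reference(a, reference):
    """Stretch or compress a so it has exactly len(reference) values."""
    _, path = dtw(a, reference)
    buckets = [[] for _ in reference]
    for i, j in path:
        buckets[j].append(a[i])
    # Average the frames of a that were matched to each reference frame
    return [sum(bucket) / len(bucket) for bucket in buckets]

fixed = warp_to_reference([1, 2, 3], [1, 2, 2, 3])
```

Here the output always has the length of the reference, which is what a fixed-size network input needs. The averaging step is where the distortion mentioned above can creep in.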
Another good and easy method is based on windowing techniques. For more information, see the thesis "Application of a Back-Propagation Neural Network to Isolated-Word Speech Recognition".
Monday, May 21, 2007
Window Function
Applying a window function is one of the steps in speech analysis.
The speech wave must be multiplied by an appropriate time window when we extract an N-sample interval from it, which is the starting point for most speech processing calculations.
This has two effects:
- It gradually attenuates the amplitude at both ends of the extraction interval to prevent an abrupt change at the endpoints.
- It produces the convolution of the Fourier transform of the window function with the speech spectrum.
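The second effect is the standard property that multiplication in time corresponds to convolution in frequency. Written out, with symbols that are my own choice here (x(n) for the speech samples, w(n) for the window, and X, W for their Fourier transforms):

```latex
x_w(n) = x(n)\, w(n)
\quad\Longleftrightarrow\quad
X_w(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi}
  X(e^{j\theta})\, W(e^{j(\omega - \theta)})\, d\theta
```

So the spectrum we measure is the true speech spectrum smeared by the spectrum of the window.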
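The Hamming window, which is the one usually used for speech analysis, is defined as
w(n) = 0.54 - 0.46 cos(2πn / (N - 1)), for 0 ≤ n ≤ N - 1
Other common choices are the Hanning window, w(n) = 0.5 - 0.5 cos(2πn / (N - 1)), and the rectangular window, w(n) = 1.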
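The three windows named above can be sketched in a few lines of Python, using the standard textbook definitions over n = 0 .. N-1. The function names and the toy frame are only for illustration, and N is assumed to be at least 2.

```python
import math

def hamming(N):
    """Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def hanning(N):
    """Hanning window: 0.5 - 0.5 * cos(2*pi*n / (N - 1))."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def rectangular(N):
    """Rectangular window: every sample is kept unchanged."""
    return [1.0] * N

def apply_window(frame, window):
    """Multiply a frame of speech samples by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]

# A toy frame of constant amplitude, to make the tapering visible
frame = [1.0] * 5
tapered = apply_window(frame, hamming(len(frame)))
```

With the Hamming window the ends of the frame are scaled down to 0.08 while the middle stays at 1, which is the gradual attenuation described above. The rectangular window leaves the abrupt edges in place.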
A/E Speech DataBase
Arabic/English Speech Database
This is the first step in building a speech recognition system: we need to create a database for training and testing whatever algorithms we use.
English Speech Database
After a lot of searching on the internet, I found that most speech databases are not free and have to be bought. This site: http://www.ldc.upenn.edu/ contains many speech databases, for example the TIMIT database.
But finally I found a free one here, the Speech Separation Challenge:
http://www.dcs.shef.ac.uk/~martin/SpeechSeparationChallenge.htm
The training and development sets are drawn from a closed set of 34 talkers of both genders. This document describes the corpus in more detail.
training data from individual talkers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
The last challenge: we need to segment the speech WAV files into words. I did it manually using a very efficient program, "Adobe Audition 2.0".
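I did the segmentation by hand, but for anyone who wants a rough automatic first pass, a simple energy threshold can find candidate word boundaries. The sketch below is an alternative, not the manual method described above. It assumes the samples are already loaded as floats between -1 and 1, and the function name, frame length, and threshold are all hypothetical values that would need tuning on real recordings.

```python
def segment_by_energy(samples, frame_len=160, threshold=0.01,
                      min_silence_frames=5):
    """Return (start, end) sample indices of regions louder than threshold.

    A region ends once min_silence_frames quiet frames follow it.
    """
    segments = []
    start = None       # first sample of the region being tracked
    silent_run = 0     # quiet frames seen since the last loud frame
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = f * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Region ends where the run of quiet frames began
                segments.append((start, (f - silent_run + 1) * frame_len))
                start = None
                silent_run = 0
    if start is not None:
        # Close a region still open at the end of the signal
        segments.append((start, (n_frames - silent_run) * frame_len))
    return segments

# Synthetic signal: silence, a loud "word", silence, a shorter "word", silence
signal = [0.0] * 800 + [0.5] * 1600 + [0.0] * 1600 + [0.5] * 800 + [0.0] * 800
words = segment_by_energy(signal)
```

The boundaries found this way are only a starting point and would still need checking by ear in an audio editor.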
Arabic Speech Database
Creating your own database is a big problem if you work alone and your friends do not want to help you, which is disheartening.
I am thinking of a new method for building a database without needing anybody else. ISA (insha'Allah, God willing) I will explain the new method after I make sure that it works properly.
**You can build your own Arabic speech database from recordings of the Holy Quran.