class SpeechHMM

This class implements a special case of Hidden Markov Models that can be used for small-vocabulary connected-word speech recognition, using embedded training.

Inheritance:

Public Fields
int n_models: the number of basic phoneme models
HMM** models: the basic phoneme models
DataSet* data: a dataset for initialization
EMTrainer** model_trainer: if an initial alignment and an EMTrainer for each model are given, the trainers are used to train the models after k-means during reset
MeasurerList* initial_models_trainer_measurers: the measurers used by these initial model trainers
LexiconInfo* lexicon: the acceptable lexicon
Sequence* targets: the current target sequence, with start and end words/phonemes
int add_to_targets: number of words to add
bool** word_transitions: true if the given transition is a transition between words
int max_n_states: the maximum number of states in the graph (used for allocation)
int* states_to_model_states: the relation between model states and SpeechHMM states
int* states_to_model: the relation between models and SpeechHMM states
int* states_to_word: the relation between words and SpeechHMM states
bool phoneme_targets: are targets expressed in words or phonemes?

Public Methods
SpeechHMM(int n_models_, HMM** models_, LexiconInfo* lex_, EMTrainer** model_trainer_ = NULL): in order to create a SpeechHMM, we need to give an array of n_models_ HMMs together with their corresponding names, a lexicon, an optional log_word_entrance_penalty, and an optional trainer that can be used to initialize each model independently
virtual void prepareTrainModel(Sequence* input): this method prepares the transition graph associated with a given training sentence
virtual int addWordToModel(int word, int current_state): this method is used by prepareTrainModel to prepare the model.
virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next): this method is used by prepareTrainModel to prepare the model.
virtual int nStatesInWord(int word): this method returns the number of states in a given word
virtual int nStatesInWordPronunciation(int word, int pronun): this method returns the number of states in a given word pronunciation

Inherited from HMM:

Public Fields
int n_states
real prior_transitions
Distribution** states
Distribution** shared_states
real** transitions
real** log_transitions
real** dlog_transitions
real** transitions_acc
Sequence* log_alpha
Sequence* log_beta
Sequence* arg_viterbi
int last_arg_viterbi
Sequence* viterbi_sequence
Sequence* log_probabilities_s
bool initialize

Public Methods
virtual void printTransitions(bool real_values=false, bool transitions_only=false)
virtual void logAlpha(Sequence* inputs)
virtual void logBeta(Sequence* inputs)
virtual void logViterbi(Sequence* inputs)
virtual void decode(Sequence* input)
virtual void logProbabilities(Sequence* inputs)

Inherited from Distribution:

Public Fields
real log_probability
Sequence* log_probabilities

Public Methods
virtual real logProbability(Sequence* inputs)
virtual real viterbiLogProbability(Sequence* inputs)
virtual real frameLogProbability(int t, real* f_inputs)
virtual real viterbiFrameLogProbability(int t, real* f_inputs)
virtual void eMIterInitialize()
virtual void iterInitialize()
virtual void eMSequenceInitialize(Sequence* inputs)
virtual void sequenceInitialize(Sequence* inputs)
virtual void eMAccPosteriors(Sequence* inputs, real log_posterior)
virtual void frameEMAccPosteriors(int t, real* f_inputs, real log_posterior)
virtual void viterbiAccPosteriors(Sequence* inputs, real log_posterior)
virtual void frameViterbiAccPosteriors(int t, real* f_inputs, real log_posterior)
virtual void eMUpdate()
virtual void update()
virtual void eMForward(Sequence* inputs)
virtual void viterbiForward(Sequence* inputs)
virtual void frameBackward(int t, real* f_inputs, real* beta_, real* f_outputs, real* alpha_)
virtual void viterbiBackward(Sequence* inputs, Sequence* alpha)
virtual void frameDecision(int t, real* decision)

Public Members
Returns the decision of the distribution

Inherited from GradientMachine:

Public Fields
int n_inputs
int n_outputs
Parameters* params
Parameters* der_params
Sequence* beta

Public Methods
virtual void forward(Sequence* inputs)
virtual void backward(Sequence* inputs, Sequence* alpha)
virtual void setPartialBackprop(bool flag=true)
virtual void frameForward(int t, real* f_inputs, real* f_outputs)
virtual void loadXFile(XFile* file)
virtual void saveXFile(XFile* file)

Inherited from Machine:

Public Fields
Sequence* outputs

Public Methods
virtual void reset()
virtual void setDataSet(DataSet* dataset_)

Inherited from Object:

Public Fields
Allocator* allocator

Public Methods
void addOption(const char* name, int size, void* ptr, const char* help="")
void addIOption(const char* name, int* ptr, int init_value, const char* help="")
void addROption(const char* name, real* ptr, real init_value, const char* help="")
void addBOption(const char* name, bool* ptr, bool init_value, const char* help="")
void addOOption(const char* name, Object** ptr, Object* init_value, const char* help="")
void setOption(const char* name, void* ptr)
void setIOption(const char* name, int option)
void setROption(const char* name, real option)
void setBOption(const char* name, bool option)
void setOOption(const char* name, Object* option)
void load(const char* filename)
void save(const char* filename)
void* operator new(size_t size, Allocator* allocator_=NULL)
void* operator new(size_t size, Allocator* allocator_, void* ptr_)
void operator delete(void* ptr)

Documentation

This class implements a special case of Hidden Markov Models that can be used for small-vocabulary connected-word speech recognition, using embedded training.
It contains a set of phoneme models (represented by HMMs) and a lexicon of words (which are sequences of phonemes).

int n_models

the number of basic phoneme models

HMM** models

the basic phoneme models

DataSet* data

a dataset for initialization

EMTrainer** model_trainer

if an initial alignment and an EMTrainer for each model are given, the trainers are used to train the models after k-means during reset

MeasurerList* initial_models_trainer_measurers

the measurers used by the initial model trainers (see model_trainer)

LexiconInfo* lexicon

the acceptable lexicon

Sequence* targets

the current target sequence, with start and end words/phonemes

int add_to_targets

number of words to add

bool** word_transitions

true if the given transition is a transition between words

int max_n_states

the maximum number of states in the graph (used for allocation)

int* states_to_model_states

the relation between model states and SpeechHMM states

int* states_to_model

the relation between models and SpeechHMM states

int* states_to_word

the relation between words and SpeechHMM states
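The three mapping arrays (states_to_model_states, states_to_model, states_to_word) can be read as parallel lookup tables indexed by SpeechHMM state. The following is a minimal sketch of that idea, not the Torch implementation: the struct StateMap, the function buildStateMap, and the assumption that every word is a plain concatenation of phoneme models with a known number of states are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the three parallel arrays of SpeechHMM.
// Entry i of each vector describes graph state i.
struct StateMap {
    std::vector<int> states_to_word;         // word index of the state
    std::vector<int> states_to_model;        // phoneme model index of the state
    std::vector<int> states_to_model_states; // state index inside that phoneme model
};

// words[w] lists the phoneme model indices that make up word w.
// model_n_states[m] is the number of states contributed by model m.
StateMap buildStateMap(const std::vector<std::vector<int>>& words,
                       const std::vector<int>& model_n_states) {
    StateMap map;
    for (std::size_t w = 0; w < words.size(); ++w) {
        for (int m : words[w]) {
            for (int s = 0; s < model_n_states[m]; ++s) {
                map.states_to_word.push_back(static_cast<int>(w));
                map.states_to_model.push_back(m);
                map.states_to_model_states.push_back(s);
            }
        }
    }
    return map;
}
```

With such tables, any state of the sentence graph can be traced back to the word, the phoneme model, and the model state it came from with three array reads.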

bool phoneme_targets

are targets expressed in words or phonemes?

SpeechHMM(int n_models_, HMM** models_, LexiconInfo* lex_, EMTrainer** model_trainer_ = NULL)

In order to create a SpeechHMM, we need to give an array of n_models_ HMMs together with their corresponding names, a lexicon, an optional log_word_entrance_penalty, and an optional trainer that can be used to initialize each model independently

virtual void prepareTrainModel(Sequence* input)

this method prepares the transition graph associated with a given training sentence

virtual int addWordToModel(int word, int current_state)

this method is used by prepareTrainModel to prepare the model. It adds a given word to the current graph.

virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next)

this method is used by prepareTrainModel to prepare the model. It adds the connections between words.
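The graph construction described for prepareTrainModel, addWordToModel and addConnectionsBetweenWordsToModel can be pictured with a small self-contained sketch. It is only an illustration under simplifying assumptions: SentenceGraph, addWordChain, connectWordChains and buildSentenceGraph are hypothetical names, each word is reduced to a left-to-right chain of at least one state, transitions are booleans rather than log probabilities, and the log_n_next argument of the real method is ignored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sentence graph. transitions[i][j] is true if state i may move
// to state j; word_transitions[i][j] marks transitions that cross a word boundary.
struct SentenceGraph {
    int n_states = 0;
    std::vector<std::vector<bool>> transitions;
    std::vector<std::vector<bool>> word_transitions;
};

// Adds a left-to-right chain of n_word_states states (with self-loops)
// starting at current_state. Returns the index of the next free state.
int addWordChain(SentenceGraph& g, int n_word_states, int current_state) {
    for (int s = 0; s < n_word_states; ++s) {
        int i = current_state + s;
        g.transitions[i][i] = true;
        if (s + 1 < n_word_states) g.transitions[i][i + 1] = true;
    }
    return current_state + n_word_states;
}

// Links the last state of one word to the first state of the next word
// and records that this transition is a between-word transition.
void connectWordChains(SentenceGraph& g, int last_state, int next_first_state) {
    g.transitions[last_state][next_first_state] = true;
    g.word_transitions[last_state][next_first_state] = true;
}

// Builds the graph for a sentence given the number of states of each word.
// Assumes every word has at least one state.
SentenceGraph buildSentenceGraph(const std::vector<int>& word_n_states) {
    int total = 0;
    for (int n : word_n_states) total += n;
    SentenceGraph g;
    g.n_states = total;
    g.transitions.assign(total, std::vector<bool>(total, false));
    g.word_transitions.assign(total, std::vector<bool>(total, false));
    int current = 0;
    for (std::size_t w = 0; w < word_n_states.size(); ++w) {
        int next = addWordChain(g, word_n_states[w], current);
        if (w + 1 < word_n_states.size()) connectWordChains(g, next - 1, next);
        current = next;
    }
    return g;
}
```

Keeping the between-word transitions in a separate table, as word_transitions does, lets later code treat word boundaries differently from within-word transitions without re-deriving where the boundaries are.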

virtual int nStatesInWord(int word)

this method returns the number of states in a given word

virtual int nStatesInWordPronunciation(int word, int pronun)

this method returns the number of states in a given word pronunciation
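As a rough illustration of how such counts could be derived from a lexicon, the sketch below sums the states of the phoneme models in one pronunciation, and then sums over all pronunciations of a word. This is an assumption, not the documented behaviour: LexiconWord, countStatesInPronunciation and countStatesInWord are hypothetical names, and the real methods may count states differently (for example by excluding non-emitting states).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical lexicon entry: each pronunciation is a sequence of
// phoneme model indices.
struct LexiconWord {
    std::vector<std::vector<int>> pronunciations;
};

// Sums the states of every phoneme model used by one pronunciation.
// model_n_states[m] is the number of states contributed by model m.
int countStatesInPronunciation(const LexiconWord& word, int pronun,
                               const std::vector<int>& model_n_states) {
    int n = 0;
    for (int m : word.pronunciations[pronun]) n += model_n_states[m];
    return n;
}

// Sums the state counts of all pronunciations of the word
// (assumption: alternative pronunciations each get their own states).
int countStatesInWord(const LexiconWord& word,
                      const std::vector<int>& model_n_states) {
    int n = 0;
    for (std::size_t p = 0; p < word.pronunciations.size(); ++p)
        n += countStatesInPronunciation(word, static_cast<int>(p), model_n_states);
    return n;
}
```

Counts of this kind are what an allocation bound such as max_n_states would need to cover before the graph for a sentence is built.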

Direct child classes: SimpleDecoderSpeechHMM


Author: Samy Bengio (bengio@idiap.ch)

This page was generated with the help of DOC++.
