class SpeechHMM

This class implements a special case of Hidden Markov Models that can be used for small-vocabulary, connected-word speech recognition with embedded training.

Inheritance:

Object -> Machine -> GradientMachine -> Distribution -> HMM -> SpeechHMM

Public Fields

int n_models
the number of basic phoneme models
HMM** models
the basic phoneme models
DataSet* data
a dataset for initialization
EMTrainer** model_trainer
optional trainers, one per model: if an initial alignment is given and an EMTrainer is supplied for each model, they are used to train the models after k-means during reset
MeasurerList* initial_models_trainer_measurers
the measurers used by these initial model trainers
LexiconInfo* lexicon
the acceptable lexicon
Sequence* targets
the current target sequence, with start and end words/phonemes
int add_to_targets
number of words to add to the targets
bool** word_transitions
true if the given transition is a transition between words
int max_n_states
the maximum number of states in the graph (used for allocation)
int* states_to_model_states
the relation between model states and SpeechHMM states
int* states_to_model
the relation between models and SpeechHMM states
int* states_to_word
the relation between words and SpeechHMM states
bool phoneme_targets
are targets expressed in words or phonemes?

Public Methods

SpeechHMM(int n_models_, HMM** models_, LexiconInfo* lex_, EMTrainer** model_trainer_ = NULL)
To create a SpeechHMM, supply an array of n_models_ HMMs, a lexicon, and optionally one trainer per model that can be used to initialize each model independently
virtual void prepareTrainModel(Sequence* input)
this method prepares the transition graph associated with a given training sentence
virtual int addWordToModel(int word, int current_state)
this method is used by prepareTrainModel to prepare the model.
virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next)
this method is used by prepareTrainModel to prepare the model.
virtual int nStatesInWord(int word)
this method returns the number of states in a given word
virtual int nStatesInWordPronunciation(int word, int pronun)
this method returns the number of states in a given word pronunciation


Inherited from HMM:

Public Fields

int n_states
real prior_transitions
Distribution** states
Distribution** shared_states
real** transitions
real** log_transitions
real** dlog_transitions
real** transitions_acc
Sequence* log_alpha
Sequence* log_beta
Sequence* arg_viterbi
int last_arg_viterbi
Sequence* viterbi_sequence
Sequence* log_probabilities_s
bool initialize

Public Methods

virtual void printTransitions(bool real_values=false, bool transitions_only=false)
virtual void logAlpha(Sequence* inputs)
virtual void logBeta(Sequence* inputs)
virtual void logViterbi(Sequence* inputs)
virtual void decode(Sequence* input)
virtual void logProbabilities(Sequence* inputs)


Inherited from Distribution:

Public Fields

real log_probability
Sequence* log_probabilities

Public Methods

virtual real logProbability(Sequence* inputs)
virtual real viterbiLogProbability(Sequence* inputs)
virtual real frameLogProbability(int t, real* f_inputs)
virtual real viterbiFrameLogProbability(int t, real* f_inputs)
virtual void eMIterInitialize()
virtual void iterInitialize()
virtual void eMSequenceInitialize(Sequence* inputs)
virtual void sequenceInitialize(Sequence* inputs)
virtual void eMAccPosteriors(Sequence* inputs, real log_posterior)
virtual void frameEMAccPosteriors(int t, real* f_inputs, real log_posterior)
virtual void viterbiAccPosteriors(Sequence* inputs, real log_posterior)
virtual void frameViterbiAccPosteriors(int t, real* f_inputs, real log_posterior)
virtual void eMUpdate()
virtual void update()
virtual void eMForward(Sequence* inputs)
virtual void viterbiForward(Sequence* inputs)
virtual void frameBackward(int t, real* f_inputs, real* beta_, real* f_outputs, real* alpha_)
virtual void viterbiBackward(Sequence* inputs, Sequence* alpha)
virtual void frameDecision(int t, real* decision)

Public Members

Returns the decision of the distribution


Inherited from GradientMachine:

Public Fields

int n_inputs
int n_outputs
Parameters* params
Parameters* der_params
Sequence* beta

Public Methods

virtual void forward(Sequence* inputs)
virtual void backward(Sequence* inputs, Sequence* alpha)
virtual void setPartialBackprop(bool flag=true)
virtual void frameForward(int t, real* f_inputs, real* f_outputs)
virtual void loadXFile(XFile* file)
virtual void saveXFile(XFile* file)


Inherited from Machine:

Public Fields

Sequence* outputs

Public Methods

virtual void reset()
virtual void setDataSet(DataSet* dataset_)


Inherited from Object:

Public Fields

Allocator* allocator

Public Methods

void addOption(const char* name, int size, void* ptr, const char* help="")
void addIOption(const char* name, int* ptr, int init_value, const char* help="")
void addROption(const char* name, real* ptr, real init_value, const char* help="")
void addBOption(const char* name, bool* ptr, bool init_value, const char* help="")
void addOOption(const char* name, Object** ptr, Object* init_value, const char* help="")
void setOption(const char* name, void* ptr)
void setIOption(const char* name, int option)
void setROption(const char* name, real option)
void setBOption(const char* name, bool option)
void setOOption(const char* name, Object* option)
void load(const char* filename)
void save(const char* filename)
void* operator new(size_t size, Allocator* allocator_=NULL)
void* operator new(size_t size, Allocator* allocator_, void* ptr_)
void operator delete(void* ptr)


Documentation

This class implements a special case of Hidden Markov Models that can be used for small-vocabulary, connected-word speech recognition with embedded training.

It contains a set of phoneme models (represented by HMMs) and a lexicon of words (which are sequences of phonemes).

int n_models
the number of basic phoneme models

HMM** models
the basic phoneme models

DataSet* data
a dataset for initialization

EMTrainer** model_trainer
optional trainers, one per model: if an initial alignment is given and an EMTrainer is supplied for each model, they are used to train the models after k-means during reset

MeasurerList* initial_models_trainer_measurers
the measurers used by these initial model trainers

LexiconInfo* lexicon
the acceptable lexicon

Sequence* targets
the current target sequence, with start and end words/phonemes

int add_to_targets
number of words to add to the targets

bool** word_transitions
true if the given transition is a transition between words

int max_n_states
the maximum number of states in the graph (used for allocation)

int* states_to_model_states
the relation between model states and SpeechHMM states

int* states_to_model
the relation between models and SpeechHMM states

int* states_to_word
the relation between words and SpeechHMM states
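
To make the three state-mapping tables more concrete, the following is a minimal, self-contained sketch of how such tables could be filled for one sentence. It does not use the Torch classes: StateMaps, buildStateMaps, the vector-based lexicon and the per-model state counts are all hypothetical stand-ins, and each word is assumed to have a single pronunciation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container mirroring the three mapping fields (not the Torch layout).
struct StateMaps {
    std::vector<int> states_to_word;          // graph state -> word id
    std::vector<int> states_to_model;         // graph state -> phoneme model id
    std::vector<int> states_to_model_states;  // graph state -> state index inside that model
};

// lexicon[w] lists the phoneme model ids of word w (one pronunciation assumed).
// model_n_states[m] is the number of states contributed by model m (assumption).
StateMaps buildStateMaps(const std::vector<std::vector<int>>& lexicon,
                         const std::vector<int>& model_n_states,
                         const std::vector<int>& sentence) {
    StateMaps maps;
    for (int word : sentence) {
        assert(word >= 0 && word < static_cast<int>(lexicon.size()));
        for (int model : lexicon[word]) {
            assert(model >= 0 && model < static_cast<int>(model_n_states.size()));
            // Every state of this phoneme model becomes one state of the sentence graph.
            for (int s = 0; s < model_n_states[model]; ++s) {
                maps.states_to_word.push_back(word);
                maps.states_to_model.push_back(model);
                maps.states_to_model_states.push_back(s);
            }
        }
    }
    return maps;
}
```

Given a graph state index, the three tables then answer which word, which phoneme model and which state of that model the graph state came from.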

bool phoneme_targets
are targets expressed in words or phonemes?

SpeechHMM(int n_models_, HMM** models_, LexiconInfo* lex_, EMTrainer** model_trainer_ = NULL)
To create a SpeechHMM, supply an array of n_models_ HMMs, a lexicon, and optionally one trainer per model that can be used to initialize each model independently

virtual void prepareTrainModel(Sequence* input)
this method prepares the transition graph associated with a given training sentence

virtual int addWordToModel(int word, int current_state)
this method is used by prepareTrainModel to prepare the model. It adds a given word to the current graph.

virtual void addConnectionsBetweenWordsToModel(int word, int next_word, int current_state, int next_current_state, real log_n_next)
this method is used by prepareTrainModel to prepare the model. It adds the connections between words.
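
As an illustration of what connecting words in a sentence graph can look like, here is a small self-contained sketch that links the last state of one word to the first state of each possible next word and flags those arcs as word transitions. WordGraph and connectWords are hypothetical names, the [from][to] indexing is a choice made for this sketch, and the uniform split of probability among successors is an assumption; the role of log_n_next in the actual method is not described on this page.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical sentence graph; indexing is [from][to] in this sketch only.
struct WordGraph {
    int n_states;
    std::vector<std::vector<double>> log_transitions;  // -infinity means "no arc"
    std::vector<std::vector<bool>> word_transitions;   // true if the arc crosses a word boundary

    explicit WordGraph(int n)
        : n_states(n),
          log_transitions(n, std::vector<double>(
                                 n, -std::numeric_limits<double>::infinity())),
          word_transitions(n, std::vector<bool>(n, false)) {}
};

// Connect the last state of a word to the first state of each successor word,
// sharing the probability mass uniformly (an assumption of this sketch).
void connectWords(WordGraph& g, int last_state_of_word,
                  const std::vector<int>& first_states_of_next) {
    assert(last_state_of_word >= 0 && last_state_of_word < g.n_states);
    assert(!first_states_of_next.empty());
    const double log_p =
        -std::log(static_cast<double>(first_states_of_next.size()));
    for (int to : first_states_of_next) {
        assert(to >= 0 && to < g.n_states);
        g.log_transitions[last_state_of_word][to] = log_p;
        g.word_transitions[last_state_of_word][to] = true;
    }
}
```

Keeping a separate boolean table for word-boundary arcs lets later code tell within-word transitions apart from between-word transitions without inspecting the probabilities.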

virtual int nStatesInWord(int word)
this method returns the number of states in a given word

virtual int nStatesInWordPronunciation(int word, int pronun)
this method returns the number of states in a given word pronunciation
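
The two counting methods can be pictured with the following self-contained sketch. It is not the Torch implementation: the Lexicon alias and the functions countStatesInWordPronunciation and countStatesInWord are hypothetical, each pronunciation is assumed to contribute the sum of its phoneme models' state counts, and a word is assumed to contribute the sum over all of its pronunciations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// lexicon[w][p] lists the phoneme model ids of pronunciation p of word w (hypothetical layout).
using Pronunciation = std::vector<int>;
using Lexicon = std::vector<std::vector<Pronunciation>>;

// States in one pronunciation: sum of the state counts of its phoneme models (assumption).
int countStatesInWordPronunciation(const Lexicon& lexicon,
                                   const std::vector<int>& model_n_states,
                                   int word, int pronun) {
    assert(word >= 0 && word < static_cast<int>(lexicon.size()));
    assert(pronun >= 0 && pronun < static_cast<int>(lexicon[word].size()));
    int total = 0;
    for (int model : lexicon[word][pronun]) {
        assert(model >= 0 && model < static_cast<int>(model_n_states.size()));
        total += model_n_states[model];
    }
    return total;
}

// States in a word: sum over all of its pronunciations (assumption).
int countStatesInWord(const Lexicon& lexicon,
                      const std::vector<int>& model_n_states, int word) {
    assert(word >= 0 && word < static_cast<int>(lexicon.size()));
    int total = 0;
    for (std::size_t p = 0; p < lexicon[word].size(); ++p) {
        total += countStatesInWordPronunciation(lexicon, model_n_states, word,
                                                static_cast<int>(p));
    }
    return total;
}
```

Counts of this kind are what an upper bound such as max_n_states would need when buffers are allocated ahead of time.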


Direct child classes:
SimpleDecoderSpeechHMM
Author:
Samy Bengio (bengio@idiap.ch)




This page was generated with the help of DOC++.