Training Context-Dependent Distributions

The polyphone collection process can easily produce hundreds of thousands, or even millions, of polyphones. It is clearly not feasible to use a fully continuous HMM for each of them. Eventually we will want to use continuous-density HMMs, but for a smaller number of models. It is, however, possible to train hundreds of thousands of mixture-weight distributions, which can then be clustered into fewer distributions afterwards. This is what we are going to do in this step. As usual, the complete script can be found in the scripts thread.

The startup looks a bit different from the startups we have had so far, because we now have to incorporate the ptrees:

[FeatureSet fs] setDesc @/home/islpra0/IslData/featDesc 
               fs setAccess @/home/islpra0/IslData/featAccess

[CodebookSet cbs fs]     read ../step8/codebookSet 
[DistribSet dss cbs]       read ../step10/distribSet 
[PhonesSet ps]          read ../step2/phonesSet 
[Tags tags]            read ../step2/tags 
Tree dst ps:PHONES ps tags dss 
       dst.ptreeSet read ../step10/ptreeSet 
       dst read ../step10/distribTree 

SenoneSet sns [DistribStream str dss dst] 

[TmSet tms]              read ../step2/transitionModels 
[TopoSet tps sns tms]         read ../step2/topologies 
[Tree tpt ps:PHONES ps tags tps]        read ../step2/topologyTree 
[DBase db] open ../step1/db.dat ../step1/db.idx -mode r 
[Dictionary diction ps:PHONES tags] read ../step4/convertedDict 

fs FMatrix LDAMatrix
fs:LDAMatrix.data bload ../step5/ldaMatrix

AModelSet amo tpt ROOT 
HMM hmm diction amo 
Path path

We now load the latest codebook weights. Remember that we don't have any distribution weights yet. We could load the context-independent distribution weights and initialize every context-dependent distribution with its corresponding context-independent distribution, but experiments have shown that this is not necessary. It is fine not to load any distribution weights and thus start with uniform values (i.e. every distribution value will be 1/16). Besides, the only reason we are training these distributions is to cluster them later into a smaller number of distributions, which will have to be trained anew anyway:

cbs load ../step9/codebookWeights.3 
cbs createAccus 
dss createAccus
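
The uniform starting point described above can be pictured with a small sketch in plain Tcl. It does not call JANUS at all, and the proc name uniformWeights is made up for this illustration. It assumes 16 mixture components per distribution, which is what the value 1/16 suggests:

```tcl
# Toy illustration in plain Tcl; it neither uses nor requires JANUS.
# uniformWeights is a hypothetical helper name, not a JANUS command.
proc uniformWeights {n} {
    set weights {}
    for {set i 0} {$i < $n} {incr i} {
        # Every component gets the same share of the probability mass.
        lappend weights [expr {1.0 / $n}]
    }
    return $weights
}

# Assumption: 16 components per distribution, as implied by "1/16".
set w [uniformWeights 16]
puts [lindex $w 0]   ;# prints 0.0625
```

Starting every distribution from such a flat vector means that no context-independent weights have to be copied into the context-dependent distributions first.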

We use the same Tcl procedure "forcedAlignment" that we used last time when training along labels, together with a "regular" training loop. Feel free to test the resulting system using this script.
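
To give a rough idea of what accumulating and then updating mixture weights means in such a loop, here is a toy sketch in plain Tcl. It is not how JANUS implements its accumulators; the procs accumulate and updateWeights, the four-component distribution and the per-frame posteriors are all invented for the illustration:

```tcl
# Toy sketch in plain Tcl; none of these procs are JANUS commands.

# Add one frame's component posteriors (gamma) to the running counts (accu).
proc accumulate {accu gamma} {
    set out {}
    foreach a $accu g $gamma {
        lappend out [expr {$a + $g}]
    }
    return $out
}

# Turn accumulated counts into mixture weights by normalizing them.
proc updateWeights {accu} {
    set n [llength $accu]
    set total 0.0
    foreach a $accu { set total [expr {$total + $a}] }
    set out {}
    foreach a $accu {
        if {$total > 0.0} {
            lappend out [expr {$a / $total}]
        } else {
            # No data seen for this distribution: keep uniform weights.
            lappend out [expr {1.0 / $n}]
        }
    }
    return $out
}

# Hypothetical posteriors for three frames aligned to one distribution.
set accu {0.0 0.0 0.0 0.0}
foreach gamma {{0.5 0.5 0.0 0.0} {1.0 0.0 0.0 0.0} {0.5 0.0 0.5 0.0}} {
    set accu [accumulate $accu $gamma]
}
set weights [updateWeights $accu]
puts $weights
```

A distribution that never receives any frames simply stays uniform in this sketch, which matters when there are hundreds of thousands of distributions and many of them see very little data.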