Janus Recognition Toolkit

The Janus Recognition Toolkit (JRTk) is a general-purpose speech recognition toolkit developed at the Interactive Systems Labs at Carnegie Mellon University and Karlsruhe Institute of Technology. Commercial and research licenses are available.

Introduction

The Janus Recognition Toolkit (JRTk) is a general-purpose speech recognition toolkit useful for both research and application development and is part of the JANUS speech-to-speech translation system.

The JRTk provides a flexible, Tcl/Tk script-based environment that enables researchers to build state-of-the-art speech recognizers and to develop, implement, and evaluate new methods. It follows an object-oriented approach: unlike other toolkits, it is not a set of libraries and precompiled modules but a programmable shell with transparent yet efficient objects.

Since version 5, JRTk features the IBIS decoder, a one-pass decoder that is based on a re-entrant single pronunciation prefix tree and makes use of the concept of linguistic context polymorphism. It is therefore able to incorporate full linguistic knowledge at an early stage. The same engine can decode in a single pass with a statistical n-gram language model as well as with context-free grammars. The decoder can also be used to rescore lattices very efficiently.
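
To illustrate the idea of a pronunciation prefix tree mentioned above, here is a minimal Python sketch. It is not the IBIS implementation and does not use the JRTk API; the class `Node`, the functions `build_prefix_tree` and `count_nodes`, and the toy lexicon with its phone symbols are all assumptions made for illustration. It only shows how words with a common phone prefix can share nodes, which is what keeps the search space compact.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a pronunciation prefix tree (one phone per edge)."""
    children: dict = field(default_factory=dict)  # phone -> Node
    words: list = field(default_factory=list)     # words ending at this node


def build_prefix_tree(lexicon):
    """Build a tree in which words with a common phone prefix share nodes."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def count_nodes(node):
    """Count all nodes in the subtree rooted at `node`, including `node`."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


# Hypothetical toy lexicon; the phone symbols are illustrative only.
LEXICON = {
    "speech": ["S", "P", "IY", "CH"],
    "speed": ["S", "P", "IY", "D"],
    "spin": ["S", "P", "IH", "N"],
}

tree = build_prefix_tree(LEXICON)
```

In this toy lexicon the three words share the prefix "S P", and two of them also share "IY", so the tree needs fewer nodes than three separate phone sequences would.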

Features

JRTk features state-of-the-art techniques for pre-processing, acoustic modeling, and search.

Acoustic Pre-Processing

Processing of a variety of common audio formats
Flexible short-term Fourier analysis
Flexibly configurable computation of Mel-frequency cepstral coefficients (MFCCs)
Minimum variance distortionless response (MVDR) processing
LPC processing
Mean and variance normalization
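
As an illustration of the last item, mean and variance normalization can be sketched in a few lines of plain Python. This is a generic per-utterance version under simple assumptions, not JRTk code: the function name `mean_variance_normalize`, the `eps` floor, and the sample feature values are hypothetical.

```python
import math


def mean_variance_normalize(frames, eps=1e-10):
    """Normalize each feature dimension to zero mean and unit variance.

    `frames` is a list of equal-length feature vectors, one per time frame.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]
    # The eps floor avoids division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in frames
    ]


# Hypothetical two-dimensional features for four frames.
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
normalized = mean_variance_normalize(features)
```

After normalization, both dimensions have the same scale even though the raw values differ by a factor of ten.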

Acoustic Modeling

EM Training, label training, Viterbi training
Incremental growing of Gaussians
Semi-tied covariances
MMIE training, bMMIE training
Speaker adaptive training
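
To make the EM training item concrete, here is a textbook EM iteration for a one-dimensional Gaussian mixture in plain Python. It is a generic sketch, not the JRTk training procedure: the function names, the variance floor `var_floor`, the sample data, and the initial parameters are assumptions for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances, var_floor=1e-3):
    """One EM iteration for a one-dimensional Gaussian mixture."""
    k = len(weights)
    # E-step: posterior probability of each component for each sample.
    posteriors = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                 for j in range(k)]
        total = sum(joint)
        posteriors.append([p / total for p in joint])
    # M-step: re-estimate weights, means, and variances from the soft counts.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        count = sum(p[j] for p in posteriors)
        mean = sum(p[j] * x for p, x in zip(posteriors, data)) / count
        var = sum(p[j] * (x - mean) ** 2 for p, x in zip(posteriors, data)) / count
        new_w.append(count / len(data))
        new_m.append(mean)
        new_v.append(max(var, var_floor))
    return new_w, new_m, new_v


# Hypothetical data with two clusters and a rough initial guess.
data = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
history = [log_likelihood(data, *params)]
for _ in range(10):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
```

The list `history` records the log-likelihood after each iteration; as long as the variance floor is not active, EM never decreases it.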

Decoding

Single pass decoder
Flexible language model interface for n-gram language models and grammars
Lattice generation and manipulation
Lattice rescoring
Consensus decoding
Confusion network combination
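
Lattice rescoring, listed above, can be sketched as a best-path search over a word lattice with a new language model. The following plain-Python example is a simplified illustration, not the IBIS algorithm or the JRTk interface: the function `rescore_lattice`, the edge format, the toy lattice, and the bigram scores are all hypothetical. It assumes node ids are topologically ordered and that scores are log-probabilities.

```python
from collections import defaultdict


def rescore_lattice(edges, final_node, bigram_logprob, lm_weight=1.0):
    """Find the best word sequence in a lattice under a new bigram LM.

    `edges` are (start, end, word, acoustic_logprob) tuples whose node ids
    are assumed to be topologically ordered (start < end); node 0 is initial.
    """
    outgoing = defaultdict(list)
    for start, end, word, acoustic in edges:
        outgoing[start].append((end, word, acoustic))

    # best[(node, previous_word)] = (score, word sequence so far)
    best = {(0, "<s>"): (0.0, [])}
    for node in sorted({start for start, _, _, _ in edges}):
        states = [(key, val) for key, val in best.items() if key[0] == node]
        for (_, prev), (score, words) in states:
            for end, word, acoustic in outgoing[node]:
                new_score = (score + acoustic
                             + lm_weight * bigram_logprob(prev, word))
                key = (end, word)
                if key not in best or new_score > best[key][0]:
                    best[key] = (new_score, words + [word])

    finals = [val for key, val in best.items() if key[0] == final_node]
    return max(finals, key=lambda item: item[0])


# Hypothetical lattice and bigram scores (log-probabilities).
EDGES = [
    (0, 1, "recognize", -5.0),
    (0, 1, "wreck", -4.0),
    (1, 2, "speech", -3.0),
    (1, 2, "beach", -2.5),
]
BIGRAMS = {
    ("<s>", "recognize"): -1.0, ("<s>", "wreck"): -3.0,
    ("recognize", "speech"): -0.5, ("recognize", "beach"): -4.0,
    ("wreck", "speech"): -4.0, ("wreck", "beach"): -2.0,
}


def bigram_logprob(prev, word):
    """Look up a bigram score, with a fixed penalty for unseen pairs."""
    return BIGRAMS.get((prev, word), -10.0)


score, words = rescore_lattice(EDGES, 2, bigram_logprob)
```

With these toy scores, the acoustically best path is "wreck beach", but the language model shifts the overall best path to "recognize speech". Setting `lm_weight=0.0` recovers the purely acoustic result.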

Projects

The JRTk is used for speech recognition in many ongoing and past projects.

Current Projects

Lecture Translator

Past Projects

EU-BRIDGE
EVEIL-3D
TC-STAR
C-STAR
FAME
View4You
CHIL
PF-STAR
NESPOLE!
VERBMOBIL
BABEL
Quaero
SFB 588 Humanoid Robots

License Information

The JRTk is licensed by Carnegie Mellon University. Commercial as well as research licenses are available. For terms and conditions and further information, contact Prof. Alex Waibel at ahw∂cs.cmu.edu.

Publications

JRTk Articles
Title	Author	Source
JANUS 93: Towards Spontaneous Speech Translation	Monika Woszczyna, N. Aoki-Waibel, Finn Dag Buø, Noah Coccaro, Keiko Horiguchi, Thomas Kemp, Alon Lavie, Arthur McNair, Thomas Polzin, Ivica Rogina, Carolyn Rose, Tanja Schultz, Bernhard Suhm, M. Tomita, Alex Waibel	International Conference on Acoustics, Speech, and Signal Processing 1994, ICASSP 1994, Adelaide, Australia, 01. April 1994
A One-Pass Decoder Based on Polymorphic Linguistic Context Assignment	Hagen Soltau, Florian Metze, Christian Fügen, Alex Waibel	Automatic Speech Recognition and Understanding Workshop 2001, ASRU 2001, Trento, Italy, 25. October 2011