The Janus Recognition Toolkit (JRTk) is a general-purpose speech recognition toolkit developed at the Interactive Systems Labs at Carnegie Mellon University and the Karlsruhe Institute of Technology. Commercial and research licenses are available.
Introduction
The Janus Recognition Toolkit (JRTk) is a general-purpose speech recognition toolkit useful for both research and application development and is part of the JANUS speech-to-speech translation system.
The JRTk provides a flexible Tcl/Tk script-based environment that enables researchers to build state-of-the-art speech recognizers and to develop, implement, and evaluate new methods. It takes an object-oriented approach: unlike other toolkits, it is not a set of libraries and precompiled modules but a programmable shell with transparent yet efficient objects.
Since version 5, JRTk has featured the IBIS decoder, a one-pass decoder that is based on a re-entrant single pronunciation prefix tree and makes use of the concept of linguistic context polymorphism. It is therefore able to incorporate full linguistic knowledge at an early stage. The same engine can decode in one pass with a statistical n-gram language model as well as with context-free grammars. The decoder can also be used to rescore lattices very efficiently.
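To make the idea of a single pronunciation prefix tree more concrete, the following Python sketch builds a toy tree in which words that start with the same phones share nodes. The lexicon, the phone symbols, and the function names are illustrative assumptions; this is not JRTk code and does not reproduce the IBIS data structures.

```python
def build_prefix_tree(lexicon):
    """Build a pronunciation prefix tree from {word: [phones]}.

    Each node is a dict with 'children' (phone -> node) and 'words'
    (the words whose pronunciation ends at that node).
    """
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            # Reuse the existing child for a shared prefix, else create one.
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        node["words"].append(word)
    return root


def count_nodes(node):
    """Count tree nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node["children"].values())


# Hypothetical toy lexicon; the phone symbols are made up for illustration.
LEXICON = {
    "speech": ["S", "P", "IY", "CH"],
    "speed": ["S", "P", "IY", "D"],
    "spin": ["S", "P", "IH", "N"],
}

tree = build_prefix_tree(LEXICON)
total_phones = sum(len(phones) for phones in LEXICON.values())
print(count_nodes(tree), "tree nodes for", total_phones, "phones in the lexicon")
```

Because the three words share the prefix "S P", the tree needs fewer nodes than the lexicon has phones, which is the property that lets a decoder evaluate shared word beginnings only once.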
Features
JRTk features state-of-the-art techniques for pre-processing, acoustic modeling, and search.
Acoustic Pre-Processing
- Processing of various common audio formats
- Flexible short-term Fourier analysis
- Flexibly configurable calculation of Mel-frequency scaled cepstral coefficients
- Minimum variance distortionless response (MVDR) processing
- LPC processing
- Mean and variance normalization
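As an illustration of the last item in the list above, here is a minimal Python sketch of per-utterance mean and variance normalization. The function name and the example frames are assumptions made for this sketch; it does not show how JRTk implements or configures the step.

```python
import math


def mean_variance_normalize(frames):
    """Normalize each feature dimension to zero mean and unit variance.

    `frames` is a list of equal-length feature vectors, one per frame.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        # A constant dimension has zero variance; divide by 1 instead of 0.
        stds.append(math.sqrt(var) if var > 0 else 1.0)
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]


# Made-up two-dimensional feature frames for a single utterance.
frames = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
normalized = mean_variance_normalize(frames)
print(normalized)
```

After normalization, every dimension has zero mean over the utterance, which reduces the effect of channel and speaker offsets on the features.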
Acoustic Modeling
- EM training, label training, Viterbi training
- Incremental growing of Gaussians
- Semi-tied covariances
- MMIE training, bMMIE training
- Speaker adaptive training
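For readers unfamiliar with EM training, the sketch below runs expectation-maximization on a one-dimensional Gaussian mixture in plain Python. It is a textbook-style illustration under simplifying assumptions (scalar data, a fixed number of components, a simple variance floor); the function names, data, and floor value are invented here and are not taken from JRTk.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_step(data, weights, means, variances, var_floor=1e-3):
    """Run one EM iteration for a 1-D Gaussian mixture.

    Returns the updated weights, means, and variances, plus the
    log-likelihood of the data under the parameters passed in.
    """
    k = len(weights)
    # E-step: posterior probability of each component for each sample.
    posteriors = []
    log_likelihood = 0.0
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        log_likelihood += math.log(total)
        posteriors.append([p / total for p in joint])
    # M-step: re-estimate the parameters from the weighted statistics.
    new_weights, new_means, new_variances = [], [], []
    for j in range(k):
        occupancy = sum(p[j] for p in posteriors)
        mean = sum(p[j] * x for p, x in zip(posteriors, data)) / occupancy
        var = sum(p[j] * (x - mean) ** 2 for p, x in zip(posteriors, data)) / occupancy
        new_weights.append(occupancy / len(data))
        new_means.append(mean)
        new_variances.append(max(var, var_floor))  # keep variances positive
    return new_weights, new_means, new_variances, log_likelihood


# Made-up scalar data with two obvious clusters.
data = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
log_likelihoods = []
for _ in range(10):
    weights, means, variances, ll = em_step(data, weights, means, variances)
    log_likelihoods.append(ll)
print("means:", means)
print("log-likelihood per iteration:", log_likelihoods)
```

The log-likelihood recorded at each iteration should not decrease, which is the usual sanity check for an EM implementation.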
Decoding
- Single pass decoder
- Flexible language model interface for n-gram language models and grammars
- Lattice generation and manipulation
- Lattice rescoring
- Consensus decoding
- Confusion network combination
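The last two items in the list above can be illustrated with a deliberately simplified Python sketch. It assumes that confusion networks have already been built and that the slots of the two networks are already aligned, which a real system has to work out first; the data, the `<eps>` symbol, and the function names are hypothetical and do not reflect the JRTk interface.

```python
EPS = "<eps>"  # hypothetical symbol for "no word in this slot"


def consensus_hypothesis(network):
    """Pick the highest-posterior entry in each slot, dropping empty winners."""
    words = []
    for slot in network:
        best = max(slot, key=slot.get)
        if best != EPS:
            words.append(best)
    return words


def combine_networks(net_a, net_b, weight_a=0.5):
    """Interpolate the word posteriors of two slot-aligned networks."""
    combined = []
    for slot_a, slot_b in zip(net_a, net_b):
        vocabulary = set(slot_a) | set(slot_b)
        combined.append({
            word: weight_a * slot_a.get(word, 0.0)
            + (1.0 - weight_a) * slot_b.get(word, 0.0)
            for word in vocabulary
        })
    return combined


# Made-up word posteriors from two hypothetical recognizers.
net_a = [{"i": 0.9, EPS: 0.1}, {"recognize": 0.4, "wreck": 0.6}, {"speech": 0.7, "beach": 0.3}]
net_b = [{"i": 0.8, EPS: 0.2}, {"recognize": 0.9, "wreck": 0.1}, {"speech": 0.6, "beach": 0.4}]

print(consensus_hypothesis(net_a))
print(consensus_hypothesis(combine_networks(net_a, net_b)))
```

In this toy example the first system alone prefers "wreck" in the second slot, while the combined network prefers "recognize", which shows how combining systems can correct an individual error.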
The JRTk is used for speech recognition in many ongoing projects and has been used in many past ones.
Current Projects
Past Projects
The JRTk is licensed by Carnegie Mellon University. Commercial as well as research licenses are available. Terms and conditions, as well as further information, can be obtained by contacting Prof. Alex Waibel at ahw∂cs.cmu.edu.
JRTk Articles
| Title | Author | Source |
| --- | --- | --- |
| A One Pass-Decoder Based On Polymorphic Linguistic Context Assignment | Hagen Soltau, Florian Metze, Christian Fügen, Alex Waibel | Automatic Speech Recognition and Understanding Workshop 2001, ASRU 2001, Trento, Italy, 25. October 2011 |
| Recognition Of Conversational Telephone Speech Using The Janus Speech Engine | Torsten Zeppenfeld, Michael Finke, Klaus Ries, Martin Westphal, Alex Waibel | International Conference on Acoustics, Speech, and Signal Processing 1997, ICASSP 1997, Munich, Germany, 01. April 1997 |
| JANUS III: Speech-To-Speech Translation In Multiple Languages | Alon Lavie, Alex Waibel, Lori Levin, Michael Finke, Donna Gates, Marsal Gavalda, Torsten Zeppenfeld, Puming Zhan | International Conference on Acoustics, Speech, and Signal Processing 1997, ICASSP 1997, Munich, Germany, 01. April 1997 |
| JANUS II Translation Of Spontaneous Conversational Speech | Alex Waibel, Michael Finke, Donna Gates, Marsal Gavalda, Thomas Kemp, Alon Lavie, Lori Levin, Uwe Meier, Laura Tomokiyo, Arthur McNair, Ivica Rogina, Kaori Shima, Tilo Sloboda, Monika Woszczyna, Torsten Zeppenfeld, Puming Zhan | IEEE International Conference On Acoustics, Speech And Signal Processing 1996, ICASSP 1996, Atlanta, USA, 01. May 1996 |
| End-To-End Evaluation In Janus: A Speech-to-Speech Translation System | Donna Gates, Alon Lavie, Lori Levin, Alex Waibel, Marsal Gavalda, Laura Tomokiyo, Monika Woszczyna, Puming Zhan | 12th European Conference on Artificial Intelligence, ECAI 1996, Budapest, Hungary, 01. August 1996 |
| Translation Of Conversational Speech With Janus-II | Alon Lavie, Alex Waibel, Lori Levin, Donna Gates, Marsal Gavalda, Torsten Zeppenfeld, Puming Zhan, Oren Glickman | 4th International Conference on Spoken Language Processing 1996, ICSLP 1996, Philadelphia, USA, 01. October 1996 |
| JANUS II: Towards Spontaneous Spanish Speech Recognition | Puming Zhan, Klaus Ries, Marsal Gavalda, Donna Gates, Alon Lavie, Alex Waibel | 4th International Conference on Spoken Language Processing 1996, ICSLP 1996, Philadelphia, USA, 01. October 1996 |
| JANUS II: Towards Multi-Lingual Spoken Language Translation | Bernhard Suhm, Petra Geutner, Thomas Kemp, Alon Lavie, Laura Tomokiyo, Arthur McNair, Ivica Rogina, Tanja Schultz, Tilo Sloboda, Wayne Ward, Monika Woszczyna, Alex Waibel | 01. January 1995 |
| JANUS 93: Towards Spontaneous Speech Translation | Monika Woszczyna, N. Aoki-Waibel, Finn Dag Buø, Noah Coccaro, Keiko Horiguchi, Thomas Kemp, Alon Lavie, Arthur McNair, Thomas Polzin, Ivica Rogina, Carolyn Rose, Tanja Schultz, Bernhard Suhm, M. Tomita, Alex Waibel | International Conference on Acoustics, Speech, and Signal Processing 1994, ICASSP 1994, Adelaide, Australia, 01. April 1994 |
| Recent Advances In Janus: A Speech Translation System | Thomas Polzin, Noah Coccaro, N. Aoki-Waibel, Monika Woszczyna, M. Tomita, J. Tsutsumi, Ivica Rogina, Carolyn Rose, Alex Waibel, Arthur McNair, Alon Lavie, A. Eisele, Tilo Sloboda, Wayne Ward | European Conference on Speech Communication and Technology 1993, Eurospeech 1993, Berlin, Germany, 26. January 1993 |
| Testing Generality In Janus: A Multi-Lingual Speech Translation System | Louise Osterholtz, Joe Tebelskis, Ivica Rogina, Hiroaki Saito, Charles Augustine, Arthur McNair, Alex Waibel, Monika Woszczyna, Tilo Sloboda | IEEE International Conference on Acoustics, Speech, and Signal Processing 1992, ICASSP 1992, San Francisco, USA, 26. January 1992 |