looking for [recognize]

December 15, 2007 | 5:14 pm

Hi!

"recognize" is a multiuser and multilanguage speech recognition and
segmentation system.
I’ve been working on it for a Forsythe dance project. But since it
has not yet been used in one of his projects, I do not really know
what to do politically. For the moment I have simply taken it down,
even though I spent a lot of time on it, the patch is not the piece
itself, and I’m not hired only as a maxer.
Please wait a little; it will be online soon.

Olivier.



zlp
June 22, 2008 | 8:15 am

Last year somebody was looking for Olivier Pasquet’s recognize.mxj object. I’m working right now on a project that would benefit greatly from it. So Olivier, any chance it can be released into the wild?

It’s based on the Sphinx4 Java speech recognition engine, which I could use via shell scripting or try to encapsulate in my own mxj. But I’m having a lot of trouble understanding how to add the "HUB4" models to enable a complete vocabulary (it defaults to a very small vocabulary), so if anybody has done this, please chime in. (Off-topic, but exciting!)

-Zach Poff

http://www.zachpoff.com


June 24, 2008 | 1:35 pm

Hello,

"HUB4" is a large-vocabulary trigram database with approximately
64000 English words. This means more memory will be needed to load
the language model (language_model.arpaformat.DMP).
If you are getting an "out of memory" error with MXJ, adjust the
following lines in Max5/Cycling ’74/java/max.java.config.txt :

max.jvm.option -Xms64m
max.jvm.option -Xmx256m

Nevertheless, I was experiencing problems with the French language.
Unless I am doing something wrong, I have the feeling MXJ is limited
in memory even when the max.java.config.txt configuration is changed.
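
One way to check whether the -Xmx setting is actually being picked up is to ask the JVM itself. This is only a minimal sketch: the class name HeapCheck is made up for illustration, and it is written as plain Java rather than as an mxj object. The only real API it relies on is the standard Runtime.maxMemory() call.

```java
// Hypothetical helper (not part of recognize or Sphinx4): reports the
// maximum heap size the running JVM will try to use.
public class HeapCheck {

    /** Maximum heap available to this JVM, in megabytes. */
    public static long maxHeapMegabytes() {
        // maxMemory() returns bytes; it reflects the effective -Xmx value,
        // or Long.MAX_VALUE if the JVM reports no limit.
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Calling the same Runtime method from inside an mxj class and printing the result to the Max window should show whether the value roughly matches the -Xmx line in max.java.config.txt. The reported number is often a little below the configured value, so an approximate match is what to look for.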

If you load hub4.config.xml then no grammar file is needed. A grammar
file uses a simple syntax to describe a small number of words that
must be recognized reliably. Instead, hub4.config.xml loads the unigram
language model called hub4.flat_unigram.lm, based on a vocabulary of 993
English words. This file contains "statistics" about a text and is
generated by the CMU-Cambridge Statistical Language Modeling Toolkit.
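
To see how many words a given language model really covers, the entries can be counted directly. The sketch below assumes the .lm file is in the plain-text ARPA format (a \data\ header, then a \1-grams: section with one word per line, then \end\). That is an assumption about hub4.flat_unigram.lm, not something checked here, and it will not work on a binary .DMP file. The class and method names are made up for illustration.

```java
// Hypothetical helper: counts the entries in the \1-grams: section of a
// language model given as ARPA-format text.
public class UnigramCount {

    /** Number of non-empty lines between "\1-grams:" and the next "\..." marker. */
    public static int countUnigrams(String arpaText) {
        boolean inUnigrams = false;
        int count = 0;
        for (String raw : arpaText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.equals("\\1-grams:")) {
                inUnigrams = true;       // start of the unigram section
                continue;
            }
            if (inUnigrams) {
                if (line.startsWith("\\")) {
                    break;               // next section marker or \end\
                }
                if (line.length() > 0) {
                    count++;             // one entry: log-prob, word, optional backoff
                }
            }
        }
        return count;
    }
}
```

The count includes sentence markers such as <s> and </s> if the file lists them, so it can be slightly higher than the number of real words.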

I hope this helps a little… ;)

O..////

