looking for [recognize]


    Dec 15 2007 | 5:14 pm
    Hi!
    "recognize" is a multiuser and multilanguage speech recognition and
    segmentation system.
    I developed it for a Forsythe dance project. But since it has not
    yet been used in one of his pieces, I do not really know what to do
    politically. For the moment I have simply removed it, even though I
    spent a lot of time on it, the patch alone does not make the piece,
    and I was not hired only as a Max programmer.
    Please wait a little; it will be online soon.
    Olivier.

    • Jun 22 2008 | 8:15 am
      Last year somebody was looking for Olivier Pasquet's recognize.mxj object. I'm working right now on a project that would benefit greatly from it. So Olivier, any chance it can be released into the wild?
      It's based on the Sphinx4 Java speech recognition engine, which I could use via shell scripting or try to encapsulate in my own mxj. But I'm having a lot of trouble understanding how to add the "HUB4" models to enable a complete vocabulary (it defaults to a very small one), so if anybody has done this, please chime in. (Off-topic, but exciting!)
      -Zach Poff
    • Jun 24 2008 | 1:35 pm
      Hello,
      "HUB4" is a large-vocabulary trigram database with approximately
      64000 English words. This means more memory is needed to load
      the language model (language_model.arpaformat.DMP).
      If you get an "out of memory" error with MXJ, adapt the
      following lines of Max5/Cycling '74/java/max.java.config.txt:
      max.jvm.option -Xms64m
      max.jvm.option -Xmx256m
      Nevertheless, I was experiencing problems with the French language.
      Unless I am doing something wrong, I have the feeling that MXJ is
      limited in memory even when max.java.config.txt is changed.
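      One way to check whether the -Xmx setting is actually being picked
      up is to ask the JVM how much heap it thinks it has. This is only a
      minimal sketch: the class name HeapCheck and the method maxHeapMb
      are made up for illustration, and it runs as a standalone program.
      Inside an mxj class the same Runtime calls could presumably be
      made and the result printed to the Max window instead.

```java
public class HeapCheck {
    /** Returns the maximum heap the JVM will try to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // maxMemory() reflects the -Xmx limit the JVM was started with
        System.out.println("max heap (MB):   " + maxHeapMb());
        // totalMemory() is the heap currently reserved by the JVM
        System.out.println("total heap (MB): " + rt.totalMemory() / mb);
        // freeMemory() is the unused part of the currently reserved heap
        System.out.println("free heap (MB):  " + rt.freeMemory() / mb);
    }
}
```

      If the reported maximum stays well below the value given for -Xmx,
      the config file is probably not the one being read.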
      If you load hub4.config.xml, no grammar file is needed. A grammar
      file uses a simple syntax to describe a small set of words that
      must be recognized reliably. Instead, hub4.config.xml loads the
      unigram language model hub4.flat_unigram.lm, which is based on a
      vocabulary of 993 English words. This file contains "statistics"
      about a text and is generated by the CMU-Cambridge Statistical
      Language Modeling Toolkit.
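      To give an idea of what such "statistics" are in the unigram case,
      here is a small sketch that counts words in a text and turns the
      counts into relative frequencies. It is not how the CMU-Cambridge
      toolkit works internally, and the names UnigramSketch and
      unigramProbabilities are invented for this example.

```java
import java.util.Map;
import java.util.TreeMap;

public class UnigramSketch {
    /** Returns P(w) = count(w) / total for every word in the text. */
    static Map<String, Double> unigramProbabilities(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        int total = 0;
        for (String w : text.toLowerCase().split("\\s+")) {
            if (w.isEmpty()) continue;          // skip leading/empty tokens
            counts.merge(w, 1, Integer::sum);   // increment the word count
            total++;
        }
        Map<String, Double> probs = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / (double) total);
        }
        return probs;
    }

    public static void main(String[] args) {
        Map<String, Double> p = unigramProbabilities("the cat saw the dog");
        for (Map.Entry<String, Double> e : p.entrySet()) {
            // ARPA-format models store base-10 log probabilities
            System.out.printf("%.4f %s%n", Math.log10(e.getValue()), e.getKey());
        }
    }
}
```

      A trigram model like the full HUB4 one stores the same kind of
      numbers for sequences of up to three words, which is why it needs
      so much more memory.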
      I hope this helps a little... ;)
      O..////