Speech recognition and segmentation. Recognize is a speech-to-text object based on Sphinx4. It translates an incoming audio signal into text and can report the onset times of words and phonemes in a buffer~. It uses JSGF grammar files and n-gram language models. No voice training is needed, and it is potentially multilingual.

    • Jan 09 2010 | 12:06 am
    • Jan 25 2010 | 11:02 pm
      o Nice! Thank you so much for working on this project! I was going to try to implement an earlier speech-to-text program I had found (but never tested) in a performance piece coming in April. I'll get into this in March, and I'll definitely link you the results. Thank you so much!
    • Feb 02 2010 | 12:58 pm
      There were too many downloads from my website. You can now download it from: http://rapidshare.com/files/344796157/recognize.zip
    • Feb 14 2010 | 6:09 am
    • Feb 27 2010 | 9:28 am
      Hi pasquet! Thanks for your implementation.
      I've installed op.recognize and when i try to load a text file, the message "op.recognize-> error allocating" shows up in the Max window. What's going on?
      I couldn't solve this problem, maybe you know what is wrong ;-(
    • Mar 02 2010 | 9:22 am
      Hello !
      I wrote this error message for when something goes wrong while loading data. It seems you are using files with the right extensions.
      Make sure you have installed "sphinx4max.jar" AND "jsapi.jar" in your Max java/lib folder. If yes, what did you write in the gram text file you loaded? You should have a look inside the digits.gram example. Maybe the problem is there. Another question: are you using the example file, or did you make a new one?
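      For readers hitting the same error: a JSGF grammar file has a rigid shape, and a malformed header or rule is a common cause of load failures. A minimal sketch of a digits-style grammar (the exact rule names in digits.gram may differ; this is only illustrative):

```
#JSGF V1.0;

grammar digits;

public <digit> = zero | one | two | three | four |
                 five | six | seven | eight | nine;
```

      The `#JSGF V1.0;` header line and the trailing semicolons are mandatory; omitting either is enough to make the loader fail.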
    • Mar 04 2010 | 6:23 am
      New version with bug fixes and better efficiency: http://rapidshare.com/files/358682924/recognize.zip
    • Mar 08 2011 | 9:03 am
      Hi Olivier, Wonderful and very useful implementation! I'm experiencing the same problem as Gilson. Getting "op.recognize-> error allocating" message when I try to load the example .gram and .lm files. This happens on a Windows system. I also tested on Mac and everything seems to work fine there. Any ideas?
    • Mar 24 2011 | 4:05 am
      Oh, really? I'll quickly have a look to see if there is a problem on Windows.
      Are you sure you put the jar files into the lib folder, and did you accept the SUN license conditions when running the shell script? You can find the installation information in the read_me.txt file.
    • Apr 05 2011 | 6:33 am
      I am having trouble unpacking it. I'm on Windows, so this Terminal.app thing doesn't apply. I've moved the files to the classes and lib folders in the java folder...
      Now what do I do? :( Kindest regards,
    • Apr 09 2011 | 5:22 am
      I've got a little further, what is this error?
      (mxj~) Class op.recognize is not a subclass of com/cycling74/msp/MSPObject
    • Apr 13 2011 | 5:17 pm
      There is an error in the Max window saying "wrong arguments"...?
    • Apr 14 2011 | 8:17 am
      Hello Rob,
      You are using mxj~ and not mxj. I guess that is the reason.
    • Apr 14 2011 | 8:19 am
      Hello batman,
      You are probably not using the right arguments. Do you have this problem when you load the help patch?
    • Apr 14 2011 | 2:24 pm
      Thank you for your reply. I am attempting to make an installation for a university project! It says that there is no help file for op.recognize when I press "alt" and click the mxj op.recognize object. In fact, when I type "mxj op.recognize", the Max window says "wrong arguments" straight away, before I have even done anything!
    • Apr 15 2011 | 9:12 am
      Hey Olivier!
      First of all, you did a great job on that object. It works very well.
      Is it possible to do the speech recognition in real time? I need it to sync a text displayed behind an actor with the actor's voice.
      Also, I am German. Do you know where I can find a German dictionary to feed your object with?
      kind regards, Heiko
    • Jun 15 2011 | 11:02 am
      Hi Olivier,
      This is indeed great work, and a big resource. I have installed it but am not able to make it work.
      How do I give a voice input? I am a novice; I just started my research on speech recognition.
      It'll be really helpful if you can tell me how exactly this works.
    • Jun 20 2011 | 6:59 am
      > Is it possible to make the speech recognition in real time? I need it to sync a text written behind an actor with the actors voice.< Yes and no. :) Since it is not a continuous audio stream, you have to find a way to stop the recording, then run the processing. I used two buffers in parallel. The most difficult part of the job is finding the right moment to stop the recording. An envelope follower?
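      The envelope-follower idea can be sketched in plain JavaScript (a hypothetical helper, not part of op.recognize; it assumes you can hand it successive blocks of samples, e.g. pulled from a buffer~ with peek~ or a [js] object, and the threshold values are made up): track a short-term RMS level and fire once it has stayed below a threshold long enough to count as end of speech.

```javascript
// Compute the RMS level of one block of samples.
function rms(block) {
  let sum = 0;
  for (const s of block) sum += s * s;
  return Math.sqrt(sum / block.length);
}

// Returns a detector function. Feed it successive sample blocks;
// it returns true once `minSilentBlocks` consecutive blocks fall
// below `threshold` -- the moment to stop record~ and bang the object.
function makeEndpointDetector(threshold, minSilentBlocks) {
  let silentRun = 0;
  return function (block) {
    if (rms(block) < threshold) {
      silentRun++;
    } else {
      silentRun = 0; // speech resumed, reset the counter
    }
    return silentRun >= minSilentBlocks;
  };
}
```

      When the detector fires, you would stop record~, bang op.recognize, and switch recording over to the second buffer~.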
    • Jun 20 2011 | 7:01 am
      > How do I give a voice input ?< You have to use buffer~ and record~. record~ records the audio into the buffer~. When you have finished recording, stop it and bang the object. I hope this helps.
    • Jun 20 2011 | 7:04 am
      > In-fact, when I type “mxj op.recognize”, in the max window it says straight away “wrong arguments” before I have even done anything!< You need to give it at least one argument: the name of the buffer~ it reads from.
    • Aug 25 2013 | 5:45 pm
      I'm getting op.recognize-> error allocating when trying to load any of the example language files. I am on Windows. I copied the right files to the classes and lib dirs. Anyone figure this out yet?
      This looks good, right?
      MXJ System CLASSPATH:
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\lib\jitter.jar
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\lib\jode-1.1.2-pre-embedded.jar
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\lib\jsapi.jar
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\lib\max.jar
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\lib\sphinx4max.jar
      MXJClassloader CLASSPATH:
      C:\Program Files (x86)\Cycling '74\Max 6.1\Cycling '74\java\classes\
      Jitter initialized
      Jitter Java support installed
      op.recognize based on CMU SphinX4 _Olivier Pasquet _2006 - rev 2009
    • Dec 13 2013 | 4:19 pm
      Does look good. I'm on OS X 10.6.8 and have comparable output. I succeed in loading digits.gram, alas not hellongram.trigram.lm. Errors reported:
      class not found! java.lang.ClassNotFoundException: edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model
      and
      op.recognize-> error allocating Property exception component:'lexTreeLinguist' property:'acousticModel' - component 'hub44' is missing edu.cmu.sphinx.util.props.InternalConfigurationException: component 'hub44' is missing
    • Dec 27 2013 | 10:38 pm
      Is the source code available anywhere? I may end up following in Olivier's footsteps and wrap Sphinx for Max, but if the work has already been done, it would be nice to just debug what's there.
    • Dec 28 2013 | 2:43 am
      you should get in touch directly with him then :) (through his website)
    • Oct 30 2015 | 10:56 pm
      Hi Vichug. I get the same problem as you - did you ever find a solution?
    • Oct 31 2015 | 7:38 am
      Hey, sorry, it's been some time... but I'm not sure I ever succeeded, to be honest... I think I was hoping for Metameta to succeed in recompiling the sources...
    • Nov 02 2015 | 11:48 am
      Thanks for letting me know. I'll try on a different computer and mention if I have any luck.
    • Nov 13 2016 | 1:48 pm
      I am trying to install op.recognize into Max 6.1, but I'm having an issue when it comes to going to the lib directory in the terminal. I put the files into the folders as indicated in the readme.txt. What should I do afterward? When I type the directory (/Applications/Max 6.1/Cycling '74/java/lib) in the terminal, I get a message saying "No such file or directory"...
      Help please!!
      Thanks in advance Aline
    • Jan 12 2017 | 5:18 pm
      Been trying to get op.recognize to work on Max 7 under Windows. No success so far, but I figured I'd share my steps here, and maybe someone has some additional insight?
      I don't use Max/MSP regularly; apologies if something seems basic, it all feels equally daunting to me.
      This is what I figured out so far:
      - need the same bit version of Java and Max installed
      - need to place the library files under specific paths:
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\lib\jsapi.jar
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\lib\sphinx4max.jar
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\op\recognize$1.class
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\op\recognize$2.class
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\op\recognize$3.class
      C:\Users\username\Documents\Max 7\Packages\recognize\java-classes\op\recognize.class
      If there is no "Packages\recognize\java-classes\op\" you get this error message: Could not load class 'op.recognize'
      If the jar's are not found when max launches you'll get these kind of errors when running the patches using op.recognize: op.recognize-> file not loaded or not enabled or not ready yet
      If both are present on the paths noted above you get a much nicer: op.recognize-> error allocating Property exception component:'jsgfGrammar' property:'grammarLocation' - Bad URL C:\Users\username\Downloads\recognize\recognize\simple-examples unknown protocol: c edu.cmu.sphinx.util.props.InternalConfigurationException: Bad URL C:\Users\username\Downloads\recognize\recognize\simple-examples unknown protocol: c
      Googling this, I'm still a bit unsure what the problem is exactly; it might be some Mac/Windows path naming difference?! Not sure if I can fix it in Max or if it would require a patch to the source code.
      The original source code isn't available, so I decompiled the classes, but I never made a Max external before. I'm wondering how to actually set the project up (I guess there is a tutorial for this somewhere) and what the issue might be exactly (old Max version objects used?! Mac/Windows path differences?!), so I'm just going to end up making random basic tests to try to figure it out.
      Was hoping someone with more max externals knowledge would read this and give me some hint...
    • Mar 12 2018 | 2:40 pm
      Hey, has anyone ever found a solution to get this running on Windows? Thanks in advance!
    • Mar 12 2018 | 4:24 pm
      I have been looking at the issue of speech recognition lately.
      In my opinion, it is worth paying attention to the speech recognition system built into the Chrome browser. It works on many platforms (among others, macOS and Windows) and it's rather robust. It's very easy to write a speech recognition mechanism (actually simple HTML + JavaScript code) in your chosen language (Chrome's speech-to-text engine is multilingual) running "inside" the browser, and to redirect the data (recognised strings) to Max via Node.js and sockets.
      Of course, from a user's point of view it would be easier to bring speech recognition into Max via a specialised external, but if you are looking for an efficient and multilingual mechanism, I would suggest using the engine built into Chrome.
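      For readers curious what that browser side looks like, here is a minimal sketch using Chrome's webkitSpeechRecognition API. The sendToMax function is a placeholder for whatever transport you wire up (sockets, HTTP, etc.), and the guard keeps the helper testable outside a browser; this is not the thread's actual code.

```javascript
// Collect the final (not interim) transcripts from a recognition
// result list, trimmed of surrounding whitespace.
function finalTranscripts(results) {
  const out = [];
  for (let i = 0; i < results.length; i++) {
    if (results[i].isFinal) out.push(results[i][0].transcript.trim());
  }
  return out;
}

// Browser-only wiring; skipped when this file runs outside a browser.
if (typeof window !== 'undefined' && 'webkitSpeechRecognition' in window) {
  const recognition = new webkitSpeechRecognition();
  recognition.continuous = true;      // keep listening
  recognition.interimResults = false; // final results only
  recognition.lang = 'en-US';         // set your language here
  recognition.onresult = (event) => {
    finalTranscripts(event.results).forEach(sendToMax); // placeholder
  };
  // Chrome shuts the recognizer down after a while; restart it.
  recognition.onend = () => recognition.start();
  recognition.start();
}
```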
    • Mar 24 2019 | 11:06 am
      It's very easy to write a language recognition mechanism (actually a simple HTML + JavaScript code) in your chosen language (Chrome speech-to-text engine is multilingual) running "inside" the browser, and redirect data (recognised strings) to Max via Node.js and sockets.
      @yaniki Is there any chance you could create a demo patch for this process? Many thanks in advance.
    • Mar 27 2019 | 4:31 pm
      Ok, here is the example. It's very sketchy and there is a lot of room for further development/tuning, but the general idea is, I think, explained.
      The entire mechanism is based on 3 elements:
      1) A "webpage" (an HTML document with JavaScript) using the speech-to-text engine built into Google Chrome (theoretically any browser handling WebKit STT should work - however, if I'm not wrong, currently only Chrome supports this).
      2) Simple Node script for messaging from Chrome to Max.
      3) Simple Max patch showing how to receive and process data sent from Chrome.
      The webpage is based on the P5js (https://p5js.org) framework - but it should be relatively simple to edit the code a bit and remove this dependency from the example if you want to. The JavaScript code responsible for handling the speech-to-text processing is located in the "sketch.js" file (in the "SpeechToText" pseudoclass).
      The messaging system is built on top of the Socket.io library for Node.js. I have not tested this code in the Node version built into Max, but it should work too. However, I recommend starting with standalone Node, because it is a proven solution. Remember to install Socket.io for Node.
      The example Max patch uses the same HTML/JavaScript document that is used for the speech-to-text processing. This document - if loaded inside Max - serves just as a "data router", receiving data from Chrome and sending it to the parent patcher. Check the Max documentation for details about communication between the content of the [jweb] and the Max patcher. You may also check my simple solution for messaging between Max and P5js, available here: https://www.paweljanicki.jp/projects_maxandp5js_en.html
      How to use it:
      1) Execute the "socketio_bridge.js" script with Node.js to establish the communication mechanism (I assume you're already familiar with the basics of Node.js).
      2) Start the "max_client.maxpat" patch with Max (be sure the Max Console is open).
      3) Load the "index.html" file into the Google Chrome browser. Allow microphone access. Say something to feed the speech-to-text converter - you should see the detected words in the browser window, in your terminal (running and monitoring Node) and in the Max patcher (and console). It is also a good idea to open the Chrome console to monitor any problems.
      As I mentioned, it is a very simple and sketchy solution: you can switch to the Node built into Max, clean up the code, and make it more error-proof.
      Chrome's speech-to-text mechanism is a little capricious, but you can master it and get it to cooperate. Typical problems are: refusal to work if there is no internet access (so be sure you are connected), and automatic deactivation of the speech-to-text detector after some time when nothing has been detected. You can prevent the latter by adding an automatic webpage restart on error (check "sketch.js" for tips) and by enabling - for this particular document! - access to the microphone without a request from the browser; this last step saves the time spent clicking to confirm microphone access.
      Ufff... ok, have fun ;-)
    • Apr 14 2019 | 6:06 pm
      Hello Yaniki, I would be very interested in trying your Chrome speech-to-text tool. However, when I load "socketio_bridge.js" into the js object, the Max window says "js: socketio_bridge.js: Javascript TypeError: http.createServer is not a function, line 5". I am new to js and I don't know if I am doing something wrong or if there is a typo in the code. Any help much appreciated. Thanks!
    • Apr 14 2019 | 9:18 pm
      @CHUPILCON "socketio_bridge.js" is a simple script that allows communication between MaxMSP and Chrome. It should be launched from Node.js (as I wrote in the instructions in the previous post), not from the [js] object in Max.
    • Jun 12 2019 | 4:49 am
      @YANIKI Amazing, thank you! I only just found your response and it works like a charm! Thank you so much for your direction. This gives me so many areas to research but also a working model that I can start to mess around with. Thank you for your very clear instructions, they were very helpful. Can't thank you enough! Very much appreciated.
    • Jun 12 2019 | 7:49 am
      Dear SNICKERS
      Thank you for the kind words and feedback - I had already thought that nobody would be interested :-). I am glad that my solution works well (actually, information that this mechanism works well on computers used by other people is very valuable). I should finally sort out this code and switch to communication via the Node.js built into Max, to simplify the project and make it more elegant.
    • Jun 12 2019 | 12:00 pm
      @yaniki I think speech recognition (and speech synthesis) in Max are two lacking areas, and probably a lot of users will be interested in this solution, which seems easy enough to use! Though I'm concerned that this only works with an internet connection: does it use an online Google service, or is it something really built into Chrome that just needs to be online for authorization or something?
    • Jun 12 2019 | 12:18 pm
      The "obligatory" connection to the internet is puzzling me: theoretically, speech detection is built into the browser, but it does not really work if we're not online (a good starting point for conspiracy theories).
    • Jun 29 2019 | 10:24 am
      @Yaniki thank you for sharing your work! I tried it out and it works great. However, I can't figure out how to avoid the permission request from the browser. Chrome doesn't allow changing the permission settings, so I have to keep clicking allow. I am pretty new at this, so I might have misunderstood something. Thanks again!
    • Aug 15 2019 | 11:54 pm
      Hello to the readers,
      Here is a newbie question. I'm trying to start socketio_bridge.js by typing "node socketio_bridge.js" in the terminal of a Windows 10 machine (please take a look at the extract below). But it does not start, and I cannot link index.html to max_client.maxpat. Any suggestions? Thank you greatly!
      C:\Users\Alfredo\Desktop\maxchromestt>node socketio_bridge.js
      Server started, listening on port 8081.
          throw err;
      Error: Cannot find module 'socket.io'
          at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
          at Function.Module._load (internal/modules/cjs/loader.js:562:25)
          at Module.require (internal/modules/cjs/loader.js:692:17)
          at require (internal/modules/cjs/helpers.js:25:18)
          at Object.<anonymous> (C:\Users\Alfredo\Desktop\maxchromestt\socketio_bridge.js:14:10)
          at Module._compile (internal/modules/cjs/loader.js:778:30)
          at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
          at Module.load (internal/modules/cjs/loader.js:653:32)
          at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
          at Function.Module._load (internal/modules/cjs/loader.js:585:3)
    • Aug 16 2019 | 8:11 am
      Hello 2-XITE
      Unfortunately, I don't have access to a Windows computer right now, so I'll guess... ;-). Did you install the socket.io NPM package for Node? If not, type and execute:
      npm install socket.io
      in your terminal window.
    • Aug 16 2019 | 8:40 am
      Good morning Yaniki,
      No, I had not installed it (I did not know how), but after your advice I managed to get the tool working! Thank you for this important support, it works!
      With compliments,
    • May 04 2020 | 5:36 pm
      Hey! Great Work!!
      I have a doubt... I can't make it work in Max MSP. I followed the instructions, but I have never used Node.js or JavaScript before; this is my first time. I think I'm doing something wrong when trying to execute the code. How do you execute "socketio_bridge.js" in Node.js? Google Chrome is detecting words already, but I can't make them appear in MaxMSP. I tried to just write "socketio_bridge.js" and press enter, and I also tried to copy-paste the code and press enter, but I'm not sure this is the right way. How do I open "socketio_bridge.js" in Node.js? I know it's a very noob question. I thought it was going to be easy following the steps, but I underestimated it... I guess it's a good opportunity to start learning JavaScript. But if any of you can guide me on this one, I will really appreciate it very much.
    • May 04 2020 | 7:12 pm
      Hi Mario
      First, you have to install Node.js from this page: https://nodejs.org/en/ (and probably restart your computer).
      Then you need to install socketio. To do this, open a terminal window and type:
      npm install socket.io
      Press enter and wait until the library is installed.
      Once you have installed Node and socketio you can use the bridge. To execute socketio_bridge.js: navigate to the folder with the project, type "node" followed by one space character in your terminal window, drag socketio_bridge.js into the terminal window, and press enter.
      Alternatively, you can use Node for Max (instead of the standalone version), but I didn't test this.
    • Oct 13 2020 | 11:12 am
      Thank you very much Pawel, I've worked it out. Speech-to-text works pretty well in various languages (set in the sketch.js script) and the transmission to Max is straightforward. I have a few questions related more to the Chrome Speech API: Is there a way to set the timeout that truncates separate answers? How could I lock the mic on in Google Chrome? At the moment, Google Chrome always stops access to the microphone after a minute or two and asks to allow access manually. Is there a way to use Firefox instead of Chrome with the same tool? Best regards, Roland
    • Oct 13 2020 | 2:25 pm
      Hi Roland, thanks for kind words.
      Unfortunately, Chrome's Speech API is rather unique and has no equivalent in Firefox.
      I don't know why Chrome locks up the microphone after a minute - I have not encountered such behavior before; maybe it's some mechanism in the privacy settings? All I can say now is that the current version of Chrome runs continuously on my macOS computer - in fact, there is a fragment in the script that automatically restarts the speech recognition system after the browser automatically exits it, and it has worked fine so far.