Forum Made Regexp Tutorial
The "regexp is no fun!" thread posted by Stefan Tiedje is a very useful thread, and my thinking was that it would be worth combining this information into a Max patch.
I’m still very much learning regexps myself, so it would be difficult to include all the information with clear descriptions on my own. Therefore I propose we put together, collectively, a Max patch that could be included with Max, placed in the share section, or distributed in some other way to people who are struggling with the regexp object.
I already threw together some examples (it’s about lunchtime if that explains my examples!) which I hope are easy to understand. There’s very little there so far but I will add some more.
If you can help then please contribute something to the patch and repost it on here! This way we can also check that nobody has written anything daft or that doesn’t make sense which I certainly am prone to doing.
If you contribute something feel free to add your name to the contributors subpatch.
I hope you agree with me that this is a good idea!
Excellent idea, perhaps in your examples you should include some [sprintf]s to set the "re" and "substitute" attributes.
I have some specific tools that utilise [regexp] which I could include but it looks to me that you want to start this tutorial from the bottom up, so explaining some of the regex characters in detail might be better first. Let me see what I can come up with.
Great to have you on board. I think an explanation from the bottom up is a good idea but some specific examples with a bit of documentation could be useful. I am working on modifying something I have been using for a project into a tutorial which I will add later.
Looking forward to seeing what you come up with.
Well, today is my day off work so I’d be happy to write a few tutorials. If more people start helping out, which I’m sure they will, then you might have to go all editorial on us and format things for a bit of continuity so that people can make sense of each contributor’s patching style.
It might be an idea to do this in Max4 so a wider group of people will be able to benefit from it, what do you think?
Here’s some examples on the forum. The first is parsing a webpage to find an IP address and the second is filtering items from the end of a list.
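For anyone reading without Max to hand, the idea behind those two examples can be sketched outside Max. This is only a rough Python illustration, not the expressions used in the actual abstractions: the sample page text, the function names and the patterns are all assumptions made up for the sketch. As far as I know [regexp] is PCRE-based, so the pattern syntax should carry over, but the escaping needed inside Max messages is a separate matter.

```python
import re

# Made-up page text; the abstraction in the thread fetches a real web page.
page = "<html><body>Current IP Address: 203.0.113.42</body></html>"

# Four groups of 1-3 digits separated by literal dots.
# Note: this does not check that each number is 255 or less.
IP_PATTERN = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")


def find_ip(text):
    """Return the first dotted-quad found in text, or None."""
    match = IP_PATTERN.search(text)
    return match.group(0) if match else None


def drop_trailing_items(text, count):
    """Remove the last `count` space-separated items from text."""
    # (?:\s+\S+) is one item with its leading space; $ anchors at the end.
    return re.sub(r"(?:\s+\S+){%d}$" % count, "", text)


print(find_ip(page))
print(drop_trailing_items("a b c d e", 2))
```

The `$` anchor is what makes the second pattern work from the end of the list rather than the start.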
Great to have you onboard too Stefan! Looking forward to seeing some examples.
I added your ip address abstraction Luke (very useful) and wrote a quick explanation. I haven’t done the other yet, I have to do some other work first. Perhaps you could write a short explanation when you get chance?
P.s. That CT collective is brilliant. Don’t give me ideas! Haha.
Here’s the patch again, with a bunch of path and extension examples added (I posted them on the pvs. thread, so I thought I’d clean them up a bit and integrate them into this master file).
Quote: email@example.com wrote on Wed, 10 December 2008 14:42
> It might be an idea to do this in Max4 so a wider group of
> people will be able to benefit from it, what do you think?
Oh yes, please!
I’m still working with 4.6.3.
I missed that. Yes, I think a Max4 version would be a good idea. I have Max4 installed on this machine but almost all my patches are in Max5 now.
If people want to contribute in Max4 now I’ll add them to both versions.
Quote: Philippe Gruchet wrote on Wed, 10 December 2008 20:19
> Quote: firstname.lastname@example.org wrote on Wed, 10 December 2008 14:42
> > It might be an idea to do this in Max4 so a wider group of
> > people will be able to benefit from it, what do you think?
> Oh yes, please!
> I’m still working with 4.6.3.
Just a bit of tidying/editorial. I’ll get on with a Max4 version when I have time.
Quote: fairesigneaumachiniste wrote on Wed, 10 December 2008 23:11
> I’ll get on with a Max4 version when I have time.
Philippe, seeing as you can download the maxpat file rather than just copying the compressed text version you should look into the supercollider 5to4 converter written by Fredrik Olofsson. You can find it here:
Quote: email@example.com wrote on Fri, 12 December 2008 02:55
> Philippe, seeing as you can download the maxpat file rather than just copying the compressed text version you should look into the supercollider 5to4 converter written by Fredrik Olofsson. You can find it here:
Hmm… seems to convert Max5 patcher into SC only with "MaxPat.sc".
Is there something I do not understand?
Philippe Gruchet wrote:
> Hmm… seems to convert Max5 patcher into SC only with "MaxPat.sc".
> Is there something I do not understand?
*.sc is a SC source file.
Just open MaxPat.html in SuperCollider (which is the help file), find
the line "GUI.dialog.getPaths" and double-click just above this line (it
should select the whole block of text), close to the opened parenthesis.
Then press "enter" (*not* return), and let the magic happen ;)
Your converted file is in the same folder as the max5 one.
While learning regular expressions, I downloaded a text file into a text object, and in order to parse it with my regexp I needed to output all the lines at once, not one line after another.
Is there a way to output the text file in one shot?
(line by line, my parsing doesn’t work)
Every time I saw examples, the text to be parsed was in a message box and not in a text object…
thanks for your help.
The "dump" message should do it.
in fact sometimes the dump does not work,
here is an example:
----------begin_max5_patcher---------- 942.3ocyWE0jhiBD9Y8WAkub2Umah.IQyVolqt+.2C2qWs0VwDTYGBjJPVct s1+6Gzj33LmtSlQGcePCcSSyW+QSC7swilrTsiomf9H5ePiF8swiFApbJF0I OZRU9tBQtFLaRESqyWylL02mgsy.50LSuNYaEWJXFX.jNk7RvL0xu7gjdCWo jFYdEC54Oa34hCbgp0z6CbmVuJyC0LOhmLA8oCbkl+uPG3YAy5TWmaJ1vkq+ bCqv3GTBw1Khjtv8gRBhmhvIAyPexMhuOdr6uomIYjI3x6uaiwT+wvvsa2Fz vZMrFcPgpJKD5DkUxzEM7ZCWIu6uereDWidP01fz1+JXnUpFzxFV981v.IYa 0SQKa0bocpmhVwk4xBKwgxkkHt7qLs4Q63xBQaoSt15bkzZle.LvbsQUbuN. 0M4tI1rggDrbXLqEpk1QT2n9Juj0fTq5b6iSJWZQWUtKD.OZXEajJgZ8CVzK Zc5s9TAtcqpQT9KZTEqjm+TmXwrwa7iwFS2ERkb6z2lKzAYgGRYnLdkk6ses iVv5oPzeYwXVnWGJaKuzr4NLYVVnuIJaCiudi4NZbVXWSzfVuZaDmvjvFlew RGBXRG10ocdCVyWkE5FKJKrGwhb45VWSl7CsVztWtOSxNkrBUqDRmRG39Jxh a5FKBM1skBGAenwIfzrEW38VksU0CsRSzMkPhgRLDZJvGd53BUo4kBb5MMvw 9LAZpyJT7rqYjStoQ9BXUlRIWf.2Vr0FP+u7+F1Z1t59Ja+Zvu+G+VessixQ 3ivQu5zi3SRRSe1uyHcIdNvWTXaCdwkk1fFCjehhes7C8GwOKsU2ce4tx4uA 5YVDTGIAxshRNK14TUUsWv3viucGzGrqRLzcco215McLD0m.Qnum667mg2su yKLz7p4+bssaF0eS3CO19hmXUp1JEJaxU2UmVwXk5mc4Inc3aO6K98tne5Px 9R86R8E8iurIeegaBZEkhglogesDB4TDhcl+r8F9M7cu87Lfe72Gjd7xWfag a89rGiB9yo+obl+51cPrOE.MaO5Jg2AAOL4.ivOwlM7xRFzcO3q3k0JaU5NL bhEvgBozAfH2IMWQHE8N.oj4tmPSRvvBKltW5bwZ7..67qJ84lM7KfH2cqth Px8luKMKQhhfEwT+Rp+ZrGWBS1a40faehMuXbfifBOTr+jMRxQjHy6kh6jN2 ED5.BD5qLPlGEPh2+fJ773fdoyDsICH64TrN9cIgdHHhdFHxJ78w+G.ME6wq -----------end_max5_patcher-----------
I’m impatient to see your tutorial.
I can’t get the reuters web address you included to work but personally when using [jit.uldl] I find it easier to stick to jitter objects to do regex searches. It means you don’t have to download to and then read from your HD. Here’s an example.
----------begin_max5_patcher---------- 552.3ocwV10aaBCEF9Z3WgkuZSiAXHDRqpl19cTkU4.tAmYrQ3CKzU0+6C6D ZaRFIrpzzabh+Hmy66i+3jGccvKTsLMFcM5VjiyitNN1gLC3rsuCtj1lInZ6 xvKZ.PIwdalphBYEb4x6pYYvlvjR7C8PoIl1nv9Vz7s+BYSIWJXfMZjWFT0. 6OJO2lQ0hUeMNrOiaVG7PEaS5vKnxkXzbyrO45ZZ7FoUjr0cwtOv.q05.7JN 32HxEGwiStx3qjXS6DxAd7dkDjzRqBw+nlSeNViv8QuJHZ9erAgD0kiCfRT5 fPoyC2URgZdK1Cgea3ojo0zkrC3StZsTnn4nB.ptNHX8509PASvWV.xNNsPI .selpLPz7KVvJ5uo5rZdEDvk4rV+BnTf5E2IPb7VDa7OhL6+mwQG8D1HX7zA Y7Ygp+CmO05YxzolOl8g4bRxY14G45lFp8Ak9gxtyNGgLjHKSHSlXQS5G0EO R7vnw6cBO0rkr1Jz2qYna.dWV+1mt8m2L+Ke9j2gHQaNCkFcd4Ux34EYjOTM 9dmcFa5bOWvNMNCsE0RNym9hGOMCeSzbWhYEMVvk6W825ay36hQspoNqOM86 onnm8dNSCbIE3c+2fWsntKJnW.TAOOmYmu2Vk77JEWBaEw.apiVSg6jtAzD4 hpISAjSpISk7KolRGCmBur6cwiQSc0jPjKKmHmPSwWVNEG9Nr2MKzOwCck8I sjY8eeuGLbex8uPHiJN6 -----------end_max5_patcher-----------
Thanks, I will look into my dump problem later.
|fairesigneaumachiniste wrote on Wed, 10 December 2008 16:11|
|Just a bit of tidying/editorial. I’ll get on with a Max4 version when I have time.|
Great patch—very handy! I especially like the IP one since it doesn’t need java/mxj installed to do the work. This way people running a networking-based patch (like a chat room or something) don’t have to find their IP and set it manually.
still experimenting with regexp, I would need clues for deleting (or detecting for the moment) the lines which are repeated in a textfile.
thanks for any help
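One possible angle, sketched in Python rather than Max (the sample text and function names are made up for illustration): a backreference such as `\1` can match a line that is immediately followed by identical copies of itself. Repeats that are scattered through the file are awkward to catch with a single expression, so the sketch simply counts lines for that case, which is a different technique from regexp.

```python
import re
from collections import Counter

# Made-up sample text with one consecutive repeat and one scattered repeat.
text = "alpha\nbeta\nbeta\ngamma\nalpha\n"

# A whole line, followed by one or more identical lines.
# \1 refers back to whatever the first group captured.
CONSECUTIVE = re.compile(r"^(.*)(?:\n\1)+$", re.MULTILINE)


def collapse_consecutive(s):
    """Replace each run of identical adjacent lines with a single copy."""
    return CONSECUTIVE.sub(r"\1", s)


def repeated_lines(s):
    """List every line that occurs more than once anywhere in s."""
    counts = Counter(s.splitlines())
    return [line for line, n in counts.items() if n > 1]


print(collapse_consecutive(text))
print(repeated_lines(text))
```

Note that the backreference version also collapses runs of blank lines, which may or may not be what you want.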
I’ve been working on a "regexp in max" tutorial for a little while, it’s not complete but here is what I have to share so far. The patch is in max5 JSON format so it should be possible to use the supercollider converter to view it in max4. If anyone else wants to chime in with some corrections/examples/questions/pointers then feel free.
great, thanks again.
I keep practising with regexp (or jit.str.regexp).
These days I’m trying to parse HTML code in order to get news titles in plain text.
I tried this but it doesn’t work.
there may be a dedicated tool for html parsing?
----------begin_max5_patcher---------- 681.3ocwU00aSCCE841eEWEjfgnoIMcscqzkowj1XOrOPShAhLgbRtswiD6H amQGS6+N1NMPYqL1fIwCwt9buww2y8bbutcKmX9bT5.igOAsZcc6VsrPFfVK V2xofLOImHso4jvKJPlxoScLENWYwO3EoPN8KHn3vLTApLpDJIBEvmpWfPlp HebDKx4.8CrmfvRvNvj3viiIEjIdwg.JHRTZy94Oq+luNRTIgKkcMHEKPDYn hKnIQNQLBiqiHfRAONGK.Z86FyySAEYlYEQAEjq.ohWZiIUBJaFHPIRDIYaG wZpjbJCS3ULa4LbAHM0Vb73Kb60j3TNSwHEnMxNBJIuIBqpfxxQkko58SPdk pA0eoMQR+lcS542sAtjnRxzGvOKvDUcaIHXfNLzaXeyTv5AloQZL3by6bS61 lgNOv1GC+ptXtS2Sfyv4kvDBjIvoakoTkqs83tuZ6WFtlYbhGIz4tbR+AOIj xf6mTpyTcUIVyHNNct0Cb98veaVSe91oQ0j4F+kzWAJkjY3c3uIozKAaZaQR SoJJmQxc0peZRNFtbThKkwPQ3xT8XOuoHQUoUkcSjEbFUKw6pcZdyx4wjbcS S5E36uom+PO+08nL2oV+iK23cbq8MtZ8sakz8R6uJbabJdZufPaI2BhbpODU xYt6r26tX2i1+3ONZ59CNibxGd+aOTc7Yhx2bxA9yOM8ncfvmFmpQ6.SjkD1 BNPxqDIX3tYBpTQ0vmlPQ82.Nrtxm3YRN7AaL8ejhvfUJB68XEg+YQ2fQlw0 WXY6+zZYK0WjoVgmbi+W2SYK4.+glo99qzlYoLaC8V+mic+L3+JOTqTVP3M8 Zv+GLeJpEPLhwtsTRlakVJoLZZJxVtnJnokbM6s3P7a5JOzyzs9bq9Lsw+vQ Ru3l1e2YLzu6 -----------end_max5_patcher-----------
Have a look at [detox] by jasch. It parses xml style tags and might be useful to you. Otherwise, here’s an expression that works:
Be careful with your messages though, if they include reserved characters like ,commas, "quotes" and ;semicolons; you will probably need to escape them.
Whatever I do I always get errors:
"us" and "rhetoric" (which are preceded by semicolons in my message) get "no such object" in the Max window.
Moreover, as soon as I try to delete a reserved character I get the same errors... arggh.
That’s because the ;semicolon; is used to send a message to a [receive] object (or to Max itself), and if you don’t have [receive] objects called "us" and "rhetoric" it can’t do this. Either remove the semicolons or try escaping them with a backslash, like this: \; then see how you get on.
When the semicolons are manually escaped there are no errors, but nothing comes out of my regexp.
detox works but fires out the
is this syntax right to substitute the tags before parsing the whole message?
[regexp @substitute %0] it seems to be lazy but how to make it greedy? if i use a + it will only concern the previous character?
[regexp + @substitute %0] i want to link those 3 characters: ""
Sorry for being so dumb about regexp.
To group characters without storing them as a match you can use brackets of the form (?:...), as shown below; this is called a non-capturing group. If you’re still having problems then post your re-worked patch.
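To make the two points concrete (what a + applies to, and lazy versus greedy matching), here is a small sketch in Python. It is not the expression from the patch in this thread: the sample strings are made up, and Python's `re` module stands in for [regexp], on the assumption that the PCRE-style syntax behaves the same way for these constructs.

```python
import re

# A quantifier only applies to the item immediately before it,
# so "+" here repeats the "c" alone.
single = re.fullmatch(r"abc+", "abccc") is not None
not_grouped = re.fullmatch(r"abc+", "abcabc") is not None

# (?:...) groups the three characters so "+" repeats them as a unit,
# without storing the group as a separate match.
grouped = re.fullmatch(r"(?:abc)+", "abcabc")
captures = grouped.groups()  # nothing was captured

sample = "<b>one</b> and <i>two</i>"  # made-up sample text

# Greedy: ".+" grabs as much as it can, from the first "<" to the last ">".
greedy = re.findall(r"<.+>", sample)

# Lazy: ".+?" stops at the first ">" it can, so each tag matches separately.
lazy = re.findall(r"<.+?>", sample)

# Substituting every lazily matched tag with nothing leaves the plain text.
stripped = re.sub(r"<.+?>", "", sample)

print(single, not_grouped, captures)
print(greedy)
print(lazy)
print(stripped)
```

So adding a ? after a quantifier makes it lazy, leaving it off makes it greedy, and wrapping several characters in (?:...) is what lets one quantifier cover all of them.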
here is my patch,
Thanks to your explanations I managed to get what I wanted.
but one problem persists when html code contains an ‘apostrophe’
----------begin_max5_patcher---------- 1518.3oc0ZssbaaCD8Y4uBLps1ISckI.3EwDK6j1WZeJ88pNY.IgkPBunP.F Y2z7uWbQxlxAwVzjTl0drHWPB5CNX2EmETe4nQiiJtlxGCdE3u.iF8kiFMR2 jpgQarGMNibcbJgquswwEYYzbw3SMWSPuVnae8RZNPrjBlOFbh5i3kjRRrfV BHqVQIkb.KGP.bYmo4wTvoKokTUayG+tHRF4DtrSfWb7OfCesp4khrTPbQB8 kmNO+O.KIelBRXWcEKtJUvnbfn.PKIb4yP.VyDKAkzEzqAIUT0kTPgSyXwEo E47ISlrEworbZbQUtF1taZjknGDEQe3W7b2dmWUjKxIYT8kdaIijt8J4UYr7 TpPSIv6ZrnRrsUmZODN6ezODn6jsMupjpXBhfUj+9RZrvLG3iBk2BvMzScvY yGf+dauHh3kr7E6zi.cOlpd1.7Te0g.npSp970iNR8wosb1Mo.bSQkjTY4eT QskzS3xIy0jaTbsrWQRVUy4h0ElIBJ+RqTNxBkiOHTtMxSyytPMqgbzTIN7I Rd4z0xQy2vcBPj52wVF2vNYbisMtctcbatSwMqnlA83HR9hwmduiOjOFxEpH FnKVyVZxBNsaYIsOyJS7+74u9kf2vqh3BlnRPAmXg7b85Dxyqgjmhv14uGN3 TGU54iM7mSqXtLJmSVPsDbtNOsfjHSYJV8pyNa850SjrLexhhhEozIxvyyT1 WlSSlIp3GuLcFM+3OMqPk28XB+8eJob1R0IIkkQy9zwxQvLGPFQTxt1F02T+ VjUpG1Tp+gn5.cjLzwDPGp4cUXcmxzbpvBcflN7nCer1kCiQ5CnId8.cjTks xFe3M73Cb.xjoeplO76E2iySXeFnusYQEI2bQMadNSpBRbwuS3f+Tt1OKQtL K3WURj9HPq+QK7YtTlDeEkFuTIA52HrR0por7qJJioIpEXk5cxKTpF3WBdWo TVDWo+Q5Xp5Pl9JxElWUVDKAorCDk3nzT8cn0DIjHGHed.RN3skjH.Qnxxlv xIo.oNoyOSh5K1W0RH2g2bMzWKhxCZ78g5U08b53I63TonVad+CvjiXzNd+8 S1f4iG9N2yG+fN1pwhpfgRCs4LIzEBC7UhvCQPO+v5mAqILu97uySb9OZwN+ qCB7cCfZ8+dtgNn5mU+ecuFHoCc7M59PSm1KARO0rlmGcwN9VmeVzEOq9WR. 
oyeFcwCjC06a8XfCP8Cv.sjerqojHn1xG0wS8JlA.sjDE5L7nDT.tdRz1ov7 6Uqn5jukNBeVpSrdMhLk26CPNgNlRCc6+JDkgY+qJHa2RD+IGatQngeMhaXN iWEbSkh8C0saR0WL4mu7kyk+btElC68+ElyybXylSz4EWq2hQatV3AXFJOGy 9X42iE43GXiMFfhdQNt02Uu1I586DX8AlXBWTNQTvuIKpH0B0DzIARnmRfzi G7fczEFD31trNUYQTa0B48rL1UqUsOaqIDWeae85mMmSKw4GsowYHV47FFw0 rKZN8gDGUHix3JVps7p3mEoNRP89sa248s1iscDYTK6F1GKdqHrpzDa4VfOK wWMmdLdRX6xB08WGmbuWDo94oZeWNiWTIqmaCV19ZT.vagQBUpILW+l0pcSp MkDbWUyKYIxZIq+VixXIqJjoN1.BjQJKLzryIlC6Z4hLR2blZwBt48wAUK6n s9N9DMZThdjQoZc3FMJCBzKKZFPHH9VqVhV2cAhcz50PzN0UCOrlgUhc1Z0R vpJ.+QAqpVhl.1VhIk.7G2o1Yma5NLAsiIHVyY9HiOpLM+VqVh1v8f.wdGTB bu7.gGTHg1G+rlNmdPvj6vCS5f1CGlBFdoHt2RZ1wTXyfz1ErLRYBCu03.jf nodYaPWn4K.Sf+sVskWg8.uh6GdEuGqRDbP8JC5gXWCes4kY6Fft0psK7i59 Y51BImtGRnondw4SsGXO9x9MDqdZ3g70RTLJ9zFsEq6SPsaCwpQuLBa9tmDX zkZ0xH0Ra0VcMcOmuEccMm608Tda8B5dx6d0xdzWO5+P9.ODa -----------end_max5_patcher-----------
PS: same problem with quotation mark (" in html)