
regexp question (truly exact match ?…)

December 17, 2010 | 12:13 am

hi,

after having done some serious research in the forum and the web concerning regexp code i still haven’t found the solution for this (i hope) very trivial problem:

i want to check, if a sentence (i.e. a string) contains the word "there".

as i am looking for an exact match i surround it in \b boundaries so the object looks like this:

regexp (\bthere\b)

the problem is that strings like "there’s a xxx" are also matched. could anybody help me modify the regexp so that apostrophes are excluded ?
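A quick way to see why this happens, sketched with Python's re module rather than the Max [regexp] object (the assumption being that both engines treat \b the same way here): the apostrophe is a non-word character, so a word boundary still sits between the final "e" and the apostrophe.

```python
import re

# \b matches between a word character and a non-word character.
# The apostrophe is a non-word character, so "there's" still has
# a boundary right after the final "e".
pattern = re.compile(r"\bthere\b")

samples = ["is anybody there", "there's a xxx", "theremin"]
results = {s: bool(pattern.search(s)) for s in samples}

for s, hit in results.items():
    print(f"{s!r}: {hit}")
```

So \b does keep "theremin" out, but it cannot tell "there" from "there's" on its own.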

(the general issue here is that i really miss something like a NOT operator in regexp… or did i just overlook something ?)

thanks for any hint !


December 17, 2010 | 12:20 am

The caret or ^ is a not operator just within a pair of square brackets []

I forget exactly how regexp works a lot of the time so whenever I want to write a regexp I end up here:

http://www.regular-expressions.info/reference.html

HTH

Alex


December 17, 2010 | 12:43 am

I’m not really familiar with regexp, but it seems to me you have to include the spaces in front of and after "there" so it will only match stand-alone occurrences of "there".

Like this; " there "

Good luck!

FRid


December 17, 2010 | 12:45 am

hi, alex
thanks a lot for your reply. only … it doesn’t work ;-)

the problem here is that with this caret in brackets REGEXP searches for an extra element, whereas i want it to match exactly after "there" and not match after "there’s"

i.e. the formula

============================
regexp (\bthere[^']\b)
============================

wouldn’t match "there" anymore
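A small check of that behaviour, sketched in Python's re module (not the Max object itself, so treat it as an approximation): the negated class [^'] has to consume one real character, which is why a bare "there" stops matching.

```python
import re

# [^'] is a negated character class: it consumes exactly one
# character that is not an apostrophe, so something must follow "there".
pattern = re.compile(r"\bthere[^']\b")

def matches(text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

for text in ["there", "there's a xxx", "there is", "theres"]:
    print(f"{text!r}: {matches(text)}")
```

In this sketch "there's a xxx" is rejected as wanted, but "there" on its own is rejected too, while "there is" and even "theres" are accepted, because the extra character can be a space or a letter.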

mmmhhh….


December 17, 2010 | 1:38 am

You probably need to roll your own version of \b possibly making use of the pipe operator to match several slightly different scenarios.

Start by enumerating *exactly* when you want it to match and when not.

This might work:

(\At|[^'\w]t)her(e\Z|e[^'\w])

This should match at the start of the input or after a non-word character other than an apostrophe.
Same kind of thing at the end.

Might have got it wrong though – haven’t tested it.
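For anyone who wants to try the pattern outside Max, here is a minimal sketch using Python's re module. That engine is an assumption for illustration (it is not what [regexp] uses internally), and the sample strings are made up.

```python
import re

# Pattern copied from the post: "there" must start the input or follow a
# character that is neither an apostrophe nor a word character, and must
# end the input or be followed by such a character.
pattern = re.compile(r"(\At|[^'\w]t)her(e\Z|e[^'\w])")

def matches(text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

samples = ["there", "is anybody there?", "there's a xxx", "theremin", "weathered"]
for text in samples:
    print(f"{text!r}: {matches(text)}")
```

One thing to keep in mind: unlike \b, the character classes consume the neighbouring character, so the matched text for "is anybody there?" is " there?" rather than just "there".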

A.


December 17, 2010 | 1:51 am

yeah ! that really seems to work ! i did no heavy testing yet, but the example strings behave like they should!

thanks a lot, alex !

(but hey, what a monster of regexp code, really ugly to look at …)

ciao

oliver


December 17, 2010 | 1:52 am

you also *might* be able to do it with numbers through [atoi]. Run the whole string through [atoi] and you’ll get a list of the "number-chars", then try [zl sub] to compare the list to the word "there" (also run through [atoi], which gives 32 116 104 101 114 101 32, including spaces at beginning and end). So you’d be looking for that group of values in the master list, and [zl sub] will give you "found: 1" and "position: 10" or wherever.

– Pasted Max Patch, click to expand. –
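The same idea can be sketched outside Max in a few lines of Python. The helper names to_codes and find_sublist are hypothetical stand-ins for the [atoi] conversion and the sublist search, not the actual Max objects, and reporting 1-based positions is an assumption made to mirror the "position: 10" wording above.

```python
def to_codes(text):
    """Rough stand-in for [atoi]: a string becomes a list of character codes."""
    return [ord(ch) for ch in text]

def find_sublist(haystack, needle):
    """Rough stand-in for the search step: 1-based start positions of needle."""
    n = len(needle)
    return [i + 1 for i in range(len(haystack) - n + 1) if haystack[i:i + n] == needle]

# " there " with a space on each side, as described in the post.
needle = to_codes(" there ")
sentence = to_codes("is anyone there today")
positions = find_sublist(sentence, needle)

print(needle)
print(positions)
```

Because the needle includes both spaces, this sketch will not find "there" at the very start or end of the sentence, or when it is followed by punctuation.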

December 17, 2010 | 2:58 am

Don’t bother with the \b and stuff..

If you only want to check for "there" in a sentence then all you need to do is tell the regexp that it should look for "(something)there(something)", where ‘something’ is whatever it is; it’s not of your concern.

So: "^.*there.*$" (no "", but see snippet below).

This means: from the start of the sentence (^), check for whatever kind of character (.*, where . is any character and * means "zero or more appearances") until you get to a point where some letter combination needs to be there (‘there’). After this combination any kind of characters can appear until the end of the sentence ($).

And because you want to pinpoint to ‘there’ you want to use () so that it becomes a back reference.

Alas, here’s my proof of example:

– Pasted Max Patch, click to expand. –

Edit: The trigger object doesn’t have to be there, but I used it here to make my example as easy to understand as possible.
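As a rough illustration of the capture-group idea, here is the expression read as ^.*(there).*$ and tried in Python's re module (an assumption for illustration; the original example is a Max patch). The helper name captured is hypothetical.

```python
import re

# ^ and $ anchor the whole sentence, .* skips any characters,
# and the parentheses capture the letters "there" as group 1.
pattern = re.compile(r"^.*(there).*$")

def captured(text):
    """Return the text of capture group 1, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None

for text in ["hello there friend", "a theremin solo", "hello world"]:
    print(f"{text!r}: {captured(text)!r}")
```

Note that this only checks for the letter sequence: the second sample is reported as a hit as well, since nothing in the pattern requires "there" to be a whole word.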


December 17, 2010 | 4:03 am

That last example will match "theremin" and "weathered" though. You probably need to include all the punctuation you don’t mind preceding/following the word in non-capturing brackets (or the option of it being the very first or last character) like so:

regexp (?:[\s({[]|^)(there)(?:[]\s}).,!?’;:]|$)
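A hedged sketch of how this behaves, again using Python's re module instead of Max. The only change is that the square brackets inside the character classes are backslash-escaped, which Python prefers; the character ’ is kept exactly as it appears in the post. The helper name find_word is hypothetical.

```python
import re

# (?:...) groups are non-capturing, so only (there) ends up in group 1.
# Before the word: whitespace, an opening bracket, or the start of the input.
# After the word: whitespace, a closing bracket, listed punctuation, or the end.
pattern = re.compile(r"(?:[\s({\[]|^)(there)(?:[\]\s}).,!?’;:]|$)")

def find_word(text):
    """Return the text of capture group 1, or None if there is no match."""
    m = pattern.search(text)
    return m.group(1) if m else None

samples = ["there", "(there)", "Hello there!", "theremin", "weathered", "there's a xxx"]
for text in samples:
    print(f"{text!r}: {find_word(text)!r}")
```

In this sketch a straight apostrophe (') after "there" is rejected, but the typographic apostrophe (’) is part of the trailing class, so "there’s" typed with that character would still be accepted. Drop it from the class if that is not wanted.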


December 17, 2010 | 2:58 pm

hi, regexp gurus !

thanks a lot for joining into this discussion and for all your nice help. now i already gained way more inside into the whole thing than i had before.

concerning the simplicity of the task i always re-learn that regular expression is really quite a beast to tame …

cheers, guys !

oliver


December 17, 2010 | 3:00 pm

sorry, should have been "insight"

greetings from the land of PISA loosers ;-)

