regexp question (truly exact match ?...)

dobyhal's icon

hi,

after having done some serious research in the forum and the web concerning regexp code i still haven't found the solution for this (i hope) very trivial problem:

i want to check, if a sentence (i.e. a string) contains the word "there".

as i am looking for an exact match i surround it with \b word boundaries, so the object looks like this:

regexp (\bthere\b)

the problem is that strings like "there's a xxx" are also matched. could anybody help me modify the regexp so that apostrophes are excluded ?
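The behaviour described above can be reproduced outside Max. The following is a minimal sketch in Python, whose `re` module treats `\b` the same way for this case; it is not the Max regexp object itself, and the helper name `contains_there` is made up for illustration.

```python
import re

# \b marks a boundary between a word character (letter, digit, underscore)
# and a non-word character. An apostrophe is a non-word character, so there
# is a boundary between "there" and "'s".
WORD = re.compile(r"\bthere\b")

def contains_there(sentence: str) -> bool:
    """Return True if the plain word-boundary pattern matches anywhere."""
    return WORD.search(sentence) is not None

samples = ["is anybody there", "there's a xxx", "a theremin", "weathered wood"]
results = {s: contains_there(s) for s in samples}
```

With these sample strings, `\b` correctly rejects "theremin" and "weathered" but accepts "there's", which is exactly the complaint.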

(the general issue here is that i really miss something like a NOT operator in regexp... or did i just overlook something ?)
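One construct that behaves like a NOT operator is the negative lookaround, which is part of PCRE-style syntax. The sketch below uses Python's `re` module; whether the Max regexp object accepts this exact syntax is an assumption that is not verified anywhere in this thread, and the names `STANDALONE` and `has_standalone_there` are hypothetical.

```python
import re

# (?<!...) is a negative lookbehind and (?!...) a negative lookahead.
# Both assert that something is NOT adjacent to the match, and neither
# consumes a character, so "there" at the very start or end still matches.
STANDALONE = re.compile(r"(?<![\w'])there(?![\w'])")

def has_standalone_there(sentence: str) -> bool:
    """True if "there" occurs with no word character or apostrophe beside it."""
    return STANDALONE.search(sentence) is not None
```

Under these assumptions, `has_standalone_there("there's a xxx")` is false while `has_standalone_there("there")` is true.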

thanks for any hint !

AlexHarker's icon

The caret (^) acts as a NOT operator, but only when it is the first character inside a pair of square brackets [].

I forget exactly how regexp works a lot of the time so whenever I want to write a regexp I end up here:

HTH

Alex

FRid's icon

I'm not really familiar with regexp, but it seems to me you have to include the spaces in front of and after "there" so it will only match stand-alone occurrences of the word.

Like this; " there "

Good luck!

FRid

dobyhal's icon

hi, alex
thanks a lot for your reply. only ... it doesn't work ;-)

the problem here is that with the caret in brackets, regexp looks for an extra character after "there", whereas i want it to match when "there" stands on its own and not match "there's"

i.e. the formula

============================
regexp (\bthere[^']\b)
============================

wouldn't match "there" anymore
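The reason is that `[^']` is not a pure condition: it has to consume one real character that is not an apostrophe. A small sketch in Python's `re` module (not Max itself; the helper name `matches` is made up) shows the effect.

```python
import re

# [^'] must match exactly one character, so "there" at the end of the
# input has nothing left for it to consume and the match fails.
PATTERN = re.compile(r"\bthere[^']\b")

def matches(sentence: str) -> bool:
    """Return True if the attempted pattern matches anywhere in the sentence."""
    return PATTERN.search(sentence) is not None

checks = {s: matches(s) for s in ["there", "there is", "there's a xxx", "theres"]}
```

The pattern does reject "there's", but it also rejects a bare "there" and accepts "theres", because the extra character is swallowed before the closing `\b`.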

mmmhhh....

AlexHarker's icon

You probably need to roll your own version of \b possibly making use of the pipe operator to match several slightly different scenarios.

Start by enumerating *exactly* when you want it to match and when not.

This might work:

(\At|[^'\w]t)her(e\Z|e[^'\w])

This should match at the start of the input or after a non-word character except '
Same kind of thing at the end.

Might have got it wrong though - haven't tested it.
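For anyone who wants to try the pattern outside Max, here is a quick check in Python. Python's `re` module supports `\A` and `\Z` as well, but this is only a sketch against a different regex engine, and the function name `alex_match` is made up.

```python
import re

# Alex's pattern, copied unchanged: "there" must be preceded by the start of
# the input or by a character that is neither a word character nor an
# apostrophe, and followed by the end of the input or the same kind of
# character.
ALEX = re.compile(r"(\At|[^'\w]t)her(e\Z|e[^'\w])")

def alex_match(sentence: str) -> bool:
    """Return True if Alex's hand-rolled boundary pattern matches."""
    return ALEX.search(sentence) is not None

outcome = {
    s: alex_match(s)
    for s in ["there", "is anybody there", "over there, look",
              "there's a xxx", "a theremin", "weathered wood"]
}
```

Note that the match itself includes the neighbouring character (for example " there,"), so if you need the bare word you would have to capture it separately.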

A.

dobyhal's icon

yeah ! that really seems to work ! i haven't done any heavy testing yet, but the example strings behave like they should!

thanks a lot, alex !

(but hey, what a monster of regexp code, really ugly to look at ...)

ciao

oliver

seejayjames's icon

you also *might* be able to do it with numbers through [atoi]. Run the whole string through [atoi] and you'll get a list of the "number-chars", then try [zl sub] to compare the list to the word "there" (also run through [atoi], which gives 32 116 104 101 114 101 32, including spaces at beginning and end). So you'd be looking for that group of values in the master list, and [zl sub] will give you "found: 1" and "position: 10" or wherever.
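The same idea can be sketched in Python: turn both strings into lists of character codes and look for one list inside the other. This only imitates what [atoi] and [zl sub] are described as doing above; the function names are hypothetical and the position is counted from 0 here, which may differ from what [zl sub] reports.

```python
def to_codes(text: str) -> list[int]:
    """Convert a string to its character codes, roughly what [atoi] outputs."""
    return [ord(ch) for ch in text]

def find_sublist(haystack: list[int], needle: list[int]) -> int:
    """Return the 0-based index of the first occurrence of needle, or -1."""
    n = len(needle)
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1

# " there " with a space on each side, as in the post above
needle = to_codes(" there ")
position = find_sublist(to_codes("is anybody there now"), needle)
```

Because the search list includes the surrounding spaces, a "there" at the very start or end of the sentence, or one followed by punctuation, is not found this way.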

ShelLuser's icon

Don't bother with the \b and stuff..

If you only want to check for "there" in a sentence then all you need to do is tell the regexp that it should look for "(something)there(something)", where 'something' obviously is whatever it is; it's not of your concern.

So: "^.*there.*$" (without the quotes, but see snippet below).

This means: From the start of the sentence (^) check for whatever kind of character (.* where . is any character and * means "zero or more appearances") until you get to a point where some letter combination needs to be there ('there'). After this combination any kind of characters can appear until the end of the sentence ($).

And because you want to pinpoint 'there', you wrap it in () so that it becomes a capture group that can be back-referenced.


Alas, here's my proof of example:

Edit: The trigger object doesn't have to be there, but I used it here to make my example as easy to understand as possible.

Luke Hall's icon

That last example will match "theremin" and "weathered" though. You probably need to include all the punctuation you don't mind preceding/following the word in non-capturing brackets (or the option of it being the very first or last character) like so:

regexp (?:[\s({[]|^)(there)(?:[]\s}).,!?';:]|$)
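To compare the two suggestions side by side, here is a sketch in Python. The loose pattern is written as `^.*(there).*$`, following the explanation given for it above. The literal square brackets inside the character classes are backslash-escaped, which is a change made here only to keep Python's `re` module from warning about them. The names `LOOSE`, `STRICT` and `captured` are made up.

```python
import re

# "Anything, then there, then anything", with the word in a capture group.
LOOSE = re.compile(r"^.*(there).*$")

# The punctuation-list pattern, with [ and ] escaped inside the classes.
STRICT = re.compile(r"(?:[\s({\[]|^)(there)(?:[\]\s}).,!?';:]|$)")

def captured(pattern: re.Pattern, sentence: str):
    """Return the captured word if the pattern matches, otherwise None."""
    m = pattern.search(sentence)
    return m.group(1) if m else None

loose_hits = {s: captured(LOOSE, s) for s in ["a theremin", "weathered wood"]}
strict_hits = {
    s: captured(STRICT, s)
    for s in ["is anybody there", "(there)", "a theremin",
              "weathered wood", "there's a xxx"]
}
```

One thing this makes visible: the list of allowed trailing characters contains an apostrophe, so "there's a xxx" is still accepted. Taking the `'` out of that list would exclude it.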

dobyhal's icon

hi, regexp gurus !

thanks a lot for joining into this discussion and for all your nice help. now i already gained way more inside into the whole thing than i had before.

concerning the simplicity of the task i always re-learn that regular expression is really quite a beast to tame ...

cheers, guys !

oliver

dobyhal's icon

sorry, should have been "insight"

greetings from the land of PISA loosers ;-)