Pick words with specific letters from a text
I want to find words in a text that contain certain letters and output them in a separate list or coll.
The idea is to input the text, choose a letter, and then have an output of all the words that have those letters within the text.
[regexp] seems to be a very powerful tool to handle this kind of stuff.
I also discovered the [atoi] and [itoa] objects, which can compare words by turning them into integers.
[coll] seems to be an option too, but I can't figure out how to look for elements within all the lists I have collected in the object.
Either I'm missing something really simple and basic, or this is a task that is harder than it seems.
I will really appreciate your help! :D
Maybe something like this? Good luck and have fun,
Hens Zimmerman
Hi Hens!
Thank you very much for your reply. That's really helpful. I was wondering if there is also a way to be able to choose a particular letter in the [regexp] expression via a slider or something like that.
I just found out about regexp a couple days ago and its syntax still puzzles me.
Sure, you were on the right path with [itoa]:
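For readers following along outside Max, here is a rough Python sketch of the same idea: a number (such as a slider value) is turned into a character the way [itoa] would, dropped into a pattern the way [sprintf] fills a placeholder, and the pattern then picks out the matching words. It is only an analogue under those assumptions, not a transcription of the patch; the function name `words_with_letter` and the sample text are made up for illustration.

```python
import re

def words_with_letter(text, code):
    """Return the words in `text` that contain the letter with character code `code`."""
    letter = chr(code)  # like [itoa]: integer -> character
    # Like [sprintf]: fill the chosen letter into a pattern template.
    # \b marks a word boundary, \w* allows any word characters around the letter.
    pattern = r"\b\w*%s\w*\b" % re.escape(letter)
    return re.findall(pattern, text, flags=re.IGNORECASE)

text = "Pick words with specific letters from a text"
print(words_with_letter(text, 99))   # 99 is the character code for "c"
print(words_with_letter(text, 116))  # 116 is the character code for "t"
```

The `re.escape` call is a precaution in case the chosen code maps to a character that has a special meaning in regular expressions, such as a dot or a bracket.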
@hz37 very nice... any chance you could state the [sprintf] and [regexp] args in simple language to get a better sense of what they are doing? My backreference chops are weak.
@hz37 Oh wow! That's quite an elegant solution. It took me quite some time but I was able to achieve the same using combine and prepend objects.
And I agree with @metamax, a little explanation would be really nice. [sprintf] seems to be able to take variables like symbols or integers/floats whereas [regexp] seems to work better with messages, but both use the same syntax. I've seen tutorials (https://cycling74.com/tools/regular-expression-tutorial/#.V9jIgjZUNOo) and other references, but the syntax seems quite complex.
Thanks a lot!
HZ37, thanks - what a fantastic patch, giving me some ideas for doing more work with text in Max.
Happy to be of service.
[sprintf] is a kind gesture from Cycling '74 to anyone who comes from a C programming background. It prints formatted text with arguments to a buffer. You can supply a mixture of text and placeholders, and [sprintf] will fill in the placeholders. So for instance, if I were to enter the age of my cat as a variable, I could say:
[sprintf "My cat is %d years old"]
You then connect a number to [sprintf] and that number will take the place of %d. Similarly, %s can be used for text:
[sprintf "You are %s, and that's why I will %s you"]
The latter takes two string arguments. Of course you can freely mix all kinds of placeholders and they have extra formatting options as well. You can find references on the internet about your options.
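To see the placeholders at work in a text language, Python's `%` operator follows the same C-style conventions. This is a small sketch, not Max code: the values (`7`, `"kind"`, `"thank"`) are invented for illustration, and [sprintf] in Max receives its values through inlets rather than in one expression.

```python
age = 7  # hypothetical value; in Max this number would arrive at the inlet

# %d is replaced by an integer
cat_line = "My cat is %d years old" % age

# two %s placeholders take two pieces of text, in order
thanks_line = "You are %s, and that's why I will %s you" % ("kind", "thank")

# extra formatting options: at least 5 characters wide, 2 digits after the decimal point
padded = "%5.2f" % 3.14159

print(cat_line)
print(thanks_line)
print(repr(padded))
```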
As for [regexp], that's like learning a new language. The cool thing is, if you understand regular expressions, you can use them in most computer languages. C, C++, Perl, PHP, Max, Python, they all support regular expressions. And they are so super useful. The input string of [regexp] can look pretty complex, so it helps to put in comments to explain your thinking. For instance, this is a real world regular expression from something I wrote, albeit in C++:
^TIME\\s*CODE\\s*FORMAT:\\s+(\\d+)\\s+.*$
It says: from the beginning of the line of text (^), find the word TIME in caps, followed by zero or more (*) whitespace characters (\\s), followed by the word CODE in caps, followed again by zero or more whitespace characters, followed by the word FORMAT and a colon, followed by one or more (+) whitespace characters, followed by one or more digits (\\d+), which we want to capture, so we put them between parentheses, followed by at least one whitespace character, then zero or more characters of any kind (.*), and then the end of the line ($).
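Here is a quick way to try that expression, sketched in Python. The sample lines are hypothetical. Note that the doubled backslashes above are only there because the pattern sits inside a C++ string literal; in a Python raw string each one is written once.

```python
import re

# Same expression as above, with single backslashes inside a raw string.
pattern = re.compile(r"^TIME\s*CODE\s*FORMAT:\s+(\d+)\s+.*$")

def frame_rate(line):
    """Return the captured digits as text, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(1) if m else None

print(frame_rate("TIMECODE FORMAT: 25 Frame"))        # no space between TIME and CODE
print(frame_rate("TIME CODE FORMAT:    30 Drop"))     # extra whitespace is tolerated
print(frame_rate("TIMECODE FORMAT: 25"))              # nothing after the digits
```

The last line does not match because the expression insists on at least one whitespace character after the digits.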
Why bother?
1) You can use a regexp to match a line, so you recognize a certain line of text, e.g. marking the beginning of a section. In that case you won't need to pick out a certain character or more characters for further processing. The line of text functions as a semaphore.
2) To pick out specific parts of a line of text. In this case you will put certain things between parentheses and use one of the outlets of [regexp] to extract those fields. The [regexp] help has a nice example of extracting the numbers of an IP address. By including all the other things you encounter in a line of text, you sort of do two things at once: match the line of text and extract the parts you're interested in.
3) Almost any implementation of regular expressions will also let you substitute. A bit like [sprintf], where you supply an element that needs to be replaced by something else.
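The three uses can be sketched side by side in Python. The sample lines and the `[section]` marker are invented for illustration, and the IP pattern is a simplified one of my own, not the one from the [regexp] help file.

```python
import re

header = "[section] intro"
status = "host 192.168.0.10 is up"

# 1) Match: only ask whether the line looks like a section marker.
is_header = re.match(r"^\[section\]", header) is not None

# 2) Extract: parentheses capture the four numbers of the address.
m = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", status)
octets = [int(g) for g in m.groups()]

# 3) Substitute: replace whatever matches with something else.
masked = re.sub(r"\d+\.\d+\.\d+\.\d+", "x.x.x.x", status)

print(is_header)
print(octets)
print(masked)
```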
You can find whole books about regular expressions, e.g. the O'Reilly owl book: http://shop.oreilly.com/product/9780596528126.do
Regular expressions are so super useful that they are like owning a hammer and finding all kinds of things that need to be solved with a hammer. Start gentle with a tutorial like this: http://www.regular-expressions.info/quickstart.html. And before you know it, your needs grow and you find out you can match word boundaries, character sets, everything except a certain something, etc.
Have fun!
Hens Zimmerman
How did I miss this reply?! Thank you, Hens. That was well-crafted. I am enriched.