fairesigneaumachiniste

Standard C does not come with a regular expressions library (as far as I know), so what would be the best way to implement the expressions I currently use in the regexp object?

[regexp (-M|ms) @substitute " "] - substitutions

There may be a way to make it simpler. I want to search a list of various atoms (I have modified simplelist to do this), then match elements using regular expressions and output them in a different form.

So using the two regexp I can convert -M34.6ms to 34.6.

Perhaps this is possible using 'if then' but I can't think how to split the symbols up without regular expressions.


Here's a patch in Max which demonstrates roughly what I'm trying to do:

I want to write this as an object because I have lots of different atoms of varying formats to match, which would be criminally inefficient in Max (and I want to practice my dev skills! ;)

On Nov 25, 2008, at 3:56 AM, fairesigneaumachiniste wrote:

> As standard C does not come with a regular expressions library (as  
> far as I know) so what would be the best way to implement these  
> expressions I use in the regexp object:

I could suggest you use the PCRE library (http://www.pcre.org), which
is what the Max regexp object uses, but if the Max regexp object is
slow, then you will not see any performance improvement in your own
object which uses PCRE. So what would I suggest? A tight
string-walking function of your own. Or sscanf/strtok/etc. might be
useful if you have a decent expectation of what the input is like.
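As a rough sketch of the sscanf route, assuming every atom arrives as a string shaped like -M<number>ms (the format from the original question). The function name parse_ms_atom is made up for illustration and is not part of any Max API:

```c
#include <stdio.h>

/* Hypothetical helper: extract the number from a string shaped like
 * "-M34.6ms". Returns 1 on success and 0 if the input does not match. */
int parse_ms_atom(const char *src, double *value)
{
	char unit[3] = "";	/* room for "ms" plus the terminator */

	/* "-M" must match literally, %lf reads the number,
	 * %2s reads at most two non-space characters for the unit */
	if (sscanf(src, "-M%lf%2s", value, unit) != 2)
		return 0;
	return unit[0] == 'm' && unit[1] == 's';
}
```

Note that this does not check for trailing characters after "ms", so it only suits input whose shape you already trust.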

From your simple example, you can probably do something along the
lines of the following quick, typed-into-the-email-client code:

void stripnumber(char *dst, const char *src)
{
	char c;

	// keep numbers, spaces, and periods
	// strip everything else
	// note: dst must have room for strlen(src) + 1 bytes
	while ((c = *src++) != '\0') {
		switch (c) {
		case ' ':
		case '.':
		case '0':
		case '1':
		case '2':
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
		case '8':
		case '9':
			*dst++ = c; // copy char from input to output
			break;
		default:
			break; // skip
		}
	}
	*dst = '\0'; // null terminate output
}
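The function above trusts the caller to supply a large enough output buffer. A possible bounded variant of the same idea is sketched here, using isdigit from <ctype.h>. The name stripnumber_n and the dstsize parameter are additions for illustration, not part of the original code:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical bounded variant: copies digits, spaces and periods from src
 * into dst, writing at most dstsize bytes including the terminator.
 * Returns the number of characters kept. */
size_t stripnumber_n(char *dst, size_t dstsize, const char *src)
{
	size_t n = 0;

	if (dstsize == 0)
		return 0;
	for (; *src != '\0' && n + 1 < dstsize; src++) {
		unsigned char c = (unsigned char)*src;
		if (isdigit(c) || c == ' ' || c == '.')
			dst[n++] = (char)c;
	}
	dst[n] = '\0';	/* always terminate, even when truncated */
	return n;
}
```

With a 16-byte buffer, the input "-M34.6ms" should come out as "34.6".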

Here's a simple tutorial on working with strings in C:
http://www.eskimo.com/~scs/cclass/notes/sx8.html

sscanf, strtok, + other string tutorials:
http://crasseux.com/books/ctutorial/sscanf.html
http://www.gnu.org/software/libtool/manual/libc/Finding-Tokens-in-a-String.html
http://www.gnu.org/software/libtool/manual/libc/String-and-Array-Utilities.html

You can also get into finite automata implementations for regular
expressions, which can be much faster than the backtracking
algorithms that Perl and PCRE use. Here's one reasonably clear paper
on the subject, with code samples, if you're feeling really nerdy:
http://swtch.com/~rsc/regexp/regexp1.html
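To give a feel for the automaton approach, here is a minimal hand-coded state machine for one fixed pattern. This is not the construction from the paper, which builds the automaton from an arbitrary expression. The pattern -M[0-9]+(\.[0-9]+)?ms is an assumption about what the atoms look like, and all names are illustrative:

```c
#include <ctype.h>

/* States of a hand-built DFA for the pattern  -M[0-9]+(\.[0-9]+)?ms  */
enum state { START, DASH, PREFIX, INT, DOT, FRAC, UNIT_M, DONE, FAIL };

static enum state step(enum state s, unsigned char c)
{
	switch (s) {
	case START:  return c == '-' ? DASH : FAIL;
	case DASH:   return c == 'M' ? PREFIX : FAIL;
	case PREFIX: return isdigit(c) ? INT : FAIL;
	case INT:    return isdigit(c) ? INT : c == '.' ? DOT : c == 'm' ? UNIT_M : FAIL;
	case DOT:    return isdigit(c) ? FRAC : FAIL;
	case FRAC:   return isdigit(c) ? FRAC : c == 'm' ? UNIT_M : FAIL;
	case UNIT_M: return c == 's' ? DONE : FAIL;
	default:     return FAIL;	/* DONE or FAIL: any further input rejects */
	}
}

/* Returns 1 if the whole string matches, 0 otherwise. Each character is
 * examined exactly once, so there is no backtracking. */
int matches_ms_atom(const char *s)
{
	enum state st = START;

	while (*s != '\0' && st != FAIL)
		st = step(st, (unsigned char)*s++);
	return st == DONE;
}
```

The running time is linear in the length of the input no matter what the input contains, which is the property that makes automata attractive next to backtracking matchers.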

If you get deeper into string processing in C, you'll also need to pay
attention to the UTF-8 Unicode representation if you want to
handle non-ASCII characters.
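One small illustration of why UTF-8 matters: a character can occupy more than one byte, so strlen counts bytes rather than characters. The sketch below counts code points by skipping continuation bytes. It assumes the input is already valid UTF-8, and the name utf8_length is made up:

```c
#include <stddef.h>

/* Counts code points in a UTF-8 string by skipping continuation bytes
 * (those of the form 10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_length(const char *s)
{
	size_t count = 0;

	for (; *s != '\0'; s++) {
		if (((unsigned char)*s & 0xC0) != 0x80)
			count++;
	}
	return count;
}
```

Byte-by-byte filters that only look for ASCII characters, like the digit-stripping function above, stay safe on UTF-8 input because every byte of a multi-byte sequence has its high bit set and so can never be mistaken for an ASCII character.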

Hope this gets you started. If you have further questions about this  
stuff, I'd suggest you search online. Obviously lots of info out there.

I'll also add that on OS X there is a regex library installed. I believe it is the POSIX implementation; see regex.h.

That said, from what I have read, I would be careful as to how you use it because it can be slow. I have not witnessed this yet in my implementation of it in my external.

You can pull out substrings captured with () very easily, using the array of offsets to the matches that the library fills in. I then use the standard string functions to build up what I want.
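For anyone who has not used regex.h, the offsets approach can look roughly like this. It is a sketch under assumptions, not the code from my external: the helper name first_group is invented, and the pattern in the usage note is only a guess at the original poster's format.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copies the first parenthesized group of `pattern`
 * matched in `src` into dst (at most dstsize bytes, always terminated).
 * Returns 1 on a match, 0 otherwise. */
int first_group(const char *pattern, const char *src, char *dst, size_t dstsize)
{
	regex_t re;
	regmatch_t m[2];	/* m[0] = whole match, m[1] = first group */
	int found = 0;

	if (dstsize == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
		return 0;
	if (regexec(&re, src, 2, m, 0) == 0 && m[1].rm_so != -1) {
		size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
		if (len >= dstsize)
			len = dstsize - 1;	/* truncate to fit the buffer */
		memcpy(dst, src + m[1].rm_so, len);
		dst[len] = '\0';
		found = 1;
	}
	regfree(&re);
	return found;
}
```

Calling first_group("-M([0-9.]+)ms", "-M34.6ms", out, sizeof out) should leave "34.6" in out. In a real external you would probably call regcomp once when the object is created rather than on every message, since compiling the pattern is likely the expensive part.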

However, if efficiency is what you want, I'd follow the advice above and write very specific functions with the C string functions, which are ULTIMATELY going to be faster.



Substrings/substitutions/regular expressions in C