route message with a space

Garrett

Hi

I've been trying to use Jasch's detox object to parse an XML file. However, the XML is not under my control and is quite poorly formatted, with several missing closing tags, so the object won't parse it correctly.

I'm now investigating alternative ways to strip out the data I want from the XML. I've tried a jsui script, but I'm terrible at JavaScript and I'd rather avoid it if at all possible. I'm attempting to use a text object to traverse the XML line by line and search using route, but I'm not sure how to search for a string containing a space and double quotes (see patch below). How do I search for a tag like so

? I have tried strchr as well, but it just keeps crashing Max.

thanks in advance
Garrett

max v2;
#N vpatcher 10 59 884 670;
#P origin 0 -243;
#P window setfont "Sans Serif" 9.;
#P window linecount 1;
#P hidden message 429 291 22 196617 set;
#P hidden newex 377 290 48 196617 loadbang;
#P comment 243 183 71 196617 2) Click dump;
#P comment 81 56 307 196617 1) Copy and paste this into text;
#P number 193 162 35 9 0 0 0 3 0 0 0 221 221 221 222 222 222 0 0 0;
#P message 193 181 45 196617 line $1;
#P window linecount 0;
#P message 377 351 305 196617;
#P button 325 432 15 0;
#P button 307 432 15 0;
#P window linecount 1;
#P message 125 181 33 196617 clear;
#P message 159 181 33 196617 dump;
#P newex 158 212 40 196617 text;
#P newex 376 399 59 196617 print value;
#P newex 157 376 138 196617 route

;
#P newex 157 257 51 196617 route set;
#P newex 157 280 40 196617 t s b b;
#P window linecount 2;
#P comment 166 348 169 196617 3) The space here is the main issue , are the double quotes an issue?;
#P window linecount 4;
#P comment 80 75 599 196617

;
#P connect 6 0 3 0;
#P connect 3 0 2 0;
#P connect 2 0 4 0;
#P connect 7 0 6 0;
#P connect 8 0 6 0;
#P connect 12 0 6 0;
#P connect 13 0 12 0;
#P connect 4 0 9 0;
#P connect 4 1 10 0;
#P connect 11 0 5 0;
#P connect 6 0 11 0;
#P hidden connect 17 0 11 0;
#P connect 9 0 11 0;
#P hidden connect 16 0 17 0;
#P pop;
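
For anyone who does end up going the [js] route, the line-by-line search described above can be sketched in plain JavaScript. This is only an illustration under assumptions: `findLines` and the sample text are made up for the example, the `div` tag name is taken from later in the thread, and wiring the result to Max inlets and outlets is left out. It shows that in an ordinary string search the space and the double quotes need no special treatment, and that missing closing tags don't matter because nothing is parsed.

```javascript
// Hypothetical helper: return the 0-based indices of the lines containing `needle`.
// The needle is an ordinary string, so spaces and double quotes are just characters.
function findLines(text, needle) {
  var lines = text.split(/\r?\n/);
  var hits = [];
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].indexOf(needle) !== -1) {
      hits.push(i);
    }
  }
  return hits;
}

// Made-up sample input with an unclosed <p>, which a strict parser would reject.
var sample = [
  '<feed>',
  '<div id="feedHeader" class="top">',
  '<p>unclosed paragraph',
  '</div>'
].join('\n');

// Single quotes around the needle let the double quotes appear unescaped.
var hits = findLines(sample, '<div id="feedHeader"');
```

A plain substring search like this only finds the tag when the attributes appear in exactly that order and spacing, which is the limitation the replies below address with regular expressions.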

Luke Hall

With [route] you'd need to use two in parallel: [route

Instead I'd recommend using the [regexp] object. It's a bit hard to wrap your head around, but you can specify words or patterns of characters that it will match. You can also use regular expressions in JavaScript with the [js] object if that's the way you end up going.

For example in your patch [regexp "

but it wouldn't match if there were no other attributes or they were in a different order. If you want to match any

tag with id="feedHeader" in it anywhere you could use [regexp "

|>)"] which seems to work OK even though it is a bit of a mess. I'd recommend searching the Max forums, as there are a few helpful [regexp] discussions. If you come across any more problems I'd be happy to take a look.

lh
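
To illustrate the idea of matching a tag with id="feedHeader" anywhere among its attributes, here is a rough JavaScript sketch. It is not the pattern from the post above (that one did not survive in the thread); the `div` tag name is an assumption based on the follow-up post, and `feedHeaderTag` and `hasFeedHeaderDiv` are names invented for the example. The pattern is written as a JavaScript literal, and how it would need to be escaped as a [regexp] argument is not covered here.

```javascript
// Assumed pattern: an opening <div ...> tag whose attributes include id="feedHeader".
//   <div\s          the tag name followed by whitespace
//   (?:[^>]*\s)?    optionally, other attributes before the id (must end in whitespace)
//   id="feedHeader" the attribute itself, double-quoted
//   [^>]*>          any remaining attributes, then the end of the tag
var feedHeaderTag = /<div\s(?:[^>]*\s)?id="feedHeader"[^>]*>/;

// Hypothetical helper: true if the line contains such a tag.
function hasFeedHeaderDiv(line) {
  return feedHeaderTag.test(line);
}

var first = hasFeedHeaderDiv('<div id="feedHeader">');
var later = hasFeedHeaderDiv('<div class="top" id="feedHeader" style="x">');
var other = hasFeedHeaderDiv('<div class="top">');
```

Requiring whitespace directly before `id` keeps attributes such as `data-id` from matching by accident.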

Garrett

Hi

Thanks for your response. You confirmed one of my worst fears (even more so than JavaScript): regular expressions. I've done some before in PHP and they are a nightmare, but I'm going to bite the bullet and tackle this tomorrow when I'm more alert. I may be back to take you up on further advice. Many thanks.

Garrett

Luke Hall

A while ago I started writing a little tutorial/explanation/example patch in Max but still haven't got round to finishing it. If you want to have a look at it as it is, you can find it here. I hope it helps.

lh

Garrett

Hi

Thanks for the link, but I'm on Max 4.6.3 so I can't look at the patch. I think I have my solution now, cobbled together from some of your work, some of mine and a patch I found on the forum. Here is my test, in case it's useful for anyone else using regexp to parse HTML or XML. At the moment it searches for a div with a specific id attribute anywhere in that div; this could be changed to any tag and/or attribute quite easily. It is also quite forgiving about double quotes, single quotes or no quotes around attribute values (there are some really bad web designers/programmers out there).

Thanks for your help and suggestions.

Garrett

Max Patch
Copy patch and select New From Clipboard in Max.
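
The behaviour described above (any tag, any attribute, and double, single or no quotes around the value) could be approximated in JavaScript roughly as follows. This is a sketch under assumptions, not a transcription of the patch: `escapeRegExp` and `makeTagMatcher` are hypothetical names, and the exact pattern used in the patch is not known from the thread.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a RegExp that finds an opening <tag ...> whose
// attribute `attr` equals `value`, wherever that attribute sits in the tag.
// The value may be double-quoted, single-quoted or unquoted.
function makeTagMatcher(tag, attr, value) {
  var v = escapeRegExp(value);
  var source =
    '<' + escapeRegExp(tag) + '\\s(?:[^>]*\\s)?' +   // tag name, optional earlier attributes
    escapeRegExp(attr) + '\\s*=\\s*' +               // attribute name and '='
    '(?:"' + v + '"|\'' + v + '\'|' +                // "value" or 'value'
    v + '(?=[\\s/>]))';                              // bare value, ended by space, '/' or '>'
  return new RegExp(source);
}

var matcher = makeTagMatcher('div', 'id', 'feedHeader');

var doubleQuoted = matcher.test('<div class="top" id="feedHeader">');
var singleQuoted = matcher.test("<div class=top id='feedHeader'>");
var unquoted = matcher.test('<div id=feedHeader>');
```

Switching to another tag or attribute is then just a matter of calling `makeTagMatcher` with different arguments.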


Dg

Hi,

This might help: the dot lib has an interesting patch, dot.xmlread2, which uses native objects to parse XML tags.