The absolute best XML parser for Max/MSP was created by Ádám Siska and is available from http://www.sadam.hu/?q=node/7. As it wraps the RapidXML parser, it is blazingly fast; RapidXML was selected after comparing many other parsers. The object is 99% W3C-compliant (in contrast to detox) and very robust.
In a personal communication, Ádám wrote:
I can't see the rapidXML object either. There is this separate folder rapidxml.1.13 including .hpp files, which I don't know what to do with.
I would be glad to try this library out!
Thank you all for your answers, which will give me some work ; )
yes, the [sadam.rapidXML] is going to be included in the next release of my library. I'm still doing some small fixes here and there (documentation, Windows builds etc), but probably it will be available in less than two weeks.
no, unfortunately there's no 'direct' command that could do that for you. You can, however, save the XML to a file on the disk (I would use a temporary file for that purpose) and pass the file's name to the part of your program that needs to access the same XML. You can also send a getTree message to sadam.rapidXML and collect the output messages, which you can use later to build the XML again using another instance of sadam.rapidXML. This might be the way to go if you can't solve the problem with a local temporary file, for instance, because you need to transmit the XML to another machine. If this is your case, you might catch the messages with the sadam.lzo object (since you're already using my library, it should already be installed) and send the messages compressed through the network. Since getTree gives you all the information about the XML, it should be more-or-less straightforward to rebuild the whole XML based on those messages.
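As a rough illustration of the round trip described above (serialize, compress, send, rebuild), here is a minimal Python sketch. It does not use Max or the sadam library: Python's standard xml.etree.ElementTree stands in for sadam.rapidXML, zlib stands in for sadam.lzo (LZO is a different compression algorithm), and the score fragment and the function names pack and unpack are made up for this example.

```python
import xml.etree.ElementTree as ET
import zlib


def pack(root: ET.Element) -> bytes:
    """Serialize an element tree to UTF-8 XML and compress it for transmission."""
    return zlib.compress(ET.tostring(root, encoding="utf-8"))


def unpack(payload: bytes) -> ET.Element:
    """Decompress a payload produced by pack() and parse it back into a tree."""
    return ET.fromstring(zlib.decompress(payload))


# Hypothetical score fragment, used only for illustration.
source = ET.fromstring(
    '<score tempo="120">'
    '<note pitch="60" dur="0.5"/>'
    '<note pitch="62" dur="0.5"/>'
    '</score>'
)

payload = pack(source)      # bytes that could be written to a file or sent over a socket
rebuilt = unpack(payload)   # tree reconstructed on the receiving side

# The rebuilt tree serializes to the same XML as the original one.
same = ET.tostring(rebuilt) == ET.tostring(source)
print(same)
```

The same idea applies whether the intermediate form is a temporary file or a network message: as long as the serialized form carries the whole tree, the receiving side only needs a parser to get the structure back.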
Indeed, this is something that didn't come to my mind when I created this object, but could be a useful feature. I might extend the external with some easier way to serialize the information contained by it. However, since I'm developing the whole library in my spare time, I can't say when this could happen.
it's me again. I had bad dreams about not having included this very simple and evident feature in my original release. So, I added it this morning. Attached is the new version (compiled for Mac OS X Intel machines). Now there's a 'serialize' message which will output the full XML as a single message on the rightmost outlet (a separate outlet was added for this functionality on the rightmost side, so it shouldn't affect old patches using sadam.rapidXML). Documentation, compiled versions for other systems (Windows, PPC) and so on will come with the next release of the sadam library.
as you probably already know, the Max SDK includes an API for XML parsing (ext_xmltree.h).
Certainly from a 3rd party developer's point of view it's a lot more convenient to use ext_xmltree.h, since it's already integrated in the API, but did you ever try to wrap the ext_xmltree.h functions into an external? Did you ever make any comparison with rapidXML?
I know it's undocumented but I figured out how to use it... :)
It's a pretty simple parser if you ask me, probably based on a DOM implementation, which would fit the Max object-oriented paradigm perfectly.
But I think when it comes to XML Siska has more experience than I have, so I just wanted to ask his opinion... my curiosity...
unfortunately I didn't make a comparison to the routines included in ext_xmltree. In fact, when I developed this object, I didn't even know that ext_xmltree existed at all, since it was not mentioned anywhere in the SDK docs. On the other hand, in the original commission for this object (by the MaxScore team) overall speed was of key importance, therefore I decided from the very first moment to use a third-party solution that had already been used and tested by thousands of users and reported to be robust and fast. I compared around 15 parsers (among others RapidXML, VTD-XML, IrrXML, TinyXML/TiCPP, NanoXML, ezXML, libXML, ExPat, xmlPull API, FastXML, PugiXML, gSOAP, xmlLite, Xerces etc.) before choosing RapidXML, since at that time every benchmark reported RapidXML to be among the fastest ones and, on the other hand, it was very tiny and easy to include in an external (it consists of only 4 header files, which is nothing compared to the size of, for example, Xerces). Some benchmarks actually reported that the parsing speed of RapidXML is comparable to the time that a call to strlen() takes on the same input stream... Well, to be honest, I didn't benchmark these tools myself, but believed what I found on the web. I know, it's a bad habit, but I just didn't have the time to build a dozen parsers and benchmark all of them one by one.
I also decided to use a third-party tool because XMLs can be tricky sometimes, and I wanted to have something that won't crash with a malformed XML or with XMLs with more sophisticated things, like CDATA sections and so on. Since RapidXML is used by HTC in their phones, which means that it was probably tested by millions of users, I simply decided to trust it in this sense as well.
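To make the two concerns above concrete (malformed input and CDATA), here is a small Python sketch of the behaviour one would want from any parser: a CDATA section comes back as plain text, and a malformed document produces an error that can be handled instead of a crash. This uses Python's standard xml.etree.ElementTree only as an illustration; it says nothing about how RapidXML or sadam.rapidXML report errors, and the helper try_parse and both sample documents are invented for the example.

```python
import xml.etree.ElementTree as ET


def try_parse(text):
    """Return (root, None) on success or (None, error message) on malformed XML."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        return None, str(err)


# Well-formed document with a CDATA section: markup characters inside survive as text.
good, good_err = try_parse("<lyric><![CDATA[a < b & c]]></lyric>")

# Malformed document: <note> is never closed before </score>.
bad, bad_err = try_parse("<score><note></score>")

print(good.text)
print(bad is None, bad_err)
```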
the XML standard itself defines all attributes as strings. If you look carefully at any XML, you will always find that attribute values are enclosed in quotation marks. This is something that I implemented directly in the object. On the other hand, it would be more 'Maxish' if the object simply translated every integer and/or float value to a string when it comes to attribute values. To be honest, this is something that slipped my mind when I created the object. But I'll take a note and add this as a feature to the next release. Until then, I suggest you convert your numbers to symbols (using the [tosymbol] object) and use those symbols as attribute values.
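The same rule shows up in other XML tools, which may help explain the workaround. In the Python sketch below (standard xml.etree.ElementTree, not the Max object), numbers are converted to text before being stored as attribute values, much like passing them through [tosymbol], and they come back as strings that the caller converts explicitly. The helper set_numeric_attr and the note element are hypothetical names used only here.

```python
import xml.etree.ElementTree as ET


def set_numeric_attr(elem, name, value):
    """Store a number as an attribute by converting it to text first."""
    elem.set(name, str(value))


note = ET.Element("note")
set_numeric_attr(note, "pitch", 60)    # int becomes the string "60"
set_numeric_attr(note, "dur", 0.5)     # float becomes the string "0.5"

xml_text = ET.tostring(note, encoding="unicode")
print(xml_text)

# Reading back always yields strings; converting to numbers is the caller's job.
parsed = ET.fromstring(xml_text)
pitch = int(parsed.get("pitch"))
dur = float(parsed.get("dur"))
```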