Download text from Wikipedia

October 21, 2012 | 7:14 am

Hello

I would like to download some text from Wikipedia and get it into a message box.
I’ve looked at several posts on the forum, but I can’t figure out how to use jit.uldl properly (I always get error -1).
Any help would be really appreciated.

thank you

PB


October 21, 2012 | 8:19 am

Hi,

here’s a short example using [sadam.tcpClient] (see http://cycling74.com/forums/topic.php?id=42930 ) that will download the page about Giovanni Pierluigi da Palestrina from the English Wikipedia. Unfortunately this will give you the whole HTML document, so you’ll need some additional processing there to get the actual text.

[Max patch pasted in the original post]

Hope this helps,
Ádám


October 21, 2012 | 8:36 am

Thanks a lot, Ádám.
Unfortunately, the "additional processing" looks really tricky to me…
But I will work on it :)
Thanks again

PB


October 21, 2012 | 10:14 am

Hi again,

here’s a more advanced example. It first searches for a tag and discards everything that comes before it. Then it parses the HTML with [sadam.rapidXML], which gives you structured access to the HTML document. XML might not be the best representation for your purpose (for example, formatting commands are parsed as separate XML elements, as you will probably see), but it should give you a good starting point.

[Max patch pasted in the original post]
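
For readers who want to try the same two steps outside Max, here is a minimal Python sketch. It is not a translation of the patch: it uses the standard library's `html.parser` rather than [sadam.rapidXML], and the names `TextCollector` and `text_after_marker`, as well as the marker string in the usage example, are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_after_marker(html, marker):
    """Discard everything before `marker`, then return the remaining text."""
    start = html.find(marker)
    if start == -1:
        start = 0  # marker not found: fall back to the whole document
    parser = TextCollector()
    parser.feed(html[start:])
    parser.close()
    return " ".join(parser.chunks)


sample = (
    '<html><body><div id="sidebar">menu</div>'
    '<div id="content"><p>Giovanni <b>Pierluigi</b> da Palestrina</p>'
    "<script>var x = 1;</script></div></body></html>"
)
print(text_after_marker(sample, '<div id="content"'))
# prints: Giovanni Pierluigi da Palestrina
```

Because every text chunk is stripped and then joined with a single space, punctuation that directly follows a formatted word may end up preceded by a space; that would need extra handling for real pages.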

HTH,
Ádám


October 21, 2012 | 11:17 am

really great! it will help me a lot!!

many thanks

PB


October 21, 2012 | 5:47 pm

Hi,

and here’s another version. This one directly fetches the printer-friendly version of the page, without the sidebar and other Wikipedia-related elements, so there is a lot less to deal with. However, the links are still structured as XML elements, which might still cause some headaches…

[Max patch pasted in the original post]
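
To illustrate the headache with links: in an XML tree, a sentence containing links is split into the paragraph's own text, the text inside each link element, and the text that follows each link. A minimal Python sketch of gluing these back together is shown here, assuming the fragment is well-formed XML (real HTML often is not). The function name `flatten_text` and the sample fragment are made up for illustration.

```python
import xml.etree.ElementTree as ET


def flatten_text(xml_fragment):
    """Return all text inside the fragment, with link/formatting tags dropped."""
    root = ET.fromstring(xml_fragment)
    # itertext() yields the text of the element and of all nested elements,
    # in document order, including the text that follows each child.
    text = "".join(root.itertext())
    # Collapse runs of whitespace left over from the markup layout.
    return " ".join(text.split())


fragment = (
    '<p>Born in <a href="/wiki/Palestrina">Palestrina</a>, '
    'he worked in <a href="/wiki/Rome">Rome</a>.</p>'
)
print(flatten_text(fragment))
# prints: Born in Palestrina, he worked in Rome.
```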

HTH,
Ádám


October 22, 2012 | 7:42 am

thanks a lot Ádám!


October 26, 2012 | 9:41 pm

Hi,

how about an .aspx or .php address?


October 27, 2012 | 9:04 am

Hi,

there should be no difference. The last example that I posted already executes a PHP query on Wikipedia. The point is that you always send a ‘GET’ command to the web server (this is widely documented on the web, with examples) and then process whatever answer you get back.
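
As a rough illustration of what such a ‘GET’ command looks like on the wire, here is a minimal Python sketch that builds the request text and splits a raw answer into status line and body. The helper names `build_get_request` and `split_response` are hypothetical, and the path in the usage example is only an assumed form of a PHP-style query, not necessarily the one used in the patch. The sketch asks for HTTP/1.0 so that the server does not answer with chunked encoding.

```python
def build_get_request(host, path):
    """Build the bytes of a plain HTTP GET request for `path` on `host`."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split a raw HTTP response into (status line, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("iso-8859-1")
    return status_line, body


# Hypothetical query path, for illustration only.
request = build_get_request("en.wikipedia.org", "/w/index.php?title=Example&printable=yes")
print(request.decode("ascii"))

status, body = split_response(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>")
print(status)  # prints: HTTP/1.0 200 OK
```

The request bytes are what a TCP client would send to the server; whether the address ends in .php, .aspx or .html only changes the path. Note that many sites now redirect plain HTTP to HTTPS, in which case a raw TCP client will only receive a redirect answer.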

HTH,
Ádám

