download text from wikipedia

    Oct 21 2012 | 7:14 am
    I would like to download some text from Wikipedia and get it into a message box. I've looked at several posts on the forum, but I can't figure out how to use jit.uldl properly (I always get error -1). Any help would be really appreciated.
    thank you

    • Oct 21 2012 | 8:19 am
      Here's a short example using [sadam.tcpClient] (see ) that will download the page about Giovanni Pierluigi da Palestrina from the English Wikipedia. Unfortunately, this will give you the whole HTML document, so you'll need some additional processing to extract the actual text.
      Hope this helps, Ádám
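For readers without the patch at hand, here is a rough Python sketch of the same idea: open a TCP connection, send a plain-text HTTP GET request, and read back the whole response. This is not the patch from the post. The function names, the `User-Agent` string, and the article path are assumptions made for illustration, and TLS is switched on by default because Wikipedia nowadays redirects plain HTTP to HTTPS.

```python
import socket
import ssl
from typing import Tuple


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "User-Agent: forum-example/0.1",  # hypothetical client name
        "Connection: close",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes) -> Tuple[bytes, bytes]:
    """Split a raw HTTP response into (headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


def fetch(host: str, path: str, use_tls: bool = True, timeout: float = 10.0) -> bytes:
    """Send the GET request over a TCP socket and return the raw response."""
    port = 443 if use_tls else 80
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if use_tls:
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response complete
                break
            chunks.append(data)
        sock.close()
    return b"".join(chunks)


if __name__ == "__main__":
    # Assumed article path; performs a real network request when run directly.
    raw = fetch("en.wikipedia.org", "/wiki/Giovanni_Pierluigi_da_Palestrina")
    headers, body = split_response(raw)
    print(headers.decode("ascii", errors="replace"))
    print(len(body), "bytes of HTML")
```

As in the patch, the body is still the complete HTML document, so the text has to be extracted in a separate step.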
    • Oct 21 2012 | 8:36 am
      Thank you a lot, Ádám. Unfortunately, the "additional processing" looks really tricky for me... But I will work on it :) Thanks again.
    • Oct 21 2012 | 10:14 am
      Hi again,
      Here's a more advanced example. It first searches for a tag and discards everything that comes before it. Then it parses the HTML with [sadam.rapidXML], which gives you structured access to the HTML document. XML might not be the best representation for your purposes (as you will probably see, formatting commands, for example, are parsed as separate XML elements), but it should give you a good starting point.
      HTH, Ádám
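The two steps described above (discard everything before a marker tag, then walk the markup and collect the text) can be sketched in Python as follows. This swaps in the standard library's `html.parser` for [sadam.rapidXML]. The marker `<body`, the class name `TextCollector`, and the helper names are assumptions chosen for illustration, since the post does not say which tag the patch searches for.

```python
from html.parser import HTMLParser


def drop_before(html: str, marker: str) -> str:
    """Discard everything that precedes the first occurrence of marker."""
    idx = html.find(marker)
    return html if idx < 0 else html[idx:]


class TextCollector(HTMLParser):
    """Collect text content, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Formatting elements such as <i> or <a> arrive as separate
        # start/end events; their text still passes through here.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str, marker: str = "<body") -> str:
    """Return the visible text after marker, with whitespace collapsed."""
    parser = TextCollector()
    parser.feed(drop_before(html, marker))
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = (
        "<html><head><title>T</title></head><body>"
        "<p>Giovanni <i>Pierluigi</i> da <a href='/x'>Palestrina</a></p>"
        "</body></html>"
    )
    print(html_to_text(sample))
```

Flattening everything to a single string loses the structure that an XML tree keeps, which is fine for filling a message box but not if you need the links or headings separately.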
    • Oct 21 2012 | 11:17 am
      really great! it will help me a lot!!
      many thanks
    • Oct 21 2012 | 5:47 pm
      And here's another version. This one directly fetches the printer-friendly version of the page, without the sidebar and other Wikipedia-related stuff, so there's a lot less to deal with. However, the links are still structured as XML elements, which might still cause some headaches...
      HTH, Ádám
    • Oct 22 2012 | 7:42 am
      thanks a lot Ádám!
    • Oct 26 2012 | 9:41 pm
      how about an .aspx or .php address?
    • Oct 27 2012 | 9:04 am
      There should be no difference. If you check the last example that I posted, it executes a PHP query on Wikipedia. The point is that you always send a 'GET' request to the web server (which is quite broadly documented on the web, with examples etc.) and then process whatever answer you get back.
      HTH, Ádám
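To illustrate that a .php (or .aspx) address is just a path plus a query string in the same GET request, here is a small Python sketch. The helper names are hypothetical, and the `title` and `printable` parameters on `/w/index.php` are an assumption about how the printer-friendly page was requested, not something taken from the patch.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_url(base: str, params: dict) -> str:
    """Append URL-encoded query parameters to a base address."""
    return f"{base}?{urlencode(params)}" if params else base


def http_get(url: str, timeout: float = 10.0) -> str:
    """Send a GET request and return the decoded response body."""
    request = Request(url, headers={"User-Agent": "forum-example/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Assumed endpoint and parameters; performs a real network request.
    url = build_url(
        "https://en.wikipedia.org/w/index.php",
        {"title": "Giovanni_Pierluigi_da_Palestrina", "printable": "yes"},
    )
    print(url)
    print(len(http_get(url)), "characters received")
```

Whatever the file extension on the server side, the client only ever sees the response to the GET request, so the processing afterwards stays the same.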