Hi again,
here's a more advanced example. It first searches for a tag and throws away everything that comes before it. Then it parses the remaining HTML with [sadam.rapidXML], which gives you structured access to the HTML document. This might not be the best choice for you, since XML is probably not the ideal representation here (as you will see, formatting commands, for example, get parsed as separate XML elements), but it might give you a good starting point.
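
In case it helps to see the idea in text form, here is a rough Python sketch of the same two steps (cut everything before a tag, then parse the rest as XML). It is only an illustration under a few assumptions: it uses Python's standard xml.etree.ElementTree rather than [sadam.rapidXML], the function name parse_from_tag and the sample page are made up, and it assumes the tag occurs exactly once and that the fragment is well-formed XML (real-world HTML often is not).

```python
import xml.etree.ElementTree as ET


def parse_from_tag(html: str, tag: str) -> ET.Element:
    """Discard everything before <tag ...> and parse that element as XML.

    Hypothetical helper: assumes the tag occurs once and is not nested
    inside another element of the same name.
    """
    start = html.find("<" + tag)
    if start == -1:
        raise ValueError(f"tag <{tag}> not found")
    closing = f"</{tag}>"
    end = html.find(closing, start)
    if end == -1:
        raise ValueError(f"closing {closing} not found")
    # Keep only the element itself; anything after the closing tag
    # would make the fragment invalid XML.
    fragment = html[start:end + len(closing)]
    return ET.fromstring(fragment)


# Made-up sample page, only for illustration.
page = (
    "<html><head><title>t</title></head><body>"
    '<div id="content"><p>Hello <b>bold</b> world</p></div>'
    "</body></html>"
)

root = parse_from_tag(page, "div")
p = root.find("p")

# The <b> formatting command becomes its own child element, so the
# sentence is split into p.text, the child's text, and the child's tail.
pieces = (p.text, p[0].tag, p[0].text, p[0].tail)

# itertext() walks all nested text nodes and lets you glue them back together.
plain = "".join(p.itertext())
```

Here pieces shows why XML can be awkward for formatted text: the sentence is spread over the paragraph and its child element, and you have to collect the parts again (as plain does) if you only want the words.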