Strip HTML markup from web pages

runs on Windows

A web page is a web page, in part at least, because of all the markup that's in there along with the text. If you're looking at it with a browser, that behind-the-scenes stuff needs to be in there. But if you're doing something else with it (maybe analyzing keyword density for your next SEO project, or pasting the text into a word processor to include in a report), you need to get that extra gobbledegook out of the way and come back with just plain old text. You could do that with a text editor and a lot of time, or you could check out a dedicated solution.

HTML Stripper was created by a guy looking to do just this. Feed it a file and it opens it, strips out all the HTML markup, and gives you back just the text. Once you've got that plain text file, you can go to town on it: edit it, analyze it, or whatever.
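If you're curious what that kind of stripping looks like under the hood, here's a minimal Python sketch using the standard library's html.parser module. To be clear, this is not how HTML Stripper itself works (no idea what it does internally); the TextExtractor class and strip_html function are made-up names for illustration, and the choice to skip script and style content and to collapse whitespace is an assumption about what "just the text" should mean.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content and ignores tags (hypothetical helper)."""

    # Content inside these elements is code or styling, not readable text.
    SKIP = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities like &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(markup):
    """Return the text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = "<p>Hello <b>world</b>!</p><script>var x = 1;</script>"
    print(strip_html(sample))
```

Run against a saved page, you'd read the file into a string and pass it to strip_html. A real tool would likely also care about things like line breaks between paragraphs, which this sketch flattens into single spaces.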

You can download HTML Stripper and use it on your Windows machine. It doesn't make any changes to the Registry, and it doesn't scatter stray DLLs all over your machine. And it's free, although the author asks that if you find it useful, you do something nice for somebody else.

Download HTML Stripper
