GSiteCrawler helps Google see your more obscure pages

[Screenshot of GSiteCrawler]

For your website's pages to show up in search results on Google, Yahoo!, and the other search engines, their "robots" have to be able to find those pages. That stands to reason: how can a search engine report on what it has never seen? The "seeing" part, however, isn't always so easy.

Search engines run a numbers game: they want to report the greatest number of relevant results to their visitors while expending the least effort. Generally, a search engine finds your site by following links from other sites, then discovers the rest of your pages by following your site's internal navigation. The trouble is that some kinds of navigation work perfectly well for human visitors but just don't work for search engine robots.

Search engines can't click buttons, they can't follow JavaScript links, and they don't like big, long, nasty URLs like

   http://www.example.com/somepage?arg1=one&arg2=two&arg3=three…

So how do you get those pages indexed?

The major search engines support what they call "site maps": you submit a list of your site's pages, in effect telling the search engine "these are the pages on my site that you should crawl." The site map itself is an XML file that follows the Sitemaps protocol. Google makes a tool available to generate one for you, but it's written in Python. That's nice, but if Python makes you think of an English comedy troupe rather than a programming language, it may not be the solution for you.
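If you're curious what one of these files actually looks like, here's a minimal sketch of a generator written against Python's standard library. The page URLs and output filename are invented for illustration, and the XML namespace shown is the one the sitemaps.org protocol defines:

    # A minimal site map generator sketch, standard library only.
    # The page URLs and "sitemap.xml" filename are placeholders.
    from xml.sax.saxutils import escape

    pages = [
        "http://www.example.com/",
        "http://www.example.com/somepage?arg1=one&arg2=two&arg3=three",
    ]

    # One <url> entry per page; escape() handles the ampersands
    # that query-string URLs drag along.
    entries = "\n".join(
        "  <url><loc>%s</loc></url>" % escape(page) for page in pages
    )

    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries +
        "\n</urlset>\n"
    )

    with open("sitemap.xml", "w") as f:
        f.write(sitemap)

That's the whole format at its simplest: a list of <loc> entries wrapped in a <urlset>. The real protocol also allows optional hints like last-modified dates, but none of them are required.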

GSiteCrawler is a Windows tool that generates site map files usable by both Google and Yahoo!. You can load it onto your Microsoft web server, or presumably feed it the log files from your Apache server, turn the crank, and get a standards-compliant site map file out the other end. Much easier than learning Python.
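As a rough sketch of that log-file idea (this is not GSiteCrawler's actual code; the log filename and hostname are hypothetical), you could harvest candidate URLs from an Apache combined-format access log like so:

    # Pull requested paths out of an Apache access log, de-duplicate
    # them, and print URLs that could feed a site map generator like
    # the sketch above. "access.log" and the host are placeholders.
    import re

    LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')

    urls = set()
    with open("access.log") as log:
        for line in log:
            match = LOG_LINE.search(line)
            if match:
                urls.add("http://www.example.com" + match.group(1))

    for url in sorted(urls):
        print(url)

The catch with the log-file approach, of course, is that it only finds pages somebody has already visited; a crawler-based tool like GSiteCrawler can discover pages the logs have never seen.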

GSiteCrawler will run on any 32-bit Windows platform, Windows 95 or later, and requires Internet Explorer 5.5 or newer.

Download GSiteCrawler
