Search engines use automated robots to follow links around the web and collect the content of the pages they find. These robots are called spiders, and following links this way is called crawling (or spidering) the web. Google’s spider is called Googlebot, and you’ll see it listed as the user agent in your server logs. Once a search engine has gathered and analyzed a site’s data, the site is said to be indexed. To see whether your site is in the Google index, search Google for site:yourdomain.com.
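
If you want to check your own logs for Googlebot visits, a small script can do the looking for you. The sketch below assumes your server writes the common Apache/Nginx "combined" log format; the function name googlebot_hits and the sample log lines are made up for illustration. Keep in mind that a user agent string can be faked, so a match only suggests a Googlebot visit.

```python
import re

# Assumption: the server uses the Apache/Nginx "combined" log format.
# Adjust the pattern if your logs are laid out differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Return (time, path) pairs for requests whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits.append((match.group("time"), match.group("path")))
    return hits


# Hypothetical log lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

print(googlebot_hits(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example googlebot_hits(open("access.log")), where access.log stands in for wherever your server keeps its log.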

New sites don’t always get listed right away. In some cases it can take several months for a new site to show up in the SERPs (search engine results pages). Even once a site is in the index, many believe that Google puts new sites “in the sandbox” and won’t let them rank well for the first few months.

The spider keeps on comin’

Once a site is in a search engine, the engine’s spider will periodically revisit it and re-index it from scratch. The engines understand that the Internet is dynamic and changing, so they constantly re-evaluate the pages in their indices. So not only will every engine probably find your site on its own the first time, it will keep visiting it over and over again on its own, too.

Google appears to visit most pages in its database at least once a month, though it may take longer. Some pages get visited every day. Sites with a higher PageRank (i.e., sites that have a lot of inbound links from other sites) get spidered more frequently than sites with a low PR, and sites that update often get spidered more often than sites that rarely change. You can try to invite more frequent spider visits by updating your pages more often, even if the changes themselves are minor, though the advantage of doing so is questionable. It won’t necessarily let you test your page ranking ideas through trial and error any faster: even if an engine spiders your new content right away, it may take weeks or months to work out how those changes should affect your rank. And of course, more frequent spider visits by themselves do nothing for your rankings.
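
To get a feel for how often a spider actually comes back to your pages, you can average the time between its visits. The sketch below is one rough way to do that, assuming you have already pulled (timestamp, path) pairs for spider requests out of your logs; the function name average_visit_gap_days and the visit data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def average_visit_gap_days(visits):
    """Map each path to the mean number of days between consecutive spider visits.

    `visits` is an iterable of (timestamp, path) pairs.
    Paths seen only once are skipped because they have no gap to measure.
    """
    by_path = defaultdict(list)
    for timestamp, path in visits:
        by_path[path].append(timestamp)

    gaps = {}
    for path, stamps in by_path.items():
        if len(stamps) < 2:
            continue
        stamps.sort()
        # The mean of consecutive gaps equals (last - first) / (number of gaps).
        total_seconds = (stamps[-1] - stamps[0]).total_seconds()
        gaps[path] = total_seconds / (len(stamps) - 1) / 86400
    return gaps


# Hypothetical spider visits, for illustration only.
visits = [
    (datetime(2023, 10, 1), "/index.html"),
    (datetime(2023, 10, 2), "/index.html"),
    (datetime(2023, 10, 3), "/index.html"),
    (datetime(2023, 10, 1), "/about.html"),
    (datetime(2023, 10, 31), "/about.html"),
    (datetime(2023, 10, 5), "/contact.html"),
]

print(average_visit_gap_days(visits))
```

In this made-up data the home page is revisited daily and the about page roughly monthly. A short gap only tells you the spider is stopping by; it says nothing about how the page ranks.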

Remember that even if a search engine can find a page, it might not be able to figure out what that page is about. Spiders eat words, so they have to be able to see the words on your site in order to index them. Spiders can’t read text that’s embedded in graphics, so any text you want them to read and index should be written out as text. At the very least, put any text that appears in graphics into the images’ ALT attributes. Spiders are getting better at reading the text in Flash, but they’re still not very good at it. Make sure any Flash page you have has a “Skip this intro…” link that takes visitors (and spiders) to the text-rich content of your site.
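
One way to approximate what a spider has to work with is to strip a page down to its plain words and ALT text. The sketch below uses Python’s built-in html.parser to do that and to list images that have no ALT text. The class name SpiderView and the sample page are invented for illustration, and real spiders are certainly more sophisticated than this.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collect the words a text-only reader could see, plus images lacking ALT text."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.images_missing_alt = []
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            attr_map = dict(attrs)
            alt = (attr_map.get("alt") or "").strip()
            if alt:
                self.words.extend(alt.split())
            else:
                self.images_missing_alt.append(attr_map.get("src") or "(no src)")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style contents; keep everything else as words.
        if not self._skip_depth:
            self.words.extend(data.split())


# Hypothetical page, for illustration only.
sample = """
<html><body>
<h1>Fresh Bread Daily</h1>
<img src="banner.gif" alt="Sourdough loaves on sale">
<img src="logo.gif">
<script>var x = 1;</script>
</body></html>
"""

view = SpiderView()
view.feed(sample)
print(view.words)
print(view.images_missing_alt)
```

Here the banner’s ALT text joins the heading in the word list, while logo.gif is flagged because it gives a spider nothing to read.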
