If you’re not familiar with the Google Supplemental Index (GSI), let’s start with Google’s own definition of this supplemental index.
“A supplemental result is just like a regular web result, except that it’s pulled from our supplemental index.
We’re able to place fewer restraints on sites we crawl for this supplemental index than we do on sites that are crawled for our main index.
For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.
If you’re a webmaster, please note that the index in which a site is included is completely automated; there’s no way to select or change the index in which a site appears. Please also be assured that the index in which a site is included doesn’t affect its PageRank.”
Although Google offers no direct way to choose which of your pages land in the regular or the supplemental index, you can take precautions to keep your site out of the Google Supplemental Index (GSI) in the first place.
Google is telling webmasters that if your site is difficult to crawl, or is deemed low quality with little unique content to set it apart from other sites in your niche, it will be relegated to a second-rate index.
Basically, it’s about quality. If the Google crawler finds your site difficult to index or locates duplicate content, chances are you’ll end up in “Supplemental Hell”.
(Note: Automatic PLR rewriting software does NOT cut it anymore and will more than likely get you GSI’ed quicker than you can say “Latent Semantic Indexing”!)
So why do so many pages end up in Google’s Supplemental Index?
Well, here are a few things that spring to mind…
1. Using duplicate content (on the same site or externally).
2. The page in question contains the same Title and META tags as other pages on your site.
3. Your site creates crawling problems due to redirects, JavaScript navigation, or simply too many URL parameters or session IDs.
4. Having loads of unrelated external links on one page (just don’t do it!), or not enough internal or external inbound links to add “weight” to the page.
5. Your web page no longer exists, is orphaned with no internal links pointing to it, or is buried too deep to be crawled properly.
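Cause 2 above (pages sharing the same Title and META tags) is easy to check for yourself. Here is a minimal sketch using only Python’s standard library; the page URLs and HTML are made-up sample data, and `find_duplicates` is a hypothetical helper name, not any real tool’s API:

```python
# Hypothetical sketch: flag pages that share a <title> -- one of the
# common causes of supplemental listings described above.
from collections import defaultdict
from html.parser import HTMLParser

class HeadScanner(HTMLParser):
    """Collects the <title> text of one HTML page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def find_duplicates(pages):
    """pages: dict of url -> raw HTML. Returns titles shared by 2+ pages."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        scanner = HeadScanner()
        scanner.feed(html)
        by_title[scanner.title.strip()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

# Sample data for illustration only.
pages = {
    "/widgets.html": "<html><head><title>My Store</title></head></html>",
    "/gadgets.html": "<html><head><title>My Store</title></head></html>",
    "/about.html":   "<html><head><title>About Us</title></head></html>",
}
print(find_duplicates(pages))
# → {'My Store': ['/widgets.html', '/gadgets.html']}
```

Running this over your own pages (fetched or read from disk) will show at a glance which titles need to be made unique.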
… and a few steps to avoid it!
Remove duplicate content from your website and keep your remaining content as fresh and unique as possible.
If you’re using PLR content – rewrite at least 30-50% of it.
Make sure your domain name appears consistently (i.e. always with “www” or always without it).
Do not overcomplicate the site structure or navigation system.
Shorten any long URLs to something simple.
Increase relevant inbound links and use contextual linking where possible.
Use deep-linking (linking to other pages than just your index page), but try to keep within 2-3 levels.
Create and submit a Sitemap, which will give Google easy access to all your web pages and help ensure that they are indexed regularly and correctly.
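The www-consistency step above is usually handled with a server-side 301 redirect rather than by editing links one at a time. A minimal sketch for Apache, assuming mod_rewrite is enabled and using “example.com” as a placeholder for your own domain:

```apache
# Hypothetical .htaccess sketch: 301-redirect non-www requests to the
# www version so Google sees a single consistent domain.
# Assumes mod_rewrite is enabled; "example.com" is a placeholder.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

A permanent (301) redirect tells Google which version of the domain is canonical, so link weight isn’t split between the two.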
The above is by no means an exhaustive list of ways to avoid Google’s Supplemental Index, but it is certainly enough to get you started.
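The Sitemap step above can be sketched in a few lines of Python using only the standard library. This follows the sitemaps.org protocol that Google accepts; the URLs and the `build_sitemap` helper name are illustrative assumptions, not part of any existing tool:

```python
# Hypothetical sketch: build a minimal sitemap.xml from a list of
# absolute page URLs, per the sitemaps.org 0.9 protocol.
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    """Return a sitemap XML string for the given absolute URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# "example.com" is a placeholder domain for illustration.
sitemap = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/about.html",
])
print(sitemap)
```

Save the output as sitemap.xml at your site root and submit it to Google so every page gets crawled, including ones buried deep in the navigation.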