
Developing an SEO-Friendly Website: Content Delivery and Search Spider Control (part 1)

1/4/2011 9:09:30 AM
On occasion, it can be valuable to show search engines one version of content and show humans a different version. Technically, this is called cloaking, and the search engines’ guidelines nearly universally restrict it. In practice, many websites, large and small, appear to use content delivery effectively and without being penalized by the search engines. However, use great care if you implement these techniques, and know the risks you are taking.

1. Cloaking and Segmenting Content Delivery

Before we discuss the risks and potential benefits of cloaking-based practices, take a look at Figure 1, which shows an illustration of how cloaking works.

Figure 1. How cloaking works


Google’s Matt Cutts, head of Google’s webspam team, has made strong public statements indicating that all forms of cloaking (other than First Click Free) are subject to penalty. This position was largely backed up by statements from Google’s John Mueller in a May 2009 interview, which you can read at http://www.stonetemple.com/articles/interview-john-mueller.shtml.

Google makes its policy pretty clear in its Guidelines on Cloaking (http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=66355):

Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index.

There are two critical pieces in the preceding quote: may and user agent. It is true that if you cloak in the wrong ways, with the wrong intent, Google and the other search engines may remove you from their index, and if you do it egregiously, they certainly will. But in some cases, it may be the right thing to do, both from a user experience perspective and from an engine’s perspective.

The key is intent: if the engines feel you are attempting to manipulate their rankings or results through cloaking, they may take adverse action against your site. If, however, the intent of your content delivery doesn’t interfere with their goals, you’re less likely to be subject to a penalty, as long as you don’t violate important technical tenets (which we’ll discuss shortly).
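
To make the mechanics concrete, here is a minimal sketch of the user-agent test that Google’s guideline describes, written in Python with the Flask microframework (the route, page content, and bot-signature list are all hypothetical). It illustrates the mechanism only; whether using it is acceptable depends entirely on the intent just discussed.

    from flask import Flask, request

    app = Flask(__name__)

    # Hypothetical list of crawler user-agent substrings.
    BOT_SIGNATURES = ("Googlebot", "Bingbot", "Slurp")

    def is_search_spider(user_agent):
        # Naive check; user-agent strings are trivially spoofed, so real
        # systems often verify the crawler's IP via reverse DNS as well.
        return any(bot in user_agent for bot in BOT_SIGNATURES)

    @app.route("/widgets")
    def widgets():
        if is_search_spider(request.headers.get("User-Agent", "")):
            return "<html>...version delivered to crawlers...</html>"
        return "<html>...version delivered to human visitors...</html>"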

What follows are some examples of websites that perform some level of cloaking:


Google

Search for google toolbar or google translate or adwords or any number of Google properties and note how the URL you see in the search results and the one you land on almost never match. What’s more, on many of these pages, whether you’re logged in or not, you might see some content that is different from what’s in the cache.


NYTimes.com

The interstitial ads, the request to log in/create an account after five clicks, and the archive inclusion are all ways of showing different content to engines than to humans.


Wine.com

In addition to some redirection based on your path, there’s the state overlay forcing you to select a shipping location prior to seeing any prices (or any pages). That’s a form the engines don’t have to fill out.


Yelp.com

Geotargeting through location-based cookies is a very popular form of local targeting that hundreds, if not thousands, of sites use.


Amazon.com

At SMX Advanced 2008 there was quite a lot of discussion about how Amazon does some cloaking (http://www.naturalsearchblog.com/archives/2008/06/03/s-secret-to-dominating-serp-results/). In addition, Amazon does lots of fun things with its buybox.amazon.com subdomain and with the navigation paths and suggested products it shows if your browser accepts cookies.


Trulia.com

Trulia was found to be doing some interesting redirects on partner pages and its own site (http://www.bramblog.com/trulia-caught-cloaking-red-handed/).

The message should be clear. Cloaking isn’t always evil, it won’t always get you banned, and you can do some pretty smart things with it. The key to all of this is your intent. If you are doing it for reasons that are not deceptive and that provide a positive experience for users and search engines, you might not run into problems. However, there is no guarantee of this, so use these types of techniques with great care, and know that you may still get penalized for it.

2. When to Show Different Content to Engines and Visitors

There are a number of reasons to display content differently to different visitors, including search engines. Here are some of the most common:


Multivariate and A/B split testing

Testing landing pages for conversions requires that you show different content to different visitors to test performance. In these cases, it is best to display the content using JavaScript/cookies/sessions and give the search engines a single, canonical version of the page that doesn’t change with every new spidering (though this won’t necessarily hurt you). Google offers software called Google Website Optimizer to perform this function.
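
As a rough sketch of that separation (Python with Flask again; the cookie name, variants, and bot list are made up for illustration), a server might pin human visitors to a test variant with a cookie while always returning the canonical version to spiders:

    import random
    from flask import Flask, request, make_response

    app = Flask(__name__)
    BOT_SIGNATURES = ("Googlebot", "Bingbot", "Slurp")

    def render_variant(variant):
        # Placeholder for the page versions under test.
        return f"<html>...{variant} version of the landing page...</html>"

    @app.route("/landing")
    def landing():
        ua = request.headers.get("User-Agent", "")
        if any(bot in ua for bot in BOT_SIGNATURES):
            # Spiders always receive the single canonical version.
            return render_variant("canonical")
        # Humans are pinned to a variant via a cookie so the test stays
        # consistent for them across visits.
        variant = request.cookies.get("ab_variant")
        if variant not in ("a", "b"):
            variant = random.choice(["a", "b"])
        resp = make_response(render_variant(variant))
        resp.set_cookie("ab_variant", variant)
        return resp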


Content requiring registration and First Click Free

If you force registration (paid or free) on users to view specific content pieces, it is best to keep the URL the same for both logged-in and non-logged-in users and to show a snippet (one to two paragraphs is usually enough) to non-logged-in users and search engines. If you want to display the full content to search engines, you have the option to provide some rules for content delivery, such as showing the first one to two pages of content to a new visitor without requiring registration, and then requesting registration after that grace period. This keeps your intent more honest, and you can use cookies or sessions to restrict human visitors while showing the full pieces to the engines.

In this scenario, you might also opt to participate in a specific program from Google called First Click Free, wherein websites can expose “premium” or login-restricted content to Google’s spiders, as long as users who click from the engine’s results are given the ability to view that first article for free. Many prominent web publishers employ this tactic, including the popular site Experts-Exchange.com.

Specifically, to implement First Click Free, publishers must grant Googlebot (and presumably the other search engine spiders) access to all the content they want indexed, even if users normally have to log in to see it. Users who visit the site will still need to log in, but the search engine spider will not, so the content can show up in the search engine results when applicable. However, if a user clicks on that search result, you must permit them to view the entire article (every page, if it spans multiple pages). Once the user clicks through to another article on your site, you can again require a login.

For more details, visit Google’s First Click Free program page at http://googlewebmastercentral.blogspot.com/2008/10/first-click-free-for-web-search.html.
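
A simplified sketch of that logic might look like the following (Python with Flask; the referrer test, helper stubs, and bot check are illustrative assumptions, not Google’s specification):

    from flask import Flask, request

    app = Flask(__name__)

    def user_is_logged_in():
        return False  # stub: replace with a real session/auth check

    def full_article(slug):
        return f"<html>...full text of {slug}...</html>"

    def snippet_with_login_prompt(slug):
        return f"<html>...first paragraphs of {slug}, plus a login form...</html>"

    @app.route("/articles/<slug>")
    def article(slug):
        if "Googlebot" in request.headers.get("User-Agent", ""):
            return full_article(slug)  # the spider indexes the full text
        # First Click Free: a visitor arriving from a Google results page
        # sees the whole article; later internal clicks hit the wall.
        referrer = request.referrer or ""
        if "google." in referrer or user_is_logged_in():
            return full_article(slug)
        return snippet_with_login_prompt(slug)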


Navigation unspiderable to search engines

If your navigation is in Flash, JavaScript, a Java application, or another unspiderable format, you should consider showing search engines a version with spiderable, crawlable HTML content. Many sites do this simply with CSS layers: one layer displayed to human visitors and another for the engines (and for less capable browsers, such as mobile browsers). You can also employ the noscript tag for this purpose, although it is generally riskier, as many spammers have used noscript to hide content. Adobe recently launched a portal on SEO and Flash that provides best practices, cleared by the engines, for making Flash content discoverable. Take care that the content shown in the search-visible layer is substantially the same as in the human-visible layer.
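
One way to keep the two layers from drifting apart is to generate both from the same navigation data, as in this hypothetical Python sketch (the items, markup, and script path are placeholders):

    # One nav definition drives both layers, so the crawlable links always
    # match the destinations offered by the script-driven widget.
    NAV_ITEMS = [("Home", "/"), ("Products", "/products"), ("Support", "/support")]

    def render_navigation(items):
        plain_links = "".join(
            f'<li><a href="{url}">{label}</a></li>' for label, url in items
        )
        return (
            # Placeholder for the Flash/JavaScript menu shown to humans.
            '<div id="fancy-nav"><script src="/static/nav.js"></script></div>'
            # Plain HTML fallback for spiders and script-less browsers.
            f'<noscript><ul class="nav">{plain_links}</ul></noscript>'
        )

    print(render_navigation(NAV_ITEMS))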


Duplicate content

If a significant portion of a page’s content is duplicated elsewhere, you might consider placing it in an iframe whose source URL is blocked by robots.txt. This lets you show the engines the unique portion of your pages while protecting against duplicate content problems. We will discuss this in more detail in the next section.
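
Sketched in Python with Flask (the URLs are hypothetical), the setup has three parts: the shared block gets its own URL, robots.txt disallows that URL, and each page pulls the block in through an iframe:

    from flask import Flask

    app = Flask(__name__)

    @app.route("/robots.txt")
    def robots():
        # Spiders may not fetch the duplicated block, only the pages
        # that embed it.
        body = "User-agent: *\nDisallow: /shared/\n"
        return body, 200, {"Content-Type": "text/plain"}

    @app.route("/shared/boilerplate")
    def boilerplate():
        return "<html>...block repeated across many pages...</html>"

    @app.route("/products/<sku>")
    def product(sku):
        # Spiders index only the unique copy; the duplicate arrives via
        # an iframe they are disallowed from crawling.
        return (
            f"<html><body><h1>Product {sku}</h1>"
            "<p>...unique description...</p>"
            '<iframe src="/shared/boilerplate"></iframe>'
            "</body></html>"
        )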


Different content for different users

At times you might target content uniquely to users from different geographies (such as product offerings that are more popular in their area), to users with different screen resolutions (to make the content better fit their screen size), or to users who entered your site from different navigation points. In these instances, it is best to have a “default” version of the content to show both to search engines and to users who don’t exhibit any of these traits.
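
The sketch below (Python with Flask; the region cookie and bot list are assumptions for illustration) shows that fallback: spiders, and visitors with no known traits, both receive the same default version:

    from flask import Flask, request

    app = Flask(__name__)
    BOT_SIGNATURES = ("Googlebot", "Bingbot", "Slurp")

    def render_offers(region):
        return f"<html>...offers tailored for {region}...</html>"

    @app.route("/offers")
    def offers():
        ua = request.headers.get("User-Agent", "")
        region = request.cookies.get("region")  # hypothetical geo cookie
        if any(bot in ua for bot in BOT_SIGNATURES) or region is None:
            # One canonical page for spiders and for unknown visitors.
            return render_offers("default")
        return render_offers(region)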
