28/02/2013

Is Google winning the keyword stuffing war?


One of my pet, pet hates is content thievery. I don't mind the odd <blockquote>, providing it sends a link back to my original article, but that's about it.  Plagiarism sucks.

There are many types, but these are the three aspects of plagiarism that really grind my gears:

  • mechanical article spinning and subsequent keyword stuffing
  • curation without the curator adding their own take on the article
  • blatant plagiarism, passing others' work off as your own

The article chosen that ground my gears to dust commits at least two, if not all three, of the aforementioned heinous crimes.

If you've ever been across to zebedeerox.com, you'll already have seen the rants I've had at Google for throwing obviously keyword-stuffed, non-native blog-posts that have been spun with software into my blog feed by way of alerts.

If ever there's a reason to dissuade you from buying article spinning software, just set up a Google alert in a popular niche and see what turns up in your inbox.

Google thinks it's got spamming covered?  Not by a long shot.

Don't want to take my word for it? Want proof?  Here's double proof.

According to Mike Geary, you struggle to get any relevant abs or six pack keywords into AdWords to link back to a weight loss site because of the number of sites out there in the niche that are just pure crap.

After reading the article in question, I know what Google and Mike mean.

So you'd think that the weight loss niche, as Google has already fired a warning shot across its bows, would be a niche they're monitoring consistently for keyword stuffing or duplicate content, wouldn't you?

The answer's yes, by the way. Or at least it should be.

When I was researching an article for my abs workout and nutrition site just the other night, I wanted clarification on the benefits of water for weight loss.

I knew the cleansing benefits for the intestines but was a tad uncertain of the relationship between the liver and kidneys in the whole hydration cycle.  So I put that exact search into Google.

This is the SERP, page 1:

[Image: SERP for keyword-stuffed weight loss article]

Coming in at number four was an article on exercise4weightloss.com simply entitled Water and Weight loss.

No, I'm not going to credit it with a link as it doesn't deserve one.

However, it does highlight the benefit of having accurate titles and meta descriptions (breadcrumbs) for both SEO and human readership.

I clicked through as it looked like the perfect result for my query, even choosing it over results 1-3 in the SERP.

More keywords stuffed into article than stop words

After the first two sentences, I couldn't believe what I was reading.

Especially having moments ago read this Whiteboard Friday article by Rand on SEOMoz about the impact on SEO and readership of duplicate/low quality content versus unique content.

The article wasn't about keyword stuffing per se, but it did touch upon how Google is striving to weed out low quality and duplicate content since the Panda update.

However, in one of this week's Webmaster Tools videos, Matt Cutts reiterates that Google is taking action on keyword stuffing, treating it as web spam, which sort of ties up my argument quite nicely.

[Image: keyword stuffing in water benefits for weight loss article]

So if low-value sites are being penalised, as per Rand in the SEOMoz video, and Google is taking action against sites that are keyword stuffing, as per Matt Cutts, how in the name of everything that is pure and quality content did this article appear on page 1 of Google?

In the first 109 words, the words water, weight and loss each appear more often than any other word, stop words included; only 'is' and the full stop itself turn up more.


How on earth can that be natural, quality content and not keyword stuffing?

Assessing verbs, adjectives and nouns against the stop word count is no way to measure keyword density, of course.

Nonetheless, anyone who's an active blogger will know that stop words far outweigh the rest of the content by volume in every quality article they've ever written.
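
For anyone who fancies trying the same rough comparison on their own copy, here is a minimal Python sketch. It is not the method used on the article above; the stop word list is a short, hand-picked assumption, the 109-word limit is simply borrowed from the count quoted earlier, and the sample text is invented for illustration.

```python
import re
from collections import Counter

# A short, hand-picked stop word list for illustration only (an assumption);
# real tools use much longer lists.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "this", "to", "was",
    "with", "you", "your",
}


def tokenize(text):
    """Lower-case the text and split it into words, keeping inner apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def opening_word_counts(text, limit=109):
    """Count stop words and content words in the first `limit` words."""
    words = tokenize(text)[:limit]
    stop = Counter(w for w in words if w in STOP_WORDS)
    content = Counter(w for w in words if w not in STOP_WORDS)
    return stop, content


# Invented sample text, deliberately stuffed.
sample = (
    "Water is the key to weight loss. Water weight loss works because "
    "water helps weight loss and water loss is weight loss."
)
stop, content = opening_word_counts(sample)
print("stop words:", sum(stop.values()), stop.most_common(3))
print("content words:", sum(content.values()), content.most_common(3))
```

If a handful of content words outnumber all the stop words put together, as they do in the sample, that is at least a hint that something unnatural is going on.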

Whilst I agree in principle with many of the arguments in Rand's duplicate content Whiteboard Friday article, in practice Google is yet to deliver the results against the spam it purports to be actively targeting.

As a matter of interest, I ran the immediately visible content on the landing page (wrapped around an AdSense unit, no less) through NoteTab Light to check the keyword density.

[Image: keyword density report for the weight loss/water article, run through NoteTab Light (Fookes Software)]

Whilst Matt Cutts and every SEO worth their salt from here to the bottom of the deep, deep blue keep reminding us that we should forget about a mythical 'perfect' keyword density, there has to be a point where the Google indexers state: that's keyword stuffing.  Or has there?

According to Rand in the video, there's certainly no mythical percentage that Google reads as "okay, you've hit the maximum duplicate content percentage we'll allow, so we'll penalise you."

So why would we think that there's a tipping point for keyword density, especially if the site hosting the keyword-stuffed article 'adds value', as Rand attests happens when Google checks for duplicate content?

If you've swiped someone else's content but your site has more social share activity, traffic or domain authority, Rand's even suggesting that the practice is fair game in Google's eyes.

Intrigued, I checked the Water and Weight loss article through Copyscape, thinking it was worth the five cents just to ease the developing itching curiosity burning away beneath my eyelids.

Care to have a stab at what I found?

Not only was the content not unique, but it also appeared four other times on the internet:

  • A snippet from somewhere else on the same site
  • 541 words straight, as in word-for-word, in a forum (which came first, chicken or egg? wouldn't like to bet)
  • and twice snippets appeared on a Nigerian 'news' site  
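
To give a feel for how word-for-word copying of this sort can be spotted, here is a rough Python sketch that compares two texts by their overlapping five-word sequences ("shingles"). This is an assumption-laden illustration, not how Copyscape or Google actually work; the function names, the shingle size and the sample sentences are all made up for the example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping `size`-word sequences in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(text_a, text_b, size=5):
    """Share of text_a's shingles that also appear in text_b (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Invented sample sentences for illustration.
original = "Drinking water before a meal can help you feel full and eat less."
copied = (
    "As we all know, drinking water before a meal can help you feel full "
    "and eat less!"
)
unrelated = "The liver and kidneys filter waste products from the blood every day."

print(overlap(original, copied))     # every shingle of the original is reused
print(overlap(original, unrelated))  # nothing in common
```

A score near 1.0 means almost every run of five words in the first text turns up in the second; it says nothing about which of the two was published first.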

Why can't Google list duplicate content in the order it was published?

According to Rand - or it may have been in the comments below the article - Google cannot yet determine in its results which article was published first in the instances of duplicate content.

First of all, I'm no techy, but I'd say that was a load of old pigswill.

Rather, Google will choose not to list SERPs in that order as the articles that were the progenitors of the content may not be running ads that would make Google money.  Perhaps .gov or .ac sites, you know?

Domain authority sites, however, would be running some sort of AdSense program or attract so many visitors that sponsored ads would be clicked more times than castanets in a Flamenco dancer's world record bid for the longest solo performance outside of España.

I digress; if I'd published that content first, even in the knowledge that it was more stuffed than a turkey on December 25th, I'd have the date showing next to the post.

Webmaster 'Julie' (mm?) obviously hasn't the conviction to post dates next to her articles, a practice 'she' stopped in 2011, by the look of things.  Suspicious?  You bet, Watson.

What can we surmise from all of this evidence on keyword stuffing and duplicate content?

My advice to everyone is this: write for your audience.

Don't get caught up in keyword density, but have a very clear outline of what you want your post to be about before you start to write it.

The chances are, if you know what you're trying to say and can transpose thought to word at least a little cohesively, the search engine crawlers will know what you're on about, too.

If it helps, write your keywords down or have them on a floating sticky and slip them into your article where they fit naturally.  Do not force them in like in the article written by 'Julie', even if it does mean your page ranks highly.

Attracting traffic and/or page rank is one thing; retaining/converting customers is another entirely

The article, make no bones, reads awfully from start to finish.  I even felt sorry for the little subscribe button beneath the article; it must be a very lonely widget indeed.

Ask yourself this if you're trying to promote an affiliate product or trying to sell your service: would you choose to fill your own inbox with content put together by someone so desperate to sell you their product that they compromise their content for the sake of the sale?

Check out Julie's Amazon side bars and 'about' page, which even has an affiliate link to a site-builder, to see what I mean about desperation seeping through the content like blood through the brickwork in a 4-bed in Amityville.

Articles like the Water/Weight Loss one in the example have to be the exception that proves the rule.  At least, I hope against hope that that's the case.

Yet, I can't help but think back to the crap Google used to send against my alerts and pass off as 'content', almost with a smug 'job well done' as the notification hit my inbox.  It makes me think twice (and shudder).

Plagiarism sucks, keyword stuffing sucks and poor quality content just drives your customers away in droves.

If you've got a fantastic site with a brilliant product and a huge gap in the market waiting to be filled, draft your article/promo/press release/e-mail, list your keywords and pay a professional blogger to re-write it.

So the aforementioned article is still holding onto its miraculous page 1 status with such dire quality; who's going to stick around to read it after paragraph two (other than some sad sod who's gonna rip it to pieces)?

I honestly hope that the keyword stuffers and content thieves who read this article think twice about messing about with a genuine author's content, but I doubt it.

Until Google comes clean, lists similar results in the order they were published and takes down sites with reams of duplicate content, the practice will continue.  And we victims will whinge about it until it stops...
...thank you for listening.

Oh, the final keyword density for the word water was 5.92%, appearing 66 times in an article of fewer than 1,000 words.  That even beat the number of full stops, period.

The word weight finished on 2.33%, loss on 1.71%.  I rest my case, m'lud.
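
If you'd like to run a similar check without a text editor's report, here is a minimal Python sketch of a keyword density count: occurrences of a word divided by the total number of words, as a percentage. It is only an approximation of what a tool such as NoteTab Light reports, since every tool splits and counts words slightly differently; the function name and the sample text are assumptions made for the example.

```python
import re
from collections import Counter


def keyword_density(text, keywords):
    """Return {keyword: (count, percent of all words)} for single-word keywords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    total = len(words)
    counts = Counter(words)
    return {
        kw: (counts[kw], round(100 * counts[kw] / total, 2) if total else 0.0)
        for kw in keywords
    }


# Invented sample text for illustration.
sample = "Water helps weight loss. Drink water daily and the weight comes off."
for kw, (count, pct) in keyword_density(sample, ["water", "weight", "loss"]).items():
    print(f"{kw}: {count} times, {pct}%")
```

Paste your own draft in place of the sample and you'll see at a glance whether any one word is shouting louder than the rest.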
