On Cnet deleting its archive

CNet Deletes Thousands of Old Articles in an Attempt to Game Google Search – Pixel Envy:

Google says this whole strategy is bullshit. A bunch of SEO types Germain interviewed swear by it, but they believe in a lot of really bizarre stuff. It sounds like nonsense to me. After all, Google also prioritizes authority, and a well-known website which has chronicled the history of an industry for decades is pretty damn impressive. Why would “a 1996 article about available AOL service tiers” — per the internal memo — cause a negative effect on the site’s rankings, anyhow? I cannot think of a good reason why a news site purging its archives makes any sense whatsoever.
There’s been quite a kerfuffle about this. This is an area where I have more than a little experience, and although it sounds counter-intuitive, it is completely true that there are instances where it's better for both users and the site for old content to be removed.

Although, as Nick points out, Google advises that simply deleting content does nothing for you, there are three circumstances where deleting content very definitely does improve your SEO. But you don’t just delete it. Deleting content without redirecting it, or doing so in an unstructured manner, just leaves you with a bunch of 404s, which you don’t want. It will also almost certainly break some of the crawl paths which Google and other robots use to find their way around the site.

But there are circumstances where you want to delete and redirect content, either because it’s a bunch of content which is actively harming your site’s authority with Google or because it no longer best serves the needs of your audience.

The first is where that content is thin. Thin content is typically old-style news-in-brief pieces, which are very short. Google has always disliked short content (the rule of thumb is under 300 words), and while a few pieces are fine, if a sizeable percentage of your content is thin it can hurt you. Those kinds of stories tend to date from the early to mid 2000s, when blasting out tonnes of content was the fashion and a lot of news-in-brief pieces got written.
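To make that concrete, here is a minimal sketch of how you might measure how much of an archive is thin. It assumes you already have each article's body text keyed by URL. The 300-word limit is just the rule of thumb mentioned above, not a threshold Google publishes, and the function names are my own.

```python
# Rule of thumb from the text above; an assumption, not a Google-published limit.
THIN_WORD_LIMIT = 300


def word_count(text: str) -> int:
    """Rough word count: whitespace-separated tokens."""
    return len(text.split())


def thin_content_report(articles: dict[str, str], limit: int = THIN_WORD_LIMIT):
    """Return (list of thin URLs, share of the archive that is thin)."""
    thin = [url for url, body in articles.items() if word_count(body) < limit]
    share = len(thin) / len(articles) if articles else 0.0
    return thin, share
```

The share matters more than any single URL: a handful of short pieces is fine, so you would only act when the proportion is sizeable.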

The second is when you have lots of repetitive or duplicate content – content which essentially says the same thing, over and over and over again. Big news sites do this a lot, because often with news you have covered the same story with more or less the same facts for a long time. But you will often also have content which is essentially the same, because people have the same idea for an article and don’t bother to check if it already exists – leading to two very similar articles.

Why does that matter? Because Google likes it when there’s one article on your site which provides a clear answer to a specific search query. If you have written two articles on, say, the history of the Mac Plus, then it doesn’t know which one to rank and so basically down-ranks both.
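Finding those overlapping articles is usually the tedious part. As a rough sketch, you could compare every pair of articles with Python's standard `difflib` and flag the pairs that are mostly the same words in the same order. The 0.8 threshold and the function name are assumptions for illustration, and comparing every pair will be slow on a large archive.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(articles: dict[str, str], threshold: float = 0.8):
    """Return (url_a, url_b, ratio) for article pairs that look near-duplicate.

    The threshold is an illustrative assumption; tune it against real content.
    """
    pairs = []
    for (url_a, text_a), (url_b, text_b) in combinations(articles.items(), 2):
        # Compare word sequences rather than characters, ignoring case.
        ratio = SequenceMatcher(
            None, text_a.lower().split(), text_b.lower().split()
        ).ratio()
        if ratio >= threshold:
            pairs.append((url_a, url_b, ratio))
    return pairs
```

Anything this flags still needs a human to decide which article survives and which one gets redirected.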

The third circumstance is where you have old content which receives no traffic but is about a keyword you are targeting. Every page has authority on some topics, even if it doesn’t rank well or at all. Often, old content isn’t maintained well. Google likes content which is updated with fresh information, because that content tends to best serve users arriving from search. If you don’t update content, it tends to gradually lose ranking over time.

Sometimes the best approach with content like this is to start fresh – particularly when you have multiple articles on the same topic. In that case, deleting the old piece and redirecting it to a new URL is the right approach. You keep what little authority the old page has, and you send a clear signal to Google that the new page is the right one for any search queries you previously ranked for.
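In practice "delete and redirect" means keeping a map from every retired URL to its replacement. The sketch below assumes a permanent (301) redirect is what you want, and uses made-up URLs; nothing here comes from the Cnet memo.

```python
# Hypothetical URLs for illustration only.
LIVE_PAGES = {"/guides/history-of-the-mac-plus"}

# Old, deleted URL -> the page that replaces it.
REDIRECTS = {
    "/news/2004/mac-plus-retrospective": "/guides/history-of-the-mac-plus",
    "/news/2009/mac-plus-at-25": "/guides/history-of-the-mac-plus",
}


def resolve(path, live_pages=LIVE_PAGES, redirects=REDIRECTS):
    """Return (status, location) for a requested path."""
    if path in live_pages:
        return 200, path
    if path in redirects:
        return 301, redirects[path]  # permanent redirect to the replacement
    return 404, None  # deleted with no redirect: the outcome to avoid


def broken_redirects(live_pages=LIVE_PAGES, redirects=REDIRECTS):
    """List old URLs whose redirect target is not a live page."""
    return [old for old, new in redirects.items() if new not in live_pages]
```

Running something like `broken_redirects()` before you delete anything is a cheap way to check that no old URL ends up pointing at a page which doesn't exist.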

The Cnet memo on its process is actually a model for how you should do it, with clear guidance and opt-outs for content which is of historical value. Most content isn’t – remember the old adage that today’s news is tomorrow’s fish-and-chip paper – but some stories clearly are. Cnet also ensures that anything deleted is in the Internet Archive (which is another reason why the clear attempts of some publishers to kill it are so stupid).

As a writer, all this can be hard to take – after all, you want to see all your articles available – but there are things you can do about it. First, make sure that you keep copies of your work. If you work for a site with an SEO team, talk to them about republishing it on your own personal blog (you can add a canonical link to your post to show where the original version was published, and this is actually good for their SEO). And use Authory to keep an archive of everything across every site you publish on.
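If you haven't added a canonical before, it is a single `<link rel="canonical">` element in the head of your republished post, pointing at the original URL. Here is a small sketch which builds that tag; the function name is my own, and how you get it into your page's head depends on your blogging platform.

```python
from html import escape


def canonical_tag(original_url: str) -> str:
    """Build the <link> element that points search engines at the original article."""
    # Escape the URL so characters like & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'
```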

Ian Betteridge @ianbetteridge