August 8, 2020

Scraper Sites Save Lost Content

One of the caveats of using a third-party site to blog is that you really have no control over if and when that site shuts down or goes away. I shudder to think of all the content that the web lost when Geocities shut down last week, or how the web’s link graph was suddenly altered with the disappearance of the tr.im URL shortening service. These things may seem minor, but at the scale of the web they can certainly have a significant effect.

All of the past shutdowns have changed my way of writing on the web. I now almost always double-post, or at least save a local copy of every post I write. I’ve even started using my own URL shortening service, Tiny.tw, so that even if I ever have to take it offline, I can still control where all the links go.

But what about content that you didn’t back up? Is it gone?

One of the first things I always tell parents to advise their kids about the internet is this: Once it’s out there, it really never goes away. Even if you upload something to one site, that’s not going to stop somebody else from using it.

Case in point: scraper sites.

I spent hours last night looking for an old post of mine that I posted on Shoutwire almost two years ago, but I couldn’t find it. Then, after some creative Google searches, I managed to locate the post on a made-for-AdSense scraper site. Somebody had stripped out my name but taken my content. Normally I’d despise such a thing, but in this case it saved my ass and I was able to dig up the post.

If anything, it got me thinking about permanence, the internet, and what happens when sites you rely on disappear. It’s also made me realize that spam sites might be a bit useful after all.

Oh, if you’re wondering what the post was, I re-posted it on my blog here: 10 Endangered Ideas

About Ryan Jones

Ryan Jones is an SEO from Detroit. By day he works as a manager of SEO & Analytics at SapientNitro, where his team performs SEO for Fortune 500 clients. By night he's either playing hockey or attempting to take over the world with his own websites - which he would have already succeeded in doing had it not been for those meddling kids and their dog. The views expressed here have not been paid for and belong only to Ryan, not any of his employers or clients. Follow Ryan on Twitter at: @RyanJones, add him on Google+ or visit his personal website: www.RyanMJones.com

Comments

  1. This is why I don’t like URL shorteners: http://bit.ly/iP7DB