April 23, 2014

Rant: SEO Tests, Cutts Statements, & The Algorithm


I’m going to channel my inner @alanbleiweiss and rant for a minute about some things I saw over the last few days in the SEO world. I also want to apologize up front for any spelling mistakes, as my right arm is in a cast and I’m typing this entirely left-handed until I can find an intern. (If you’re curious as to how I broke my arm, it was with a softball. There’s a video here.)

There’s been lots of SEO chatter lately about a recent SEL post called “More Proof Google Counts Press Release Links,” and I want to address a couple of issues that came up both in that thread and on Twitter.

First point: what works for one small made-up keyword may not scale or be indicative of search as a whole. Scientists see this in the real world when they notice that Newton’s laws don’t really work at the subatomic level. In SEO algorithms, we have the same phenomenon, and it’s covered in depth by many computer science classes. (Note: I have a computer science degree and used to be a software engineer, but I haven’t studied the information retrieval field much. There are more in-depth and profound techniques than the examples I am about to provide.)

A long time ago the Google algorithm was probably just a couple of orders of magnitude more complex than an SQL statement that says something like “SELECT * FROM sites WHERE content LIKE ‘%term%’ ORDER BY pagerank DESC.”
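To make that caricature concrete, here is a minimal sketch of the "one SQL statement" model, using Python's built-in sqlite3 module. The table layout, the example URLs, and the pagerank numbers are all invented for illustration; this is the simplified mental model from the paragraph above, not how Google ever actually worked.

```python
import sqlite3

def naive_search(conn, term):
    """Return URLs whose content contains `term`, highest pagerank first."""
    rows = conn.execute(
        "SELECT url FROM sites WHERE content LIKE ? ORDER BY pagerank DESC",
        (f"%{term}%",),  # wildcards on both sides: match the term anywhere
    ).fetchall()
    return [url for (url,) in rows]

# Hypothetical three-page "index" held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (url TEXT, content TEXT, pagerank REAL)")
conn.executemany(
    "INSERT INTO sites VALUES (?, ?, ?)",
    [
        ("a.example", "press release about widgets", 3.0),
        ("b.example", "widgets for sale", 7.0),
        ("c.example", "hockey scores", 9.0),
    ],
)

print(naive_search(conn, "widgets"))
```

Note that the page with the highest pagerank never shows up, because it does not contain the term: matching comes first, and a single fixed signal does all of the ordering.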

It’s not that simple anymore. Most people think of the algorithm like a static equation. Something like Pagerank + KeywordInTitle – ExactMatchDomain – Penguin – Panda + linkDiversity-Loadtime. I’m pretty sure it’s not.

When I think of the Google algorithm (especially with things like Panda and Penguin), I instantly think of a neural network, where the algorithm is fed a training set of data and builds connections to constantly learn and improve its sense of what good results are. I’ll refrain from talking more about neural nets because that’s not my main point.

I also want to talk about the branch of information retrieval within computer science. Most of the basic theories in IR (on which the more complicated ones are built) talk about dynamic weighting based on the corpus. (Corpus being Latin for body, and referring here to all of the sites that Google could possibly return for a query.)

Here’s an example that talks about one such theory (which uses everybody’s favorite @mattcutts over-reaction from 2 years ago): inverse document frequency.

Basically, what this says is that if every document in the result set has the same term on it, that term becomes less important. That makes sense. The real lesson here, though, is that the weighting of terms is dynamic based on the result set. If term weights can be dynamic for each result set, why can’t anchor text, links, page speed, social signals, or whatever other crazy thing is correlated to rankings? They can be!
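For anyone who wants to see the idea in numbers, here is a small sketch of the textbook inverse document frequency formula, log(N / df), where N is the number of documents in the set and df is how many of them contain the term. The function name and the three sample documents are made up for illustration, and real engines use smoothed variants; nothing here is a claim about Google's actual weighting.

```python
import math

def inverse_document_frequency(term, documents):
    """Textbook IDF: log(N / df). Terms that appear everywhere score 0."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term is absent from the set; nothing to weight
    return math.log(len(documents) / containing)

# Hypothetical result set: every page mentions "press", only one says "cheap".
docs = [
    "cheap press release links",
    "press release distribution",
    "press release writing tips",
]

print(inverse_document_frequency("press", docs))
print(inverse_document_frequency("cheap", docs))
```

The weight of "press" collapses to zero because every document in this set contains it, while "cheap" gets a positive weight. Swap in a different set of documents and the same words get different weights, which is the "dynamic based on the result set" part.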

So let’s look at the made-up keyword example. In the case of a made-up term, the corpus is very, very small. In the SEL example, it’s also very, very small.

Now, in this instance, what should Google do? It has pages that contain that word, but they don’t have any traditionally heavily weighted ranking signals. Rather than return no results, the ranking factor weights are changed and the page is returned. That one link actually helps when there are no other factors to consider. Get it?

Think of it as kind of a breadth-first search for ranking factors. Given a tree of all the factors Google knows about, it first looks at the main ones. If they aren’t present, it goes further down the tree to the less important ones and keeps traversing the tree until it finds something it can use to sort the documents.
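Here is a rough sketch of that fallback idea, with the tree flattened into a simple priority list. The factor names, their order, and the sample pages are all hypothetical, and a real system would blend many signals rather than pick just one; this only illustrates the mental model described above.

```python
# Hypothetical factors, most important first.
FACTORS = ["pagerank", "anchor_text", "press_release_links"]

def rank_by_first_useful_factor(documents, factors_by_priority):
    """Sort by the first factor that actually tells the documents apart."""
    for factor in factors_by_priority:
        values = {doc.get(factor, 0) for doc in documents}
        if len(values) > 1:  # this factor distinguishes at least two pages
            ranked = sorted(
                documents, key=lambda d: d.get(factor, 0), reverse=True
            )
            return ranked, factor
    return list(documents), None  # nothing separates them; keep input order

# Made-up keyword: tiny corpus, no strong signals, one press release link.
rare = [
    {"url": "no-link.example", "pagerank": 0, "anchor_text": 0,
     "press_release_links": 0},
    {"url": "with-link.example", "pagerank": 0, "anchor_text": 0,
     "press_release_links": 1},
]

# Competitive keyword: the strong signal already separates the pages.
competitive = [
    {"url": "big.example", "pagerank": 8, "anchor_text": 5,
     "press_release_links": 0},
    {"url": "small.example", "pagerank": 2, "anchor_text": 1,
     "press_release_links": 9},
]

for name, docs in (("rare", rare), ("competitive", competitive)):
    ranked, used = rank_by_first_useful_factor(docs, FACTORS)
    print(name, used, [d["url"] for d in ranked])
```

In the made-up keyword set, the press release link decides the order because nothing above it differs. In the competitive set, the nine press release links on the smaller site never get looked at.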

It’s like choosing a car. First you decide SUV or car. Then brand, then manual or automatic. Then maybe the color, and finally it’s down to the interface of the radio. But what if the entire car lot only had red automatic SUVs? That radio interface would be a LOT more important now, wouldn’t it? Google is doing the same thing.

OK, point number 2. Still with me?

We need to stop analyzing every word @mattcutts says like it’s some lost scripture and start paying attention to the meaning of what he says. In this example, Matt was right. Press releases aren’t helping your site, because your site is probably going after keywords that exist on other sites, and since there are other sites, the press release link factor is so far down the tree of factors that it’s probably not being used.

Remember when Matt said that Page Speed was an “all else being equal we’ll return the faster site” type of factor? That fits perfectly with the tree and dynamic weights I just talked about.

Instead of looking at the big picture, the meaning, and the reasoning behind what Matt says, we get too caught up on the literal definitions. It’s the equivalent of thinking David and Goliath is a story about how there are giants in the world rather than a story about how man’s use of technology helps him overcome challenges and sets him apart from beasts. We keep taking the wrong message because we’re too literal.

That’s all I want to say. Feel free to leave feedback in the comments.

About Ryan Jones

Ryan Jones is an SEO from Detroit. By day he works as a manager of SEO & Analytics at SapientNitro where his team performs SEO for Fortune 500 clients. By night he's either playing hockey or attempting to take over the world with his own websites - which he would have already succeeded in doing had it not been for those meddling kids and their dog. The views expressed here have not been paid for and belong only to Ryan, not any of his employers or clients. Follow Ryan on Twitter at: @RyanJones, add him on Google+ or visit his personal website: www.RyanMJones.com

Comments

  1. Couldn’t agree more – folks can be way too literal, and there is a lack of critical thinking. Competitive advantage for the rest of us ;)

  2. Anthony says:

    Great point and analogy. Thanks for helping me see their algorithm a little bit differently now :)

  3. Best post I’ve read in weeks. Had to read a few sections over before it sank in and I understood what was said, but it means it’s making my brain think! Always the sign of a good read.

  4. Agreed on the SEL article and agree here. Thanks for following up with more.

  5. Expanding on the idea that we need to think about the meaning of what Cutts is saying, we should really just make sure we’re taking a step back and focusing on the issue as a whole. All in all they’re after providing relevant user searches. SEO’s should probably think more about actual indicators than how many points each link is worth, or whether it’s natural or not.

  6. Candice A-S says:

    Agree with point number 2! Why would you listen to Matt Cutts? If you have good analytical skills, critical skills, and a good collection of historical data to compare, you should be able to know what really works and doesn’t work for your site! I also agree with Mr. Shure’s comment; sad, but very true that common sense is the one factor that can give someone the competitive advantage.

  7. Alan Bleiweiss says:

    Ha. Let me channel MY inner @AlanBleiweiss here.

    WHY THE FUCK do people choose to completely IGNORE what Matt REALLY said? Even AFTER I covered it and referenced it in my last “Matt Cutts did NOT just kill another SEO kitten” article last year?

    Because what he SAID is REALLY what matters. Everything else is secondary.

    And what he said was not that press release links don’t count or matter. No he did NOT say that.

    He referred to LINKS FROM PRESS RELEASE SITES.

    HOLY CRAP people. Does ANYONE understand the subtle yet CRITICAL refinement of that statement?

  8. Alan Bleiweiss says:

    And for the record Ryan,

    I really am glad you’re in this industry because you’re one of the few who apply critical thinking to SEO and it’s why you’re as good as you are at your work.

  9. Well done rant given your corpus has a broken branch, Ryan! ;-) Typing with one hand you were able to channel Alan!

    Bottom line…when all things being equal. Your explanation of the weights and ranking factors is well done and clearly stated.

    Literal = SEO blinders

  10. Critical thinking is one of the most important competences in the field. Luckily for us, not everyone is capable of it, so we can profit from that. From a market perspective this is bad: too many people are distributing useless pieces of information.

  11. Thanks Alan!

  12. Thank heavens that there are free thinking folk like Ryan and Alan around who are prepared to call bullshit when they hear it and who can analyse from scratch. (Jim Hodson is another who deserves an honourable mention)

    For a while I got to the stage of ignoring any commentary on MC’s pronouncements because it was all so predictably ill-considered and misunderstood – hell I even stopped listening to him at all because actually most of it was simple and straightforward and hardly needed to be said. Sadly the industry is full of sheep who just follow the latest misquote or the latest so-called test and regurgitate it endlessly.

    I’ve read many descriptions of SEO tests over the years – some have even been quite inventive – but I can hardly think of any that actually proved anything useful. People need to realise that there’s a Heisenberg-like principle operating on such tests – the more refined they get the less useful they are about saying anything in the real world of normal search results.

    Keep ranting guys.

  13. I love your theory. And your interpretation of the David and Goliath story :)

    It was always funny to me when people try to calculate what percentage of the ranking is attributed to which ranking factor as it was obvious to me that it depends. It also depends on the search intent.

    My favorite example is in the eCommerce SEO field, as I spent 5 years doing that. I love when a client comes to me and tells me that his previous SEO made him put 7 paragraphs of size 6 font text about t-shirts under the products in the t-shirt category. It’s a clear sign of not understanding the search intent. No user searching for “buy t-shirts online” is interested in text about the origin of t-shirts. So, the keyword density and anything else related to the textual content of the page becomes pretty much irrelevant in the algorithm that’s supposed to retrieve the best set of results for that search intent.

  14. The fish and the pond theory...

    I totally agree. You see it a lot with content: when the content is, say, the same as or similar to an Amazon product description, the pool becomes so massive that you are basically ignored due to affiliates and syndication.

  15. I think we should stop listening to Matt Cutts entirely. Half of what he says is disinformation and propaganda anyway.