Keyword Research: We’re ALL Doing It Wrong

Okay folks, if I admit that the headline is just a touch “clickbait-y,” will you concede that you clicked because – in your heart of hearts – you’re just not that comfortable with your keyword research process? Why is that? I mean, if you’re paying attention to fellows like Sam Crocker and Richard Baxter, you should have a serious arsenal of keyword research & keyword selection tools and methodologies. Right?

The problem is those tools have the same issue anyone using Google data has: we may all be working from manipulated search data. And I think a lot of us have felt that in our bones for years. It’s time to confront some uncomfortable coincidences, contradictions, and facts about the search field – and even our own methods.

Inside Man

I spent some years doing SEO for law firm websites at a company that specialized in services for the legal vertical. The nice thing about that was we had really good market intelligence about which areas of practice were most valuable for our clients. The marketing team was also smart enough to create variable pricing that harmonized reasonably well with how competitive each market (city) would be online. A lot of carefully collected data and research went into marketing and product planning. When it got to us fulfillment people, I probably shouldn’t say much more than that it was interesting to see what was on an official keyword list and ranking well versus what was actually driving traffic and conversions. You should know that marching orders were to optimize against what drove leads in addition to what was on the official keyword list. I say this because we were successful enough to grow the business from a couple dozen clients when I started out to 1,000+ in a year or so.

Throughout that time, one of the dashboard metrics that HAD to be reported to clients was rankings for the official campaign keywords. At first we scraped, but we quickly decided we couldn’t scale it (cheaply), so we found a vendor to take over rank reporting. Most (if not all) specialists would still scrape and check rankings manually every day, either at client request or because they were anxious to see whether the links they’d built recently were having any impact. Keep this in mind.

Something Fishy In the State of Minnesota…

One day while looking for a way to integrate more data points into a keyword selection project, I noticed something very strange. Among the most in-demand legal services are:

  • Divorce law
  • Personal injury law
  • Bankruptcy law

While working with Google Insights for Search, I put these together with the seed term “lawyer” and the output really surprised me.

[Screenshot: Google Insights for Search regional interest map – “keyword research tool data is bullshit”]

Minnesota? The nation’s capital for broke and negligent divorcees? All at once? Something wasn’t right. I zeroed in on divorce and looked up divorce statistics by state, figuring I would learn something. As it turns out, Minnesota is not leading the nation in divorce; it’s not even in the top ten! Curious about the state of Google’s own data collection infrastructure, I decided to pit Google against itself and see what Trends had to say (there’s a thought-provoking article by Wil Reynolds that I recommend, citing some discrepancies within Google’s own tools for the same keyword).

Sure enough, Google Trends saw more interest coming out of Minnesota – specifically, the area of St. Paul.
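For what it’s worth, that sanity check can be sketched in a few lines. Everything below is hypothetical – the state figures and the `divergence` helper are invented for illustration; in practice you’d feed in a Trends export and real census or government statistics:

```python
# Sanity-check sketch: flag states whose relative search interest diverges
# sharply from an offline baseline. All figures are invented for
# illustration; real inputs would be a Google Trends export and
# census/government statistics.

# Hypothetical Trends-style regional interest index (0-100)
search_interest = {"MN": 100, "NV": 62, "TX": 58, "CA": 55}

# Hypothetical offline rate (e.g., divorces per 1,000 residents)
offline_rate = {"MN": 3.1, "NV": 5.6, "TX": 4.1, "CA": 3.9}

def divergence(interest, baseline):
    """States ordered by how far search interest outruns the offline
    baseline, with both series normalized to their own maximums."""
    max_i = max(interest.values())
    max_b = max(baseline.values())
    gap = {s: interest[s] / max_i - baseline[s] / max_b for s in interest}
    return sorted(gap.items(), key=lambda kv: kv[1], reverse=True)

for state, gap in divergence(search_interest, offline_rate):
    print(f"{state}: divergence {gap:+.2f}")
```

A state sitting at the top with a big positive gap – the way Minnesota did for me – is exactly the kind of outlier worth investigating.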

Thinking hard about what could create such high search demand where, ostensibly, there shouldn’t be as much relative to other states, I remembered how we used to collect ranking data: querying the FUCK out of Google. On a hunch, I looked up our primary competitor to see where they were based. Sure enough, competitor HQ is a suburb of St. Paul, MN. Did they find a way to scale their rank scraping better than we could? And hang on, surely this would be something Google would check for and filter anyway, right??
Here’s the thing. Remember that folks were still running queries on Google daily from our office IP, which geolocated to a different state from where we actually worked. This could amount to hundreds of queries per day from our IP – perhaps thousands per month. Take a look at that screen grab again – our IP’s location is definitely on there. And it, too, is far higher on the list than real census data says it should be.
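To put that in perspective, here’s a back-of-envelope sketch of what a single agency IP can pump into Google. Every number is a hypothetical assumption for illustration, not a figure from my old shop:

```python
# Back-of-envelope sketch of the rank-check traffic a single agency IP can
# generate. Every number below is a hypothetical assumption, not a figure
# from the shop described in this post.

clients = 30              # active campaigns being rank-checked manually
keywords_per_client = 10  # "official" keywords tracked per campaign
checks_per_day = 1        # specialists re-running each query daily

queries_per_day = clients * keywords_per_client * checks_per_day
queries_per_month = queries_per_day * 30

print(f"{queries_per_day} queries/day, ~{queries_per_month} queries/month")
```

For a narrow geo-modified phrase like [divorce lawyer st paul], a few thousand extra monthly queries from one location can plausibly rival the genuine local demand.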

Take a Look Around

Just last week I had a look around the weight loss vertical. I noticed a few states pretty well dominating – some of them showing more regional search interest than more populous states. Comparing this map to this map, once again things failed to add up. That is, until I discovered there’s a chain of weight loss centers that operates out of Texas, Georgia, and Florida.

Plug the following into AdWords and have a look at the suggestions and search volumes on an exact match basis:

  • Weight loss Houston
  • Weight loss Atlanta
[Screenshot: AdWords Keyword Tool suggestions]

Doesn’t the output look and smell like a keyword list put together by lazy, production-line keyword development rather than a reflection of true market behavior? Is it too much of a stretch to believe the output is being affected by furious and constant querying (for rank-checking purposes) from the aforementioned regional business, which has locally targeted web presences? Don’t think for a second it’s impossible to manufacture search volume, even unintentionally. As an aside, none of my research outside of Google yielded the same types of keyword suggestions.

Finding Refuge from the Noisy Crowd(sourcing)

I haven’t even touched on the well-known (but little talked about) way many search providers prospect for leads, but suffice it to say it’s only adding to the problem. I submit that what many of us have suspected for a while is now more demonstrably true: rank-checking and competitive activity are skewing keyword suggestions, keyword volumes, and even regional interest. If your keyword research process does not explore the world beyond Google, I truly worry about the shape of things to come.

To be clear, I’m not suggesting Google’s data is complete garbage. But for SEO, I tend to use it as directional and a source of inspiration. My PPC colleagues may find Google’s data to be much more spot on – I hear few complaints from them, but then they have a traffic estimator. Must be nice!

So what can you do to avoid creating a keyword list that is little more than a misleading pile of crap? I offer the following suggestions:

  1. Go beyond Google – Use other tools, other data providers, and read up on keyword research methodologies from the folks that do it seriously (*cough* *cough*).
  2. Know your space – Who are your competitors? What do you know about them? What are the relevant OFFLINE data points you can use to make sense of what you’re seeing come back from your keyword intelligence data providers? Census data worked gangbusters for me on a couple of occasions, by the way. The more information you have, the better you’ll be at spotting bullshit.
  3. Understand your target consumer – Follow them around the web. Go where they like to hang out and just eavesdrop a bit. What you find there usually yields better starting points for keyword ideas.
  4. Use existing performance data – I get giddy when a client has existing Analytics data that goes back at least a year – especially if they’ve enjoyed some visibility AND had some conversion mechanism on the website. Picking winners is almost academic at that point – take the ones that are contributing to the bottom line and blow those out first. Simplistic, I know, but if I attempted to do a better job than Nick Eubanks at explaining competitive keyword analysis, I’d likely go mad. Just go read that.
  5. Remember that no one person or entity knows it all – So don’t bank on one data provider, one tool or one loudmouth SEO (yours truly included) to guide you all the way through. Learn from lots of folks and try a bunch of things yourself and develop your own instinct. Once you’ve got the instinct, always keep your eyes open because you’re never done learning. But at least you’ll be able to filter out the nonsense.
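To illustrate suggestion #4, here’s a minimal sketch of picking winners from existing performance data. The records are invented; in practice they’d come from an Analytics export with a conversion goal configured:

```python
# Minimal sketch of ranking existing keywords by bottom-line contribution.
# The records are invented examples; real ones would come from an Analytics
# export with a conversion goal configured.

records = [
    {"keyword": "minnesota divorce lawyer", "visits": 900, "conversions": 4},
    {"keyword": "how to file for divorce in mn", "visits": 320, "conversions": 11},
    {"keyword": "divorce lawyer", "visits": 2400, "conversions": 2},
]

# Proven converters first; raw traffic breaks ties.
winners = sorted(
    records,
    key=lambda r: (r["conversions"], r["visits"]),
    reverse=True,
)

for r in winners:
    print(f'{r["keyword"]}: {r["conversions"]} conversions from {r["visits"]} visits')
```

Note how the highest-traffic head term lands dead last – exactly the pattern I kept seeing on official keyword lists.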

Final thoughts

Please note that I’m not leaving out some specific info about the companies and other pertinent details to be deceptive. If you’re clever at all you can figure most of it out. I’m being extra careful to not be interpreted as giving away anything sensitive. On a related note, please note my investigations were my own and NOT sanctioned or otherwise supported by any employer past or present. Now go put a skeptical eye on your keyword portfolio!

21 thoughts on “Keyword Research: We’re ALL Doing It Wrong”

  1. Hi David,

    The Google data supposedly filters automated bots / suspicious activity but I think we all know it’s still not totally accurate. Some time ago I did a study comparing clickthrough rates extrapolated from Hitwise vs UK Google data – the results were unsurprisingly variable – between 10% and thousands of % +/-!

    I do think the API volumes data has improved from an accuracy point of view, though as I’m sure everyone will agree a secondary data point to validate the search volumes is important.

    For the big stuff we tend to validate with Google Analytics referrals – the low hanging fruit filter mentioned (I believe) in the categorisation post you’ve linked to is very powerful. I’m also a big fan of testing the data alongside a PPC campaign – for our bigger clients we insist on this as part of the research process.

    Love that someone is stepping up and writing about KW research – nice work Dave!

  2. Thanks, Richard. I dropped a couple links to your stuff precisely because you take a data-intensive approach. Crazy about that Hitwise-Google study… It really is a shame just how shit keyword intelligence tools are for SEO – out of the box, anyway. I think you and a few others have proven they can be made functional.

  3. Nice article Dave. I remember those days of keyword rank scraping – never did we think about the skew in Google’s data as a result.

    Part of my keyword research these days is through a tool I built that pulls in analytics data from whatever site I’m working on and computes estimated traffic as a result of change in ranking (factors in current traffic, ranking, ctrs, etc). Shows me very clearly where my current opportunities are and where to focus my efforts. Of course the disadvantage of this tool is discovering new keywords to target.

    1. Thanks, Marc. Glad you’re weighing in. Like anything with Google, it’s tough to say I’ve got definitive proof on my hands. But the evidence just keeps pushing me in the direction I wound up taking here…

      Have you tried Richard’s Keyword Tool? Sounds like your homegrown solution works very similarly to his. Maybe you two should talk some shop?

  4. I’m currently sitting on a beach in Mexico and read this article because I knew it was going to be great. Boy was I right. Keep it up Dave this is excellent insight from a true SEO pro who is actually doing the work himself.

    Google should just provide their own ranking solution and not some garbage broken illusion of a ranking solution built into GWT as a temporary bandaid fix to a huge problem.

    What it does do is force the bigger businesses to pay to play within PPC to get data that’s even remotely close to accurate.

    Keep writing Dave!

    1. Wow, thank you, sir! Google really does have a legacy of treating SEOs like the red-headed stepchildren of the Search Marketing industry. I think it’s getting better, but ever so slowly…

  5. Interesting article Dave. I never really gave it much thought, but it makes perfect sense that all those SEO keyword queries could skew the data. Thinking about how often we ran searches, how our outsourcer scraped data, and then adding our competitors on top, it was probably a sizable chunk of all the natural queries Google would receive – and enough to change results.

    Being that SEO is still a growing field the problem will only get worse in time.

    1. We can’t say anything’s conclusive. Just REALLY coincidental and REALLY fishy… The only things that won’t mislead are results and real performance metrics.

    1. It’s definitely fishy if not conclusive. A question I never asked before I ran into this was “is this search volume Google is reporting coming from 1,000 people or 100,000 people?” Much different implications either way.

  6. So I’ve found a couple of cool ways to get solid keyword data. One method is to start a PPC campaign, take all the keywords I’m interested in, and run a strange PPC ad – basically, the title and body text both say “404 Error: website will not display.” I run that ad with a pretty high bid, so I get to see how many impressions the ad gets, and that gives me a better idea of how much traffic that keyword is getting (most of the time people don’t click on the ad, so it costs virtually nothing). Also, when I want to get keyword ideas, I ask my visitors. I use KISSinsights, install it on client sites, and set the survey to ask them “what would you type into Google to find this site?” Love this post and thanks for sharing, Dave! I’m def going to follow you on Twitter.

    1. Interesting ideas. What’s the response rate like on that visitor survey? Has it yielded insight that proved valuable? Props for bringing some new ideas to the table 🙂

  7. As a lawyer, I can tell you that qualified potential client leads rarely originate from ‘head’ keyword phrases like “Minnesota bankruptcy lawyer.” Instead, the qualified leads tend to come from long tail searches, something like “bankruptcy filing Minnesota lawyer who can protect homeownership,” the sort of jumbled collection of phrases that no one would ever intentionally target because it’s too specific.

    There is some degree of search funnel behavior, though, in that potential clients will often learn terms of art like “birth injury lawyer” as they begin their search, but in the end clients are rarely impressed by a high ranking for a generic keyword. Instead, they look for content that is on-point for the issue that matters to them. Never underestimate the persistence of someone looking for professional services: the better clients didn’t just click on the first result, they tried a variety of related phrases until they found relevant content, and only then started investigating the particular lawyer who posted the content.

    Can this same reasoning be applied outside of the professional services industry? I don’t know that. But I do know that, within professional services, anyone who is spending their time trying to rank for the short keyword phrase with the highest volume is just wasting their time and money.

    1. Thanks for your inside point of view, Max. Like I said in this post, it’s interesting (to say the least) to see what winds up on a keyword list because of what Google reports as valuable keywords, versus what actually moves the needle on leads and traffic.

      Agreed on the point about the search funnel as well. I’d add that I think many businesses in the services industry have a unique problem that other industries don’t: many customers have (or at least feel like they have) a unique problem that can’t be adequately described by a head term. Anyone who needs to travel on a budget knows they’re in the same boat as others looking for [cheap flights]. So visibility for that keyword could actually yield substantial conversions, making it worth the time/resources to earn a top spot. The game changes when your grandmother has slipped and fallen in the atrium of an office building owned by a holding company but operated by a property management vendor. That whole sentence could be the qualifiers you use to find a lawyer who’s had experience dealing with that problem successfully.

      Thanks again for the insight.

  8. Nice post Dave!

    @Richard Not too sure about Google being able to filter rank checkers using proxies based on unique IP addresses from different C classes etc.

    What I find very efficient is using Adwords keyword suggestions for generic terms and feed these terms into Google suggest. Then write onsite copy for the long tail terms revealed by Google suggest. Long tail terms in general convert better, especially for services.

  9. Good stuff. The best takeaway from this is to not trust any single point of data completely. I’m always triangulating using a number of different tools to see if they add up. ‘Directionally accurate’ is a beautiful thing.

    Similar to what Richard does I’m always looking for query classes.

    It’s really all about pattern recognition and understanding intent. Tools can get you half way there but you’ll need some brain power to get across the finish line.

  10. Funny how a post from 2011 can still hold true today years later. A great read and very well put together. Gave you a follow on Twitter 🙂

    I recall getting this feeling when I was working in the real estate industry. The keyword phrase “City Real Estate” was head and shoulders above others in search volume, but when we had our agents start polling actual people for how they searched for homes we found zip code queries really brought home the bacon. All those agents, and all their marketing teams, and all those WPG machines running queries for “City Real Estate” super inflated the numbers.
