Profile Information

I am a search scientist at Moz. I have 3 amazing daughters, Claren, Aven, and Ellis, an incomparable wife, Morgan, and am a Christian, Democrat nerd who often doesn't know when to shut his mouth :-)

Full Name: Russ Jones
Display Name: rjonesx.
Job Title: Principal Search Scientist
Company: Moz
Type of Work: Design / Development
Location: Cary, NC
Favorite Thing About SEO: research
Additional Contact Info: @rjonesx
Favorite Topics: Advanced SEO, Competitive Research, Keyword Research, Link Building, Technical SEO

Blog Comments & Posts

Paint by Numbers: Using Data to Produce Great Content

It happens to the best of us. Sometimes it is hard to find that inspiration for writing even though we feel a drive to produce more and better content to push marketing numbers. Sometimes we should stand back and let the numbers do the writing for us.

June 26, 2017
Google Search Console Reliability: Webmaster Tools on Trial

Can Google Search Console data be trusted? Everyone uses it, but does it really reflect what's happening on the web? Is it out of date? Is it biased? We take a hard look at several metrics in Google Search Console, measuring internal and external validity.

January 31, 2017
Google Keyword Planner's Dirty Secrets

Google Keyword Planner has some pretty scary skeletons in its closet! Learn about all the dirty secrets that should make you think twice about relying on Google's volume estimates.

December 1, 2015
Moz Transitions: Rand to Step Away from Operations and into Advisory Role in Early 2018
Blog Post: July 13, 2017
The Unspoken Reality of Net Neutrality
Blog Post: July 12, 2017
  • Russ Jones

    Net Neutrality, or in particular Title II, requires that "common carriers" not "make any unjust or unreasonable discrimination in charges, practices, classifications, regulations, facilities, or services for or in connection with like communication service."

    Those who fall on the other side of the debate tend to make a couple of arguments...

    1. Such restrictions will unduly burden Internet providers from innovating.

    2. The FCC might unfairly interpret this rule to demand further government intrusions into privacy.

    I fail to see how #1 is the case unless Title II is interpreted in ways that it historically has not been. And, while I am concerned about privacy (#2), it seems to me that the government already has plenty of tools to influence data sharing, and Title II hardly improves on that tool set.


  • Russ Jones

    I can't say that Moz as a company is doing more than having our employees write and edit a post, having the social team promote it, and putting up the banner on the homepage. I know our employees have been doing quite a bit on their own, though.

    I, too, was a little surprised to see things like individual subreddits (/r/videos) doing a lot more (they literally blocked access altogether) than the site as a whole. In retrospect, I personally wish Moz had done a little more, but I will put that on my own shoulders for not advocating for it.

    I do thank you for your active and sacrificial participation. Things change when good folks like you choose to shoulder an unfair burden to secure rights for all.

  • Russ Jones

    Hey Josh,

    Thank you for the thoughtful question. It is important that we get a good grasp of what Title II of the Communications Act of 1934 actually says...

    "It shall be unlawful for any common carrier to make any unjust or unreasonable discrimination in charges, practices, classifications, regulations, facilities, or services for or in connection with like communication service, directly or indirectly, by any means or device, or to make or give any undue or unreasonable preference or advantage to any particular person, class of persons, or locality, or to subject any particular person, class of persons, or locality to any undue or unreasonable prejudice or disadvantage. "

    Notice that the language begins with "unjust" or "unreasonable". Title II does not prevent ISPs from creating novel service offerings that appeal to customers (both subscribers and publishers). However, it requires that those service offerings meet the standards of being neither "unjust" nor "unreasonable", and gives the FCC regulatory authority to interpret what that means. It is highly improbable that the FCC would consider offering prioritization of 911 calls, telemedicine for veterans, or public school video feeds as "unjust or unreasonable discrimination", especially if that prioritization were offered for free. What is of concern is what telcos can do if they are allowed to be unjust or unreasonable.

    Let's give an example of what things might look like under Title II or not.

    Scenario: AT&T and Verizon, two of the largest landline phone providers, decide that they want to earn extra revenue from VOIP traffic in order to shift users to their landline business. Thus, they require a new premium of 20% extra to have VOIP enabled Internet access.

    With Title II, a customer, a VOIP business, or the local municipality's 911 call center could file a claim with the FCC and have it considered against this "unjust or unreasonable" language. The FCC can then require the organization to take "action[s] necessary or desirable in the public interest".

    Without Title II, a complaint can still be made, but the legal framework would largely disappear.

    Finally, it is worth noting that nothing about Title II prevents Congress from passing other laws which protect vital services like those I mentioned above, and I (although not speaking for Moz here) would support such an endeavor.

  • Russ Jones

    Thanks for the reply. I certainly agree it is much broader than a political issue, however I think at this point we do have a clear "political position" to advocate, which is to prevent the undoing of Title II protections in August.

Tackling Tag Sprawl: Crawl Budget, Duplicate Content, and User-Generated Content
Blog Post: May 24, 2017
  • Russ Jones

    Thanks!

    > noindex tag pages that had no impressions in GSC in the past 90 days

    This is a fantastic shortcut that I too have used in the past. The biggest concern is when tags might be seasonal, which was quite common in our data set for this problem. We expanded the horizon to an entire 12 months to avoid that problem (although it could still run into issues with words like Olympics or Election, which might be even less frequent).
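
    Below is a minimal sketch of that shortcut, assuming a 12-month Search Console performance export (columns: page, impressions) and a crawl-derived list of tag-page URLs. The file names and column names are hypothetical.

    ```python
    # A minimal sketch of "noindex tag pages with no GSC impressions", widened
    # to a 12-month window so seasonal tags are not penalized unfairly.
    import csv

    def load_impressions(gsc_csv_path):
        """Map each landing page to its total impressions over the export window."""
        impressions = {}
        with open(gsc_csv_path, newline="") as f:
            for row in csv.DictReader(f):
                page = row["page"].strip()
                impressions[page] = impressions.get(page, 0) + int(row["impressions"])
        return impressions

    def noindex_candidates(tag_urls_path, impressions):
        """Return tag pages that earned zero impressions across the full window."""
        with open(tag_urls_path) as f:
            tags = [line.strip() for line in f if line.strip()]
        return [url for url in tags if impressions.get(url, 0) == 0]

    if __name__ == "__main__":
        imps = load_impressions("gsc_12_months.csv")          # hypothetical export
        for url in noindex_candidates("tag_urls.txt", imps):  # hypothetical crawl list
            print(url)  # candidate for a noindex tag
    ```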

  • Russ Jones

    I think Jake Bohall, the project lead, knows more about these questions. I'll get him to respond when he wakes up :)

Not Your Dad's Keyword Tool: Advanced Keyword Research Use Cases
Blog Post: May 08, 2017
  • Russ Jones

    Hi David,

    Great point about seasonal data. We definitely are looking into it, although we want to do it right. Most tools out there show only the last 12 months, which isn't enough to confirm seasonality. We also rely on clickstream data to build our traffic models and won't have 12 months of that until September.

    That being said, we are definitely looking to add this kind of feature but we want it to be highly valuable, accurate, and actionable when we roll it out.

    Thanks!

  • Russ Jones

    That was the hope! Most of these types of uses are really just efficiency gains. There have been ways to do this kind of stuff in the past, but it meant building spreadsheets, populating data from multiple locations, writing your own aggregate formulas, etc.

  • Russ Jones

    Thanks! I think lots of SEOs do this kind of stuff to a degree, but Keyword Explorer lists just make it really easy.

  • Russ Jones

    Hey folks, if you have any questions you can ask here or hit me up on Twitter @rjonesx! Looking forward to it!

The State of Links: Yesterday's Ranking Factor?
Blog Post: April 25, 2017
  • Russ Jones

    Good points all around!

    I am still quite skeptical of "Brand Awareness". I think most of what we could consider Brand Awareness is captured in other metrics and wouldn't add much more in the long run, but I'm not quite sure.

  • Russ Jones

    That's kind of like asking for democracy to get so good it doesn't need votes.

  • Russ Jones

    First off, great article and great topic for discussion. It is refreshing to read thoughtful content like this. I thought I'd drop in some ideas for consideration.

    > We should take with a large pinch of salt any study that does not address the possibilities of reverse causation, or a jointly-causing third factor.

    We need to be careful to do this for all factors. Just as much as rankings might cause links, brand might cause links too. Massive advertising campaigns, newsworthiness, and other real-world interactions via brands can influence the link graph.

    Moreover, brands can produce spurious correlations with rankings in other ways. For example, you could assume that brands spend more on on-site SEO or faster websites, which could influence rankings. Brands might already have a relationship with the customer, creating higher engagement from the SERPs, which might influence rankings.

    I think we have to proceed with extreme caution before moving to any conclusion that "Google uses brand measurements as a ranking factor", even if it seems to have a great deal of explanatory value.

    Finally, we do have a weapon against these types of issues - experimental studies. There are dozens of studies available on the web (and more conducted privately) which show a clear, causal relationship between link acquisition and improved ranking.

    > Flux and CTR

    I actually think that Google's addition of SERP features will increase the CTR for top results, rather than "drive the opposite". As a searcher makes a quick observation of the results revealed to them, if only 1 organic result appears (or the first 2 or 3), they may be more inclined than ever to click on those results rather than scroll and find the remainder. Moreover, as Google limits the number of total organic results on the initial results page (often 7 or 8), one would expect a steeper curve.

    > brand awareness seems to explain away most of their statistical usefulness

    Or, brand awareness just happens to correlate with a number of other actual ranking factors (links, content quality, site speed, user engagement).

    > For competitive queries with lots of search volume, links don’t tell Google anything it couldn’t figure out anyway

    But then again, we see sites like PolicyGenius ranking in the top 10 for "Life Insurance", a highly competitive, highly valuable term, despite having little brand presence relative to the big insurers.

    > user signals

    I have seen controlled studies of user signals like CTR give modest ranking improvements (1 position in a month, for example), but I am not convinced these are sufficient to explain much about the top rankings. I don't mean to say they don't matter (I am almost certain that Google does measure these in one form or fashion), but call me skeptical as to their prominence.

    > However, links may still be a big part of how you qualify for that competition in the top end.

    I think this is a fair statement. I'm guessing Google has studied the relationship between each of its myriad ranking factors and user satisfaction. I imagine that links are not a highly sensitive factor in this regard (that is to say, a page with 100 backlinks is not significantly more likely to please a user than one with 99). We could even imagine several types of relationships between links and rankings, such as a percent-based approach where your backlink score is relative to other matching documents: the score for Documents A1 and A2, with 1,000 and 2,000 links respectively for query A, might be the same as for Documents B1 and B2, with 100 and 200 links respectively for query B. If a ratio of 1:2 is what matters, other factors like content quality or user metrics might have an identical impact, even though at first glance we would expect 2,000 links vs. 1,000 to confer a wider advantage than 200 vs. 100 on raw count alone.
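
    To make the hypothetical concrete, here is a toy sketch of that percent-based idea: score each document by its share of the links among documents competing for the same query. The function and the numbers are purely illustrative, not anything we know Google does.

    ```python
    # A toy illustration of the hypothetical percent-based scoring above: if the
    # link score is computed relative to the other documents competing for the
    # same query, then 1,000 vs. 2,000 links for query A and 100 vs. 200 links
    # for query B produce identical scores, and the raw-count gap stops mattering.
    def relative_link_scores(link_counts):
        """Score each competing document by its share of the query's total links."""
        total = sum(link_counts.values())
        return {doc: count / total for doc, count in link_counts.items()}

    print(relative_link_scores({"A1": 1000, "A2": 2000}))  # {'A1': 0.33..., 'A2': 0.66...}
    print(relative_link_scores({"B1": 100, "B2": 200}))    # identical scores to A1/A2
    ```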

    Ultimately, I think that Google's continued move toward machine-learned algorithms means teasing out individual ranking factors via correlation studies will become more and more difficult. The relationships between factors can vary from query to query. Our best bet, in this regard, is to experiment and moderate. That is to say, we shouldn't rely on a single factor (link building, for example) once we determine its effectiveness.

Google Search Console Reliability: Webmaster Tools on Trial
Blog Post: January 31, 2017
  • Russ Jones

    Awesome, thank you for your response...

    1. Regarding GSC and rare keywords: Great find, but it still means that GSC's data is incomplete (albeit intentionally). This could be particularly problematic for a site with a good deal of long tail traffic. If most of their keywords are rarely searched, but there are a sufficient number of them, they could appear to have little to no traffic in GSC when actually accumulating a decent stream of visitors via organic.

    2. Clicks in GSC vs GA: No disagreement here, as I concluded "After analyzing several properties without similar problems as the first, we identified a range of approximately .94 to .99 correlation between GSC and Google Analytics reporting on organic landing pages. This seems pretty strong." The differences here really are trivial at the aggregate level.

    3. Conclusion about impressions in GSC: I stand by this conclusion. Our experiments clearly demonstrated that at least some impressions are ignored (which you admit is the case for rare keywords) and that the collection methods render the data rather meaningless. This is why I said "misleading at best, false at worst".

    Thanks for your thoughts and the links!

  • Russ Jones

    Great Questions

    > About your test on how Google is tracking impressions and clicks - how big was the site that you tested it on? If you tested this on a large, high traffic site, I think it would also be interesting to test on a smaller, low traffic site to see if there are any differences.

    The experiment was done on a low traffic site, but the comparative method was tested on sites ranging from a few visits a week to hundreds of thousands per week. I think it was a fairly diverse set.

    > On your correlation of Rank+Impressions to Clicks – did you set the country when looking at GSC average rank? Your rank tracker is probably looking at Google US, so I was just wondering if you were only looking at US ranking data in GSC? Was mobile or desktop specified in GSC to match your rank tracker data?

    Good question. I did not differentiate. However, it would be even more bizarre if desktop, US-only tracking did a better job of predicting GSC click numbers (which blend desktop, mobile, country, etc.) than GSC's own appropriately blended ranking. Still, it is worth taking another look! Great catch!

    > We know that Google only records the ranking position when it gets an impression, so in your first example with the piece of content moving from position 80, to 70, to 60, and eventually to position 1, it would obviously be getting a lot more impressions at position 1 than position 80.

    I would agree with you if it were not for the experiment we ran which showed that Google doesn't count a large number of impressions (out of 84 impressions delivered, it only showed 2). Bizarrely, it showed hundreds of impressions for those same landing pages except for keywords that ranked in the 80+ position!

    At any rate, I think the critique is fair to consider, and perhaps my explanation of why GSC is untrustworthy in this regard isn't quite correct, but it still stands that GSC is untrustworthy in this regard; the cause just isn't yet explained.

    Thanks for the really bright and thoughtful critiques / insights. You have a fantastic mind for this sort of thing, I'm impressed!

  • Russ Jones

    I tested that briefly but didn't complete it in time for launch. I found that the numbers were actually fairly accurate IF you grepped your log files carefully. You have to exclude any non-HTML entry in the logs, you have to remove redirects, you have to sync the dates appropriately to GMT, etc. It took several steps, but eventually the numbers came out fairly close. That being said, this was for smaller sites and not enough for me to include in my evaluation above. (I've sketched the cleanup steps below.)

    Maybe you could write a post on that ;-)
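
    For what it's worth, here is a rough sketch of that cleanup, assuming an Nginx/Apache combined-format access log. The file name and asset-extension list are assumptions, and the GMT date sync is flagged in a comment but omitted.

    ```python
    # A rough sketch of the log cleanup described above, before comparing daily
    # page hits against GSC: skip non-HTML asset requests and redirect responses.
    import re
    from collections import Counter

    # Matches an Nginx/Apache "combined" log line far enough to get what we need.
    LINE = re.compile(r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3})')
    ASSET_EXTS = (".js", ".css", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", ".woff")

    def daily_page_hits(log_path):
        """Count per-day hits to HTML pages, skipping assets and redirects."""
        hits = Counter()
        with open(log_path) as f:
            for line in f:
                m = LINE.match(line)
                if not m:
                    continue
                path, status = m.group("path"), int(m.group("status"))
                if path.lower().endswith(ASSET_EXTS):
                    continue  # exclude non-HTML entries
                if 300 <= status < 400:
                    continue  # remove redirects
                day = m.group("ts").split(":")[0]  # e.g. "31/Jan/2017", in the log's
                hits[day] += 1                     # local time; GMT sync omitted here
        return hits

    print(daily_page_hits("access.log").most_common(5))  # hypothetical log file
    ```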

  • Russ Jones

    Good question.

    1. I like comparing landing pages from GSC to GA, because a discontinuity means there are redirects Google's index hasn't picked up yet.

    2. I think it is valuable for finding new keywords.

    At this point, if you can afford a 3rd party rank tracking solution, I would recommend it for just about everything else.

  • Russ Jones

    Thanks for your response. A couple of thoughts...

    "HTML Recommendations: I seriously doubt anyone at Google thinks you should be using this tool determine whether you have valid HTML"

    I agree, but certainly many Webmasters think they should use this tool to do just that. My intent is to inform webmasters why the data isn't wholly trustworthy. I say just as much in my conclusion to that section... "Given this difference, it almost always makes sense to crawl your site for these types of issues in addition to using GSC"

    "Index Status: GSC is not going to be perfect here either, but I have found it to be pretty darn close"

    I'm not sure what you are complaining about here. I said just the same thing... "I think it is safe to conclude that the Index Status metric is probably the most reliable one available to us in regards to the number of pages actually included in Google's index."

    "Internal Links: I think this is the MOST important report in all of GSC"

    Good, so do I. Again, I refer you to the conclusion of the section on internal links in which I write... "As search marketers, in this case we must be concerned with internal validity, or what Google believes about our site. I highly recommend comparing Google's numbers to your own site crawl to determine if there is important content which Google determines you have ignored in your internal linking."

    I have to admit, I am really confused. I agree with everything you have said and my research as presented above does as well. Did you read the article? Perhaps I didn't make my conclusions clear enough.

  • Russ Jones

    I think the inaccuracies are often just byproducts of the data collection and presentation methods. Take, for example, the HTML Recommendations. Chances are, at some regularity, Google produces a list of issues and ports it into GSC for you. However, less often does GSC prune those examples which are no longer an issue. So you end up with outdated material, both in reference to the web and to Google's current index.

    I don't think any of these are malicious on Google's behalf by any stretch of the imagination.

  • Russ Jones

    My intent here was simply to show that rank tracking, with all its flaws, still predicts actual clicks better than the "average position" in GSC. You can get better data in GSC by clicking into a keyword and seeing the number of impressions at individual positions, but even then the data can be problematic. Thanks for the comment!

  • Russ Jones

    Good tip!

  • Russ Jones

    This is a good tip. A lot of us correlate rankings->keywords but it isn't an exact science. Thanks!

  • Russ Jones

    You weren't a sheep :-) My guess is that Google doesn't test the validity of this information in the way that we do - they simply provide the best information they have at the time. The question still remains whether their best is good enough for our purposes. It appears that sometimes it isn't.

Google's War on Data and the Clickstream Revolution
Blog Post: November 07, 2016
  • Russ Jones

    Unfortunately, clickstream data is quite expensive, but we get to split that across our huge customer base, so hopefully even the little guy can stick it to the man, so to speak, by using our tools!

  • Russ Jones

    I agree with this "I think this is another example of the need for SEO strategy to target topics and people instead of looking for that perfect keyword phrase" but we also have to watch out for issues caused by aggregate data. Let's say you want to figure out what topic to write on. You find all the keywords related to that topic and see that it has a good bit of searches. But can you trust that aggregate number? If you aren't careful, you could end up with "car part", "car parts", "cars part" "cars parts" all showing huge numbers and being added together, giving you a huge misrepresentation of volume for the aggregate topic. In the end, getting data right always helps. And we hope to do just that.
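
    Here is a toy sketch of that pitfall and one way to guard against it: collapse plural and word-order variants onto one key before summing, so the variants don't each contribute their largely shared volume. The trailing-"s" stemmer is deliberately naive and purely for illustration.

    ```python
    # A toy sketch of the double-counting pitfall with aggregate keyword volume.
    def variant_key(keyword):
        """Normalize a keyword so plural/word-order variants share one key."""
        stems = sorted(w.rstrip("s") for w in keyword.lower().split())
        return " ".join(stems)

    def topic_volume(volumes):
        """Sum volume per variant group, keeping only the max within each group."""
        groups = {}
        for kw, vol in volumes.items():
            key = variant_key(kw)
            groups[key] = max(groups.get(key, 0), vol)  # don't double-count variants
        return sum(groups.values())

    vols = {"car part": 9000, "car parts": 12000, "cars part": 800, "cars parts": 1100}
    print(sum(vols.values()))   # naive topic total: 22900 (badly overstated)
    print(topic_volume(vols))   # deduplicated estimate: 12000
    ```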

  • Russ Jones

    And our data will continue to get better and better over time as well! We are constantly refining and building models behind Keyword Explorer.

  • Russ Jones

    I have long shared this concern as well. Being an amazing business is not the same as knowing how your users search for your products or services. This means that there is a disconnect between users and the solutions that could most make them happy. Google would like to force businesses to use its paid platform to bridge this gap, but that ultimately hurts users.

  • Russ Jones

    I think this will be the case going forward. Wherever Google can replace natural results with paid ones and consumers don't seem to mind, it will. We just need better tools to help us uncover what opportunities remain.

  • Russ Jones

    Thanks for your response! Unfortunately, we are finding that the Search Console query report isn't particularly accurate (especially with respect to rankings). It seems everything must be looked at with a suspicious eye.

Moz Keyword Explorer vs. Google Keyword Planner: The Definitive Comparison
Blog Post: October 31, 2016
  • Russ Jones

    I don't think any tool could ever do all the work that Eric does in his keyword analyses :-) but it is certainly worth aiming for. Our future versions of Keyword Explorer will have the kind of data required for the type of analysis that gives webmasters a competitive edge. I think the dev team is already on the right track though :-)

Google Keyword Unplanner – Clickstream Data to the Rescue
Blog Post: August 16, 2016
  • Russ Jones

    If you run large enough active campaigns in AdWords, supposedly you will be unaffected by Google's new restrictions.

  • Russ Jones

    Our clickstream data and corpus in the UK are much smaller than in the United States, which makes it harder for us to find matches and then to fix them. But the good news is we are working on 100x-ing our clickstream data, including in the UK/AU/CA, so it is coming!

  • Russ Jones

    Hey Adam, thanks for your question!

    Google has actually always had ranges, but they were far more granular, and they were displayed as the median of the range rather than the range itself. The new limits from Google are far more obfuscated, like 10-100, 100-1,000, 1,000-10,000, 10,000-100,000. Ours are far more granular, more akin to what Google's used to be. Most importantly, our ranges are built around giving you maximum predictive power throughout the year. Our range shows the likely range that the volume will fall into every month of the year, rather than giving you one number that is never right.

    We have batted around the idea of giving a mean, median, mode, standard deviation, etc. for the volumes, but that is something we would likely only expose in the API. It is definitely on our minds though.

    Thanks!

  • Russ Jones

    We purchase data from 3rd party providers that monitor, distill, and anonymize real user data.

  • Russ Jones

    Great question! Potential is based on an algorithm developed by Dr. Pete which combines search volume, opportunity, and difficulty. In both cases, the volume and opportunity outweigh the difficulty enough to give them decent Potential scores. Of course, "high" potential is relative to the industry - keywords like "keyword tool" and "seo tips" have higher Potential scores (by just a little).
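
    For intuition only, here is a sketch of how a blended metric like that could let volume and opportunity outweigh difficulty. The weights and scaling are hypothetical choices for illustration; this is not Dr. Pete's actual formula.

    ```python
    # A purely illustrative blend of volume, opportunity, and difficulty,
    # weighted (hypothetically) so strong volume/opportunity can outweigh
    # high difficulty. Not the real Potential algorithm.
    def potential(volume_score, opportunity, difficulty):
        """All inputs on a 0-100 scale; higher difficulty drags the score down."""
        return (2 * volume_score + 2 * opportunity + (100 - difficulty)) / 5

    # Strong volume and opportunity can outweigh high difficulty:
    print(potential(volume_score=90, opportunity=70, difficulty=85))  # 67.0
    print(potential(volume_score=40, opportunity=60, difficulty=30))  # 54.0
    ```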

  • Russ Jones

    Glad to hear. I think you will be surprised at the rigor behind our data.

  • Russ Jones

    Thanks for the kind words! I'm not sure why Google is making the changes either. My guess is that they want to funnel people into using Traffic Estimator more (where you give a bid and get back estimated results). I'm not quite sure why, and until they give good data there it seems like they haven't left us with any good choices.

  • Russ Jones

    Thanks, I appreciate it. I understand why Google would want to make data hard to access for people who aren't running campaigns, at least after a grace period. However, removing the data for people with long-term, active campaigns seems like a low blow. I hope we can help you fill in the gaps.

  • Russ Jones

    I believe we have one of, if not the, largest keyword volume data sets in the industry, but it certainly will never be as large as Google's. If you let us know what industries you see affected via a support ticket, that would be really helpful!

  • Russ Jones

    That is great to hear!

  • Russ Jones

    Thanks! We have seen it roll out slowly, with some accounts unaffected yet. It is hard to tell whether there is something different about those accounts that keeps them unaffected, or whether Google just hasn't gotten around to rolling it out fully.

  • Russ Jones

    I don't doubt it! And what is frustrating for SEOs is that the search results for those two phrases are quite different, which means you have to optimize for them differently. They have different difficulties and different search volumes, but Google obscures it. Luckily, Moz Keyword Explorer gets it right: "website designing" is only searched a few hundred times a month, while "web design" is searched nearly 100K times.

  • Russ Jones

    Thanks for the info!

  • Russ Jones

    Ohhhh! I thought you were referring to the issue where in one place we said no data and another place we said 0. I get it. I'll talk to the team about switching out the wording. Thanks!!!!

  • Russ Jones

    Thanks for the response. This was a problem where we were inconsistent with our language in the tool, but I believe it has been fixed for all future lists (and probably if you refresh your list). Basically, it should say "no data" everywhere we don't have data.

    We do, of course, recommend words that have 0-10 search volume as well. People use KWE for a lot of reasons, including finding relevant words and phrases to use in their content to be more topically authoritative (even if the terms themselves are not searched). That being said, you should be able to rely on the "no data" message in the suggestions and simply not include them in your lists. You can also sort by Volume if you want to ignore those altogether.

    Thanks!

  • Russ Jones

    Hey folks, thanks for reading my post. I wanted to start an informal survey - have you started seeing volume ranges in Google Keyword Planner, and do you have an active campaign in Google AdWords?

Sweating the Details - Rethinking Google Keyword Tool Volume
Blog Post: May 17, 2016
  • Russ Jones

    They have the exact same drawbacks as Keyword Planner in the United States. That being said, we are working on volume for the UK, CA, and AU right now, actually!

  • Russ Jones

    Thanks, there are definitely tradeoffs and we do our best to make smart ones.

  • Russ Jones

    Thanks for the question! Our data comes from two sources, Google and raw clickstream data. We only use Web search, not vertical engines.

  • Russ Jones

    Great points as well! There is a lot that Google Keyword Planner just doesn't do!

  • Russ Jones

    Hey folks, I'll be around to answer questions you might have! Hope you enjoy the read!

Announcing Keyword Explorer: Moz's New Keyword Research Tool
Blog Post: May 03, 2016
  • Russ Jones

    Hi Jack, thanks for your response. I think you will be pleasantly surprised by Moz Keyword Explorer. I am personally well versed with the vast majority of keyword tools out there and I don't think there are any that offer this combination...

    1. 2 Billion+ keyword corpus

    2. 500 Million+ SERP crawled corpus

    3. Keyword Opportunity metrics based on real clickstream data and SERP features

    4. Keyword relationship based on multiple semantic and word-relation algorithms

    Just to name a few. It really is worth a look.

  • Russ Jones

    That's a great idea!

  • Russ Jones

    You read our minds! We have already been brainstorming 2.0 features and hopefully our savvy customers will keep bringing more ideas to us. Thanks!

  • Russ Jones

    Thanks! If our existing customers can use our tools to grow their businesses, they will use our services more. We see it as a win-win mutual growth strategy. And just plain old fair.

  • Russ Jones

    You aren't kidding. When I was at Angular before Moz, it was all about combining multiple keyword sources, multiple difficulty measurements, and traffic data from Keyword Planner (there was nothing even close to "Opportunity"), then shoving it all together in a spreadsheet and guessing how it worked together.

Google Keyword Planner's Dirty Secrets
Blog Post: December 01, 2015
  • Russ Jones

    We can get at some of this semantic mapping by using SERP analysis. Be on the lookout for this in Keyword Explorer.

  • Russ Jones

    It is, and it should be. People should be aware of a tool's limitations, but just because it is imperfect doesn't mean it isn't incredibly useful.

  • Russ Jones

    We weren't able to determine an exact cutoff for why something gets in a particular bucket because we don't know the exact number of searches :-) That number is never told to us, unfortunately, only the bucket.

  • Russ Jones

    It is back up now :-)

  • Russ Jones

    Thanks for the heads up - it is up again.

  • Russ Jones

    It is best to think of GKP as a 10,000-foot view, so to speak. It just isn't very specific.

  • Russ Jones

    The data is just acquired differently between the two, AFAIK. For example, GSC registers a click on the click event itself, not on whether the user waits for the page to load and fire GA. Moreover, the user might click the same link multiple times; GA might count that or exclude it as not a unique visit. It is really hard to know. The collection methods are just so different that we would expect there to be some pretty big differences.

  • Russ Jones

    When you find related keywords, click on keyword ideas. Click the download all and choose "segment by month". This will give you the last 12 months in your export.

  • Russ Jones

    I agree that it is useful, but hopefully now it can be more useful!

  • Russ Jones

    It is still the best free tool around, but there are better paid ones, IMHO, and an even better one coming out ;-)

  • Russ Jones

    This is one of the fuzzy questions we don't really know about. It is possible that an "impression" is different between Search Console and AdWords. For example, maybe an "impression" in Search Console is every time someone sees the SERP, even if they hit refresh or the back button to get there. In AdWords, it might be every unique impression in 30 seconds, not counting back-button presses. We don't really know, but it can certainly account for discrepancies.

  • Russ Jones

    I don't believe it is, but it is worth a shot. For what it is worth, Bing's data has much more granular buckets.

  • Russ Jones

    Yeah, it isn't unusable but it comes with some pretty big gotchas.

Good News: We Launched a New Index Early! Let's Also Talk About 2015's Mozscape Index Woes
Blog Post: November 12, 2015
  • Russ Jones

    You've actually got our attention. We have started an internal discussion about holding on to untrustworthy links, almost like a supplemental index, for the purpose of helping webmasters with certain issues like link removal problems. Great idea!

  • Russ Jones

    I would have to talk to the team behind the MozBar, but my guess is that you are just getting cached data that will refresh within a certain amount of time.

  • Russ Jones

    Thanks for the response. There are certainly risks with a smaller index like it being "gameable", although anyone who is paying their SEOs based on their ability to push up PA/DA is making a horrible mistake :-) With spam-score in place, it is easier than ever for clients to see if their SEOs are building trustworthy links in OSE, so hopefully we are mitigating some of those concerns.

    That being said, while it is easier to manipulate a single metric within the system, this pales in comparison with the literally billions of URLs (and tens of billions of links) that a large index will pick up that are not in Google's index at all, throwing off the link graph as a whole.

  • Russ Jones

    Hey folks,

    I wanted to chime in with some early quality tests we have been running against the index. Bear with me for a moment because I will have to give some context. There is a tl;dr at the bottom, though, for those of you who want to hastily skip my beautiful prose.

    The Crawl Paradox: Bigger, Not Better

    Before I joined Moz, I released a study that pointed out a somewhat paradoxical situation: it seemed that the deeper you crawled the web, the less the index looked like Google's. Upon further investigation, this made sense. All crawlers have to prioritize which page to visit next. Small differences in how we (and our competitors) prioritize pages vs. how Google does would cause the crawlers to diverge from one another more and more as they crawled deeper and deeper. Thus, the path to building a link index like Google's meant less focus on the biggest index, and more focus on an index that prioritizes, schedules, and crawls like Google. Quality had more to do with the shape of our index than its size.

    Measuring Quality:

    After coming to Moz in late August, I had the privilege to partner with a number of individuals and teams looking at index quality - big shout out to Neil and Dr. Matt for their help! While I am not even close to qualified to handle the kind of work the Big Data team addresses, I have had the opportunity to work on measuring the results. While we have built up a number of metrics to determine index quality, two that I kept a close eye on for this release were the relationship between Page Authority and rankings, and our hit rate for pages in Google's index (i.e., are pages that rank in Google's search results also in Mozscape?). This one is also potentially paradoxical in that - bear with me here - NOT having coverage in Google can artificially boost your correlation metrics.

    Imagine if you only crawled 1/10th of what Google did. Chances are, that 1/10th would be the most important pages because you would find them faster since they have so many links. You'd find Facebook, Wikipedia and Youtube really quickly in your crawl, but probably not joeschmoesblog.com. These popular pages would likely rank in the top 2 or 3 for important search phrases. Your small index would probably not include many of the pages that rank #8, #9 or #10. In fact, you could imagine that as you dropped down from #1 to #10, the odds that the page is in your index at all would drop dramatically. Thus, when you calculate quality metrics, the pages ranking near the bottom would mostly have 0s because they aren't in your index at all, and the ones at the top will likely have at least some authority. What happens when the ones at the top have some authority, and ones at the bottom have none? You get a positive correlation not because you have figured out Google's ranking factors, but because you just happened to not index the lower ranking pages!
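
    A quick simulation makes the point, under assumed numbers: give every page a random authority score that is unrelated to rank by construction, but index pages with a probability that decays down the SERP, and a rank/score correlation appears anyway.

    ```python
    # Simulates the coverage paradox described above. Coverage probabilities
    # (92% at #1 down to 20% at #10) are assumptions for illustration.
    import random

    random.seed(42)  # reproducible illustration

    def pearson(xs, ys):
        """Plain Pearson correlation, to avoid any external dependencies."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy)

    ranks, scores = [], []
    for _ in range(1000):          # 1,000 simulated SERPs
        for rank in range(1, 11):  # positions 1-10
            indexed = random.random() < 1.0 - 0.08 * rank  # coverage decays with rank
            authority = random.uniform(1, 100) if indexed else 0.0  # 0 = not indexed
            ranks.append(rank)
            scores.append(authority)

    # Prints a clearly negative correlation (higher rank number -> lower score),
    # i.e. "predictive" authority manufactured purely by uneven index coverage.
    print(pearson(ranks, scores))
    ```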

    So, here was my concern: Were our measures of predictive power going to plummet because we were doing a better job indexing the stuff Google indexes? Were we going to win one battle and lose another? Or could we increase our coverage of URLs in Google search results AND simultaneously maintain, or even increase, our predictive capacity from Page Authority?

    Hell yes we can!

    The first thing I did when I heard that the index was live was re-run our quality metrics on these two issues. And the verdict is in. We both increased our hit rate in Google Search (by about 3%) AND we increased the predictive capacity of Page Authority (by about 1.7%).

    This is exciting for a number of reasons. First, Moz's index is based on a rolling crawl. The improvements made by the team have only had a fractional impact on this index, the next 2 should be even better because the changes already made will have more and more influence over the whole index. Second, it shows that we can consciously increase the quality of our index relative to Google's and, finally, it shows that links still matter a lot :-)

    tl;dr:

    Despite a slightly smaller index, both our hit rate against Google indexation went up and our correlation with rankings went up. This is the definition of focusing on quality over quantity - we got the good stuff that matters to Google and quantified it better than before. Kudos to the Big Data team and everyone at Moz for putting Mozscape on the right path.


  • Russ Jones

    Thanks for your thoughts. We recognize that a bigger index has its values, but it also comes at the cost of less trustworthy metrics. If you have a chance, take a look at the research I did prior to joining Moz describing the inverse relationship between index size and proportional overlap with Google's index.

    That being said, putting my SEO hat back on for a second: if you are concerned about finding each and every link, you need to be using every link index out there, including Moz. There is a huge disparity between the major link indexes in who is blocked by which sites. You would be surprised at how many sites block one or more of the link indexes' bots. There are literally millions of pages that you would miss were you to subscribe only to Moz plus one other, or to the others individually.

    It is worth saying that the Big Data team has put in place improvements that will allow steady index growth over time. We will put up big numbers again, but not at the expense of quality. Our goal isn't just to be the longest list of links out there, rather the best.

Using Social Media as Your Primary (or Only) Link Building Tactic Probably Won't Work - Whiteboard Friday
Blog Post: October 02, 2015
  • Russ Jones

    Great WBF!

    I think there is no substitute for sweat equity through direct outreach. Even if we replace social with "engagement" on the flywheel, there is another, more important amplification factor, which is direct outreach. Moreover, I think that the direct-outreach angle needs to be considered early in the content creation process - what is the pitch, and to whom - so that the content produced is not only engaging but link producing. This is why techniques like broken link building and ego/vanity baiting are so effective - they answer the direct-outreach question early on, knowing that there is a group of webmasters who will care enough to link.

Why I Stopped Selling SEO Services and You Should, Too
Blog Post: October 07, 2015
  • Russ Jones

    I'm going to disagree, only because it is fun :-)

    But honestly, don't those mega brands, aggregators, news outlets, etc. employ SEOs or SEO agencies? And what about technical SEO for large or complex sites?

    And can you please explain what the difference is between beating the search algorithm and using it to your advantage, the final conclusion of the piece? It sounds like an equivocation to me.

    There is still a place for selling SEO; it's just that cheating and scheming are becoming less a part of the craft.

    Also please don't fire me. :)



Million Dollar Content - An Analysis of the Web's Most Valuable Organic Content
Blog Post: September 29, 2015
  • Russ Jones

    I think this is better, but still possibly insufficient. Most infographics attempt to condense content and let the graphs, data, etc. tell the story. Content needs to be somewhat verbose to capture all the related words and phrases.

  • Russ Jones

    I put Garrett's in the middle because most people just read the beginning and skim to the end.

  • Russ Jones

    Yep, if you notice there is a hummingbird graphic later on in that section. To be fair, though, the word "bed" occurs 33 times on that page ;-)

  • Russ Jones

    Thanks for your comment. When I use the word "content", I include most of the things you are mentioning - such as keyword usage - but I think Google is using far more sophisticated techniques to get at the question "what is quality content" than raw metrics like the number of words in a document (text length).

  • Russ Jones

    I think you are certainly correct here, although I am not convinced that Google is using any social metrics directly in their rankings algorithms. There is likely a natural relationship between the industry and social performance expectations, though, regardless of whether or not Google implemented it directly in their algorithm. A highly shared, highly social industry will be more competitive in that space, thus it will simply take more to stand out from the crowd so to speak.

  • Russ Jones

    Thanks for the heads up! I have fixed it!

  • Russ Jones

    Thank you for your comments! I agree that readability of data/information is important, but I think at best those types of metrics are acquired by Google via user-engagement. I don't think Google can easily assess the "readability" of a graphic, or more specifically its usefulness to users. Instead, I think that content specificity really matters, which is why I used the description "on-point". The content is exhaustive but it is not excessively verbose. Another way to describe it is that the content is dense with related terms, rather than long and sparse.

    My vote on what is king? Esse quam videri (to be, rather than to seem).

  • Russ Jones

    Hey Ali,

    I would be hesitant to make an infographic a primary strategy for developing long-term, successful evergreen content like that which we discuss above. Infographics did not feature prominently in any of the pieces of content we studied (although a handful had infographics on the page as part of the total strategy).

    The biggest limitation of infographics is that they remove Google's primary data point for determining long and mid-tail rankings: textual content. The infographic both condenses content into short factual statements and, generally speaking, only includes them in the image itself, making it largely inaccessible to Googlebot.

    Even using a textual transcript of the graphic is often insufficient because of the condensed nature of the text included (the purpose of the infographic is to convey information via graphics, not words), but Google needs that language to determine rankings for related terms.

    This is not to say that infographics shouldn't be a part of your overall strategy - they certainly can be hugely helpful. However, they are more likely to succeed as short-term link building strategies than as your primary long-term content.