Google API Leak – What does it mean for Search Engine Optimisation?

What Google API Leak?

Earlier this month, documentation from Google’s Search Content Warehouse API was published on GitHub by an automated bot. This documentation included over 2,500 pages detailing:

  • Search Ranking Factors: Detailed information on over 14,000 metrics that Google uses to rank search results
  • Quality Rating Data: Information about how Google’s quality raters assess the relevance and quality of search results
  • Clickstream Data: Data from Chrome browsers that helps Google understand user behaviour
  • Algorithm Adjustments: How search results are tweaked based on user navigation and click data

It’s safe to say that it has caused a bit of an upset in the search engine community. Although everyone claims to understand how to do well in search engines, the way that Google’s search algorithm actually works is a closely guarded secret, and this leak provides never-before-seen access to the algorithm’s inner workings.

As a result, since it was discovered, a whole range of Search Engine Optimisation experts have been poring over the documents and trying to work out what they can tell us.

Was it really an accident?

Yep, it appears so. The data was exposed quite a while back (since 2023, it’s just taken until this month for someone to find it!), and since it’s been found it’s been corroborated by a few pretty knowledgeable sources – this information wasn’t meant to be out in the wild [Update: 30th May – Google has now verified the documents are real].

Does the Google API leak actually tell us anything?

Again, yep, it appears so.

The leak tells us about all sorts of metrics that Google collects and uses to create its search results. Whilst it doesn’t tell us how these are weighted, or to what degree these metrics are ‘ranking factors’ in the algorithm, it does help us better understand what might be important.

It also suggests that Google hasn’t been ENTIRELY accurate in some of the statements it’s made in the past about its algorithm.

Whilst we’re not going to throw the baby out with the bathwater and solely focus our SEO on what the leak contains going forward, we are definitely going to adjust our strategy and use the insights to focus on some new areas – as well as to run some tests around some of the things we’ve learned to see whether there are any new opportunities.

What are the BIG lessons that might affect your SEO?

There’s a LOT of detail in the documents so it’s tricky to work out what’s important and what’s not. Here are 7 key takeaways that we’ve spotted:

1. Clicks & Engagement are key

The quantity and quality of clicks from organic rankings matter.

Recent information from the Google antitrust trial revealed that Google’s Navboost system is an important ranking signal. This uses Chrome click data and quality raters to work out what are ‘good’ web pages.

The leak, though, shares some of the metrics used to calculate this. For example, it measures data like the search result a user spent the longest time on, or the last time someone came to your site and hung around – and it tracks clicks over 13 months.

Creating demand for your website among targeted searchers is key. The best approach for that is focussing on high intent search queries, and making sure your content is useful and sticky.

BTW, this also suggests that there could be an SEO effect from paid advertising. Possibly not from the paid clicks themselves, as Google could discount those fairly easily, but the secondary clicks you get from paid marketing (when people come back to your website after finding you in ads) might give you a natural search boost – so this could be a great approach to drive long-term organic performance.

2. Know your niche

Despite Google suggesting otherwise in the past, there are a lot of metrics mentioned that reference ‘site wide’ scores, including “siteAuthority” and “siteFocusScore”.

It’s not clear how these are calculated, but given they exist, and taking them together with Google’s focus on quality content (EEAT as they put it – Experience, Expertise, Authoritativeness, and Trust), it’s likely that filling your site with engaging, useful content focused on your core subject matter is going to result in higher scores.

EEAT diagram showing intersecting circles labelled Expertise, Experience, Authoritativeness and Trust
Google’s focus is on content that reflects the concepts of EEAT

3. Be original 

Continuing the EEAT theme, it’s clear based on the documents that Google is looking for quality original content.

Pages that only include small amounts of content receive an “OriginalContentScore”, which reinforces the need for unique, authentic, quality content. There also appears to be an AI rating for Content Effort – though exactly how this is being measured isn’t clear.

This does mean, though, that it’s worth focussing on pages with shorter content to make sure they’re original. It also suggests that relying heavily on AI tools like ChatGPT to generate content is likely to cause issues for you down the line. Instead, take the human approach, focus on adding value for your readers, and try to differentiate your content from your competitors.

4. Fresh is best

Google measures content recency and freshness, with metrics to track both the publication and update dates. It clearly wants to prioritise content that is curated and kept up to date.

With that in mind, it’s important to review all the content on your site to keep it fresh and relevant. For instance, if you’re in accountancy, make sure your site is updated to reflect the latest tax advice.

If a page isn’t relevant any more, it’s better to take it down – even if it does still get the occasional visitor!
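As a starting point for that kind of review, here is a minimal Python sketch that flags pages which haven’t been updated in a while, based on the lastmod dates in a standard XML sitemap. The function name, the example URLs and the 365-day cut-off are our own assumptions for illustration (nothing in the leak gives a freshness threshold), and it assumes your sitemap records full dates in YYYY-MM-DD form.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

# Standard namespace from the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_pages(sitemap_xml, today, max_age_days=365):
    """Return (url, last_modified) pairs older than max_age_days, oldest first.

    max_age_days is an arbitrary, hypothetical cut-off; pick one that
    suits how quickly content in your own sector goes out of date.
    """
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for url in ET.fromstring(sitemap_xml).findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            continue  # no date recorded, so these pages need checking by hand
        # Keep only the date part; assumes a full YYYY-MM-DD prefix.
        modified = date.fromisoformat(lastmod[:10])
        if modified < cutoff:
            stale.append((loc, modified))
    return sorted(stale, key=lambda pair: pair[1])


# Made-up example sitemap for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/tax-guide</loc><lastmod>2022-04-06</lastmod></url>
  <url><loc>https://example.com/blog/new</loc><lastmod>2024-05-20T09:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

for page_url, modified in stale_pages(SAMPLE, today=date(2024, 6, 1)):
    print(page_url, modified)
```

A list like this only tells you where to look first. Whether a page needs refreshing, merging or removing is still a judgement call for a human.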

5. Google likes to mix it up

The API documents suggest that Google takes steps to make sure there’s a range of different content sources in the results – limiting the number of videos, small site blogs etc. to give users a range of different sources in response to their query.

To get broader coverage in search results, it’s then a good idea to create a diverse range of content types on your site to improve overall visibility.

This is particularly important for sites trying to enter really competitive areas. For example, if you’re trying to gain traffic in markets where there are already lots of ecommerce sites at the top of the results, maybe consider video content as a way to more effectively compete.

6. Spammy links will hurt you

Links from established sites, using proper anchor text, are great, but a load of links from dodgy websites with over-optimised anchor text seems to trigger a spam penalty.

Skip the link building services and instead use an organic approach – or focus on quality PR and build relationships with high quality websites that are relevant to your audience.
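If you have an export of your backlinks, a rough way to spot over-optimised anchor text is to look at how much of your link profile uses the same phrase. The sketch below is illustrative only: the function names, the sample anchors and the 30% threshold are our own assumptions, and the leak doesn’t tell us where Google draws the line.

```python
from collections import Counter


def anchor_text_shares(anchors):
    """Return (anchor, share) pairs, most common first.

    Anchors are lower-cased and trimmed so that trivial variations
    are counted together; empty anchors are ignored.
    """
    normalised = [a.strip().lower() for a in anchors if a.strip()]
    total = len(normalised)
    counts = Counter(normalised)
    return [(text, n / total) for text, n in counts.most_common()]


def over_used_anchors(anchors, max_share=0.3):
    """Return anchors making up more than max_share of all links.

    max_share is a hypothetical threshold chosen for illustration,
    not a figure taken from the leaked documents.
    """
    return [text for text, share in anchor_text_shares(anchors) if share > max_share]


# Made-up backlink anchors for illustration only.
sample_anchors = (
    ["Cheap Accountants London"] * 6
    + ["Acme Accounting"] * 2
    + ["acme.co.uk", "this guide"]
)

print(over_used_anchors(sample_anchors))
```

A natural link profile tends to be dominated by brand names, URLs and ordinary phrases, so one keyword-heavy anchor making up most of your links is worth a closer look.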

7. The experts get things wrong

There are a couple of things that we all thought were important that don’t appear to be. In particular, it appears that page titles and descriptions don’t need to stick rigidly to the usual character limits – especially if going over them improves readability.

Also, internal linking doesn’t seem to have the benefit most experts thought – so just link to other pages when you want to signpost them to users, rather than worrying about search engines.

Keeping Ahead of Google’s Algorithm

This leak isn’t a silver bullet! We don’t have the whole algorithm, we don’t know to what extent each metric is used as a ranking factor, and we don’t know how up to date this is (although it’s certainly less than 12 months old, based on timestamps).

Whilst there’s clearly some great information in there that helps inform how we can get better at search engine optimisation, it does broadly align with the core message that Google’s been sending out for years now – we should be focussed on creating high quality, useful content that is interesting to our users. This means that, no matter what else, we’ll be well placed to react to new shifts in Google’s algorithm.

In the meantime, we should be testing out the key ideas the leak has suggested, to see the impact they have on search results – and what can give us the edge. We’ll share what we find along the way!

Need help?

Want to know more about how your site can get more traffic from search engines? Get in touch and we can talk you through how you can improve.
