Google API Leak – What does it mean for Search Engine Optimisation?

What Google API Leak?

Earlier this month, documentation from Google’s Search Content Warehouse API was published on GitHub by an automated bot. This documentation included over 2,500 pages detailing:

  • Search Ranking Factors: Detailed information on over 14,000 metrics that Google uses to rank search results
  • Quality Rating Data: Information about how Google’s quality raters assess the relevance and quality of search results
  • Clickstream Data: Data from Chrome browsers that helps Google understand user behaviour
  • Algorithm Adjustments: How search results are tweaked based on user navigation and click data

It’s safe to say that it has caused a bit of an upset in the search engine community. Although everyone claims to understand how to do well in search engines, the way that Google’s search algorithm actually works is a closely guarded secret, and this leak provides never-seen-before access to the algorithm’s inner workings.

As a result, since it was discovered, a whole range of Search Engine Optimisation experts have been poring over the documents and trying to work out what they can tell us.

Was it really an accident?

Yep, it appears so. The data was exposed quite a while back (since 2023, it’s just taken until this month for someone to find it!), and since it’s been found it’s been corroborated by a few pretty knowledgeable sources – this information wasn’t meant to be out in the wild [Update: 30th May – Google has now verified the documents are real].

Does the Google API leak actually tell us anything?

Again, yep, it appears so.

The leak tells us about all sorts of metrics that Google collects and uses to create its search results. Whilst it doesn’t tell us how these are weighted, or to what degree these metrics are ‘ranking factors’ in the algorithm, it does help us better understand what might be important.

It also suggests that Google hasn’t been ENTIRELY accurate in some of the statements it’s made in the past about its algorithm.

Whilst we’re not going to throw the baby out with the bathwater and solely focus our SEO on what the leak contains going forward, we are definitely going to adjust our strategy and use the insights to focus on some new areas – as well as to run some tests around some of the things we’ve learned to see whether there are any new opportunities.

What are the BIG lessons that might affect your SEO?

There’s a LOT of detail in the documents so it’s tricky to work out what’s important and what’s not. Here are 7 key takeaways that we’ve spotted:

1. Clicks & Engagement are key

The quantity and quality of clicks from organic rankings matter.

Recent information from the Google antitrust trial revealed that Google’s Navboost system is an important ranking signal. This uses Chrome click data and quality raters to work out what are ‘good’ web pages.

The leak, though, shares some of the metrics it uses to calculate this. For example, it measures data like the search result a user spent the longest time on, or the last time someone came to your site and hung around – and it tracks clicks over 13 months.

Creating demand for your website among targeted searchers is key. The best approach for that is focussing on high intent search queries, and making sure your content is useful and sticky.

BTW, this also suggests that there could be an SEO effect from paid advertising. Possibly not the paid clicks, as Google could discount those fairly easily, but the secondary clicks you get from paid marketing (when people come back to your website after finding you in ads) might give you a natural search boost – so this could be a great approach to drive long term organic performance.

2. Know your niche

Despite Google suggesting otherwise in the past, there are a lot of metrics mentioned that reference ‘site wide’ scores, including “siteAuthority” and “siteFocusScore”.

It’s not clear how these are calculated, but given they exist, and taking them together with Google’s focus on quality content (EEAT as they put it – Experience, Expertise, Authoritativeness, and Trustworthiness), it’s likely that filling your site with engaging, useful content focused on your core subject matter is going to result in higher scores.

EEAT diagram showing intersecting circles, with Expertise, Experience, Authoritativeness and Trust
Google’s focus is on content that reflects the concepts of EEAT

3. Be original 

Continuing the EEAT theme, it’s clear based on the documents that Google is looking for quality original content.

Pages that only include small amounts of content receive an “OriginalContentScore”, which reinforces the need for unique, authentic, quality content. There also appears to be an AI rating for Content Effort – though exactly how this is measured isn’t clear.

This does though mean that it’s worth focussing on pages with shorter content to make sure they’re original. It also suggests that relying heavily on AI tools like ChatGPT to generate content is likely to cause issues for you down the line. Instead, take the human approach, focus on adding value for your readers, and try to differentiate your content from your competitors.

4. Fresh is best

Google measures content recency and freshness, with metrics to track both the publication and update dates. It clearly wants to prioritise content that is curated and kept up to date.

With that in mind, it’s important to review all the content on your site to keep it fresh and relevant. For instance, if you’re in accountancy, make sure your site is updated to reflect the latest tax advice.

If a page isn’t relevant any more, it’s better to take it down – even if it does still get the occasional visitor!
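As a starting point for that review, here is a minimal Python sketch that lists pages that look stale. It assumes your site publishes a standard XML sitemap with `<lastmod>` dates in at least YYYY-MM-DD form. The 12-month cut-off, the `stale_pages` helper and the example URLs are our own illustrative choices, not anything taken from the leak.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_pages(sitemap_xml, today, max_age_days=365):
    """Return URLs whose <lastmod> is older than max_age_days, or missing."""
    cutoff = today - timedelta(days=max_age_days)
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        # lastmod may carry a time as well; the first 10 characters are the date
        if not lastmod or date.fromisoformat(lastmod[:10]) < cutoff:
            stale.append(loc)
    return stale


# Hypothetical sitemap used purely for illustration
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/tax-advice</loc><lastmod>2022-03-01</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>2024-05-20T10:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/old-news</loc></url>
</urlset>"""

print(stale_pages(SAMPLE, today=date(2024, 6, 1)))
# ['https://example.com/tax-advice', 'https://example.com/old-news']
```

The list it produces is only a prompt for a human review: a page with an old date may still be perfectly accurate, and a recent date doesn’t guarantee the content is current.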

5. Google likes to mix it up

The API documents suggest that Google takes steps to make sure there’s a range of different content sources in the results – limiting the number of videos, small site blogs etc. to give users a range of different sources in response to their query.

To get broader coverage in search results, it’s then a good idea to create a diverse range of content types on your site to improve overall visibility.

This is particularly important for sites trying to enter really competitive areas. For example, if you’re trying to gain traffic in markets where there are already lots of ecommerce sites at the top of the results, maybe consider video content as a way to more effectively compete.

6. Spammy links will hurt you

Links from established sites, using proper anchor text, are great, but a load of links from dodgy websites with over optimised anchor text seems to trigger a spam penalty.

Skip the link building services and instead use an organic approach – or focus on quality PR and build relationships with high quality websites that are relevant to your audience.

7. The experts get things wrong

There are a couple of things that we all thought were important that don’t appear to be. In particular, it appears that page titles and descriptions don’t need to stick rigidly to character limits – especially if going over improves readability.

Also, internal linking doesn’t seem to have the benefit most experts thought – so just link to other pages when you want to signpost them to users, rather than worrying about search engines.

Keeping Ahead of Google’s Algorithm

This leak isn’t a silver bullet! We don’t have the whole algorithm, we don’t know to what extent each metric is used as a ranking factor, and we don’t know how up to date this is (although it’s certainly less than 12 months old, based on timestamps).

Whilst there’s clearly some great information in there that helps inform how we can get better at search engine optimisation, it does broadly align with the core message that Google’s been sending out for years now – we should be focussed on creating high quality, useful content that is interesting to our users. This means that, no matter what else, we’ll be well placed to react to new shifts in Google’s algorithm.

In the meantime, we should be testing out the key ideas the leak has suggested, to see the impact they have on search results – and what can give us the edge. We’ll share what we find along the way!

Need help?

Want to know more about how your site can get more traffic from search engines? Get in touch and we can talk you through how you can improve.

Google To Start Using Page Speed for Mobile Rankings

Is your website on the slow side? From July 2018, load speed will start affecting a page’s ranking on Google’s mobile search results.

Only the slowest loading pages will be initially targeted, but Google says there is no way to determine whether a page is affected by this change.

Their webmasters’ blog did say: ‘We encourage developers to think broadly about how performance affects a user’s experience of their page and to consider a variety of user experience metrics’.

If you think your website could be affected, we can benchmark your site and, if there are problems, suggest different ways it can be sped up. Give us a call if you’d like our help.

To start off, you can get an idea of how fast your website performs using Google’s PageSpeed Insights tool.

Google confirms site quality matters

So on a Google Webmaster hangout that took place yesterday, John Mueller (a Webmaster Trends Analyst at Google) talked about site quality and architecture, specifically whether the Panda algorithm took these into account when evaluating pages.

When asked if Google takes site architecture into account and if improving the site categories would make a difference to search engine performance, John said:

“We see (Panda) as something that is more like a general quality evaluation of the website that takes into account everything around the site…That is something where, if we find issues across the site where we think this essentially affects the quality of the web site overall, then that is something that might be taken into account there”

You can watch the hangout here:

Whilst, as far as statements go, this is a little woolly, it does go some way to confirming that Google’s looking at a far broader set of factors than most companies typically focus on – and that investing time getting your site architecture and categorisation right upfront will pay dividends in the long term.

Google moves to mobile-first: What it means to you

Toward the end of 2014, Google caused chaos by announcing that, from 21st April 2015, it would introduce a change to its search algorithm that would penalise websites that weren’t mobile friendly when showing mobile search results.

The change was swiftly named ‘Mobilegeddon’ and led to many companies rushing to change their websites to meet Google’s new rules.

Well, a couple of weeks ago, Google let slip another announcement.

This time, they’ve said that Google will soon prioritise mobile websites as the primary source of information for their search index.

What does that mean?

Google works out who to show in search results by ‘spidering’ the Internet – following each link within a website and seeing where it leads and, as a result, building up a picture of the whole web.

Until now, Google has always done this by browsing the full (desktop) version of websites. What Google is now saying is that it will also browse using a smartphone – and treat those results completely separately – using them as its primary source of information for decision making for the Google search index.

This is significant because some mobile versions of websites don’t include some of the pages on the main site, or they hide some sections of the page to make them easier to read on a phone.

Why ANOTHER algorithm change?

There’s a really good reason for this change. The percentage of people browsing the Internet on their mobiles has exploded over the last few years – to the point where most people now are browsing the Internet on their phones.

% of people using mobiles to use the Internet

As it stands at the moment, it’s possible that users might see a snippet of content on a Google search results page, that might not be there when they actually click through to the mobile site.

By updating its index to look at and evaluate the quality of the mobile versions of websites, Google is basically looking to make sure that its search results reflect the needs of the majority of its users.

When is this going to happen?

Basically, this change has already started happening.

Google are testing the effects now and as they become more confident in things working as they want, they’ll start to roll it out more widely. They expect the whole process to take a few months.

What does this mean for me?

If you’ve got a responsive mobile friendly website where the markup is the same across the desktop and mobile versions (& if you’re a Curious client, this will be the case for you), you won’t need to change anything.

Google will likely see your site in exactly the same way as it does now.

However, if you have a separate mobile site that is different from your desktop site, then you should start to think about making changes to your website.

Very often, a separate mobile site will contain a subset of pages of the main site; it might hide certain bits of content – in particular sidebars that include additional links; or it could exclude some of the metadata – the technical information about the website that sits within the HTML.

An example of schema metadata within a website

In this case, you’re likely to find that over the next few months, the effectiveness of your website in attracting traffic from search engines starts to reduce – particularly for those searching on mobiles.

If you do have a separate mobile site, the key things to focus on are making sure all content is available when browsing with a smartphone and that any structured markup is present on both the desktop and mobile versions.
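To make that check a little more concrete, here is a rough Python sketch that compares the structured data found in the desktop and mobile HTML of the same page. It assumes your markup uses JSON-LD `<script>` blocks with a single string `@type` each (Microdata and RDFa aren’t covered). The `JsonLdCollector` class, the helper names and the two sample pages are hypothetical, made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def schema_types(html):
    """Return the set of schema @type values declared in a page's JSON-LD."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    return {
        block["@type"]
        for block in parser.blocks
        if isinstance(block, dict) and isinstance(block.get("@type"), str)
    }


def missing_on_mobile(desktop_html, mobile_html):
    """Schema types present on the desktop page but absent from the mobile page."""
    return schema_types(desktop_html) - schema_types(mobile_html)


# Hypothetical pages: the mobile version has dropped the breadcrumb markup
DESKTOP_HTML = (
    '<html><head>'
    '<script type="application/ld+json">{"@type": "Organization", "name": "Example Ltd"}</script>'
    '<script type="application/ld+json">{"@type": "BreadcrumbList", "itemListElement": []}</script>'
    '</head><body><h1>Home</h1></body></html>'
)
MOBILE_HTML = (
    '<html><head>'
    '<script type="application/ld+json">{"@type": "Organization", "name": "Example Ltd"}</script>'
    '</head><body><h1>Home</h1></body></html>'
)

print(sorted(missing_on_mobile(DESKTOP_HTML, MOBILE_HTML)))
# ['BreadcrumbList']
```

In practice you would fetch the two versions of each page yourself (for example with a desktop and a smartphone user agent) and feed the HTML in. The sketch only compares which schema types are present, not whether their contents match.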

If you find you need to make changes though, the most important thing is DON’T RUSH.

Google will continue to index desktop sites and it’s better to have a working desktop site than a broken or incomplete mobile version!

Is your website ready for a separate mobile index?

For advice on how to prepare your site for the change to Google’s index, or how we can help you better optimise your site for mobile, get in touch. Call us on 0330 010 9000, or just fill in this form.

Are you ready for Penguin 4?

UPDATE: Google announced that the Penguin update went live on Friday 23rd Sept, and that Penguin is now real time and has been incorporated into its main algorithm.

In October last year Gary Illyes, a trends analyst who works at Google, let slip that a new Penguin update would appear in 2015. It never arrived – but all the evidence suggests that it’s still on its way.

Photo of 4 Penguins

This is good news for Search Engine Optimisation (SEO) companies and marketeers that are optimising their websites for search engines the ‘right way’ (basically, following Google’s Webmaster Guidelines). For those that have been taking a more unethical approach to SEO though, the news might not be so welcome!

What Is Google Penguin 4?

Back in April 2012 Google made an update to its algorithm that it named ‘Penguin’. It was designed to identify websites that were spamming its search results by buying links (or getting links through link networks created to boost search engine rankings).

It had an immediate and significant effect. Sites that weren’t playing by the rules suddenly disappeared from search results, and many websites were notified of manual penalties that had been applied to their search engine rankings – either demoting them many pages down, or removing them entirely.

Interflora Search Performance
Interflora were one of the larger names hit by the first Penguin update: their search traffic fell to pretty much nothing overnight.

Since then, there have been a number of further ‘Penguin’ updates. Each time there are significant changes to the search results – generally, the sites that are approaching their marketing fairly see positive results and those trying to game the system are negatively impacted.

Penguin 4 is the name that’s been given to the next big update that will focus on combatting ‘spammy’ linking.

With Penguin 4 changes become real time

Each time a new Penguin update gets released, websites previously penalised that have worked to remove bad links (for instance through the Google disavow links tool) can regain rankings and, equally, sites that have not previously been caught might get trapped.

The downside of this is that it takes a fairly long time for the effect of changes to a website (whether positive or negative) to be reflected in search results. Sites that have been penalised (whether fairly or unfairly) have to wait a long time before they can recover, and some companies are still finding quick wins by using spammy techniques in the gaps between updates.

Penguin 4 is rumoured to address these problems by introducing a real-time element to the algorithm – effectively meaning that the Penguin portion of the algorithm will always be “on” and updating, processing information about new links in real time and making adjustments accordingly.

This should mean that Google will catch spam link profiles more quickly and allow companies that have identified and resolved issues to recover faster.

When’s Penguin 4 coming?

Well, we’re still waiting right now, but the signs are it’s imminent.

After the news broke last year that Penguin 4 was coming, the SEO community has been watching carefully for the effect of it hitting. After being questioned when Penguin 4 didn’t appear in 2015, Gary Illyes reported in January that we were “weeks away” from seeing the next iteration.

However as of today, there’s still no sign and Google have confirmed it’s not live yet. Gary’s said he’s now not giving out any more dates for fear of being wrong should it be missed again because it’s not quite ready.

Why the delay? Well, we’re not sure, but the general consensus is it looks like Google’s taking its time to get it right – which, given the wide ranging effect these updates have, is a good thing!

What does this mean for me?

If you run a website, you might be wondering what a real-time Penguin algorithm means for you, or what you should be doing to prepare for the update.

Firstly, if you’re not already doing it, rather than trying to game search engines, focus on creating quality content. Earn links and don’t buy them. Focus on providing the best user experience you can and, rather than fixating on your rankings, allow them to improve naturally.

Secondly, it’s worth making sure that there aren’t any problems with the links that you’ve already got pointed towards your website (and if you’ve ever purchased links on Fiverr or have engaged the services of someone who emailed offering cheap SEO services – this is a ‘must do’).

Checking and cleaning backlinks

Making sure that the links to your website from other sites across the Internet aren’t going to cause you a problem is fairly straightforward:

1. Create a comprehensive list of backlinks

There are lots of tools available on the web to help with this, but it’s worth using as many sources as possible to get a comprehensive list and then de-duping. Google Search Console is a great place to start, but Open Site Explorer is pretty good too.

2. Look for patterns

Once you’ve got your list of links, look for indicators that unnatural practices have been used. This could be the same anchor text used repeatedly, similar dates that backlinks were created, or the same IP address being used.

3. Remove or Disavow suspect links

If there are links from websites that are suspect, it’s time to remove them. It might sound painful, but it’s better to remove low quality links before anything happens, rather than after you’re penalised.

Start off by asking site owners to remove or ‘nofollow’ the links, sending messages via the sites’ contact forms or their registered owner (you can find this out through Whois).

After you’ve done this, you can use Google Search Console to disavow those that remain.
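The three steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: each backlink is a dictionary with `source_url` and `anchor` keys (the column names in your exports will differ), and the 30% threshold for “overused” anchor text is an arbitrary figure we’ve picked, not a Google rule. The output uses the plain-text format the disavow tool accepts, with `#` comment lines and one `domain:` entry per line.

```python
from collections import Counter
from urllib.parse import urlparse


def dedupe(links):
    """Step 1: merge exports from several tools, one row per (source URL, anchor)."""
    seen = set()
    unique = []
    for link in links:
        key = (link["source_url"].rstrip("/").lower(), link["anchor"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(link)
    return unique


def overused_anchors(links, threshold=0.3):
    """Step 2: anchor texts that make up more than `threshold` of all links."""
    total = len(links)
    if total == 0:
        return {}
    counts = Counter(link["anchor"].strip().lower() for link in links)
    return {anchor: n for anchor, n in counts.items() if n / total > threshold}


def disavow_file(suspect_links):
    """Step 3: build disavow file text, one 'domain:' line per linking site."""
    domains = sorted({urlparse(link["source_url"]).hostname for link in suspect_links})
    lines = ["# Links we asked site owners to remove, but which remain"]
    lines += [f"domain:{domain}" for domain in domains]
    return "\n".join(lines) + "\n"


# Hypothetical backlink rows combined from two different tools
links = [
    {"source_url": "https://blog.example.org/review", "anchor": "Acme Accountants"},
    {"source_url": "https://blog.example.org/review/", "anchor": "acme accountants"},
    {"source_url": "http://cheap-links.example/page1", "anchor": "best cheap accountant"},
    {"source_url": "http://cheap-links.example/page2", "anchor": "best cheap accountant"},
    {"source_url": "http://link-farm.example/a", "anchor": "best cheap accountant"},
]

unique = dedupe(links)
flagged = overused_anchors(unique)
suspects = [link for link in unique if link["anchor"].strip().lower() in flagged]

print(flagged)  # {'best cheap accountant': 3}
print(disavow_file(suspects))
```

Treat the flagged list as a prompt for a manual check rather than a verdict: plenty of genuine links reuse a brand name as anchor text, so review each domain before you disavow it.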


If you’re concerned about your site being impacted and want some expert help, get in touch. We’ve got tons of experience with helping companies that have run into trouble and can give you clear advice and assistance to get things fixed quickly.