Google Leak: Information Contained in the Documents

A leak of 2,500 pages of internal Google documents reveals details about the search algorithm, including the use of Chrome data for ranking and the functioning of NavBoost.

The revealed documents are technical in nature and primarily provide information on the data Google collects about web pages and users. Here are some key points:

Google Uses Chrome Data:

Google has consistently claimed not to use clickstream data from Chrome for rankings, but the documents suggest otherwise. According to Rand Fishkin, Google likely uses the number of clicks on pages in Chrome browsers to determine the most popular/important URLs on a site, which influences the URLs included in the sitelinks feature. Fishkin notes that analyzing clickstreams was a primary motive for creating Google Chrome in 2008.
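The behavior Fishkin describes — ranking a site's URLs by Chrome click counts and surfacing the top ones as sitelinks — can be sketched as follows. This is purely illustrative: the function name, the input format, and the cutoff of four sitelinks are all assumptions, since the real selection logic is not public.

```python
from collections import Counter

def pick_sitelinks(chrome_clicks: dict[str, int], n: int = 4) -> list[str]:
    """Return the site's n most-clicked URLs as sitelink candidates.

    chrome_clicks maps each URL on the site to its click count as
    observed in Chrome. A toy illustration, not Google's actual logic.
    """
    return [url for url, _ in Counter(chrome_clicks).most_common(n)]
```

For example, `pick_sitelinks({"/pricing": 120, "/blog": 40, "/about": 5}, 2)` would return the two most-clicked paths, `["/pricing", "/blog"]`.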

NavBoost Uses Click Data:

NavBoost's existence was revealed in October 2023 by Pandu Nayak, Google's VP of Search, during his testimony in the U.S. Department of Justice's antitrust trial against Google. The documents provide more details on its operation, indicating that NavBoost counts clicks, analyzes bounce rates on pages, and evaluates click reliability. Google had previously denied using click-centered user signals in ranking.
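The three signals attributed to NavBoost above — click counts, bounce behavior, and click reliability — could combine roughly as in the sketch below. Every name, weight, and formula here is invented for illustration; the leak describes the inputs, not how they are scored.

```python
from dataclasses import dataclass

@dataclass
class ClickStats:
    """Hypothetical per-URL click signals of the kind NavBoost is said to track."""
    clicks: int           # total clicks on the result
    quick_returns: int    # clicks followed by a fast return to the results page
    verified_clicks: int  # clicks judged reliable (e.g. from verifiable devices)

def navboost_score(s: ClickStats) -> float:
    """Toy scoring: reward reliable clicks, penalize quick bounces.

    The formula is an assumption made for this example only.
    """
    if s.clicks == 0:
        return 0.0
    reliability = s.verified_clicks / s.clicks
    bounce_rate = s.quick_returns / s.clicks
    return s.clicks * reliability * (1.0 - bounce_rate)
```

A page with 100 clicks, 80 of them verified and 20 followed by a quick bounce, would score 100 × 0.8 × 0.8 = 64 under this toy formula.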

Filters Added for Sensitive Topics:

For certain sensitive queries, such as those related to COVID or elections, Google has implemented "whitelists" to prioritize sites deemed reliable, such as government authorities. Such whitelists can also extend to specific sectors, such as travel websites.

Google Identifies Content Authors:

The E-E-A-T criteria (experience, expertise, authoritativeness, and trustworthiness) may "not have as direct an impact as some SEOs think," since they are not mentioned in any of the leaked documents. However, the leak reveals that Google collects data on authors, including a field identifying whether an entity on the page is the author. Until now, Google had stated that author pages primarily aimed to improve visitor experience, without affecting rankings.

Link Indexes Classified into Three Levels:

Google categorizes its link indexes into three levels: low, medium, and high. Depending on the number of clicks and their sources, links will be considered or ignored in site ranking. Fishkin illustrates this with an example:

- If Forbes.com/Cats/ gets no clicks, it falls into the low-quality index and the link is ignored.
- If Forbes.com/Dogs/ gets high click volumes from verifiable devices, it falls into the high-quality index and the link transmits ranking signals.

Links deemed "trustworthy" can transmit PageRank, while low-quality links are ignored and do not negatively impact site rankings.
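The tiering logic in Fishkin's example can be sketched as a simple classifier. The thresholds below are invented: the leak only indicates that the tiers exist and that click volume and click source feed into them.

```python
def link_tier(clicks: int, verified_ratio: float) -> str:
    """Assign a linking page to a link-index quality tier.

    clicks: click volume observed on the linking page.
    verified_ratio: fraction of those clicks coming from verifiable devices.
    The cutoffs (1000 clicks, 0.5 ratio) are assumptions for illustration.
    """
    if clicks == 0:
        return "low"     # link is ignored entirely
    if clicks >= 1000 and verified_ratio >= 0.5:
        return "high"    # link can transmit ranking signals (PageRank)
    return "medium"
```

Under this sketch, the no-click Forbes.com/Cats/ page would land in the "low" tier and its outbound links would be ignored, while a heavily clicked Forbes.com/Dogs/ page with verified traffic would land in "high".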
