In a recent Google Search Central video, Gary Illyes explains how Google selects canonical pages during indexing. He describes what Google means by a canonical, gives a brief overview of webpage signals, highlights the importance of a page’s centerpiece, and hints at a different way of thinking about duplicates.
Deciphering Canonical Webpages: Perspectives and Principles
What counts as a canonical webpage depends on who is asking: the publisher, the SEO, or Google. Publishers typically treat the “original” webpage as canonical, while SEOs aim to select the “strongest” version for ranking.
However, Google’s approach to canonicalization differs from both of these notions, as Gary Illyes explains. According to Google’s official documentation, canonicalization involves “deduplication,” where one canonical version is chosen from a set of duplicates. Google outlines five common reasons for duplicate pages:
- Region variants: Such as content tailored for the USA and the UK, accessible via different URLs but essentially identical in language and content.
- Device variants: For instance, pages with both mobile and desktop versions.
- Protocol variants: Like the HTTP and HTTPS versions of a site.
- Site functions: These include sorting and filtering functions on category pages.
- Accidental variants: For example, a demo version of a site that is unintentionally left accessible to crawlers.
Canonicals can therefore be interpreted in three ways (the publisher’s, the SEO’s, and Google’s), and duplicate pages arise for several reasons. Gary adds another perspective to consider when understanding canonicals.
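To make the protocol and site-function variants more concrete, here is a minimal sketch of how a publisher might group URLs that differ only by protocol or by sorting and filtering parameters. It is an illustration for auditing your own site, not a description of how Google detects duplicates. The parameter names in PRESENTATION_PARAMS and the example.com URLs are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters that only change presentation (sorting, filtering).
PRESENTATION_PARAMS = {"sort", "order", "filter"}


def normalize(url: str) -> str:
    """Collapse protocol and site-function variants into one comparable key."""
    parts = urlsplit(url)
    # Keep only query parameters that are not presentation-only.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in PRESENTATION_PARAMS]
    # Trim a trailing slash so "/shoes/" and "/shoes" compare equal.
    path = parts.path.rstrip("/") or "/"
    # Treat http and https as the same page and drop any fragment.
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(sorted(query)), ""))


def group_variants(urls):
    """Group URLs that share the same normalized key."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize(url), []).append(url)
    return groups


urls = [
    "http://example.com/shoes/?sort=price",
    "https://example.com/shoes",
    "https://example.com/shoes?color=red",
]
groups = group_variants(urls)
```

Here the first two URLs end up in the same group, while the color variant stays separate because color is not in the assumed list of presentation-only parameters.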
Unveiling the Role of Signals in Choosing Canonical Webpages
Gary Illyes offers another definition of canonicals, this time from the indexing perspective, and describes the signals used to select canonical pages.
Gary explains:
“Google assesses whether a page duplicates another already indexed page and determines which version should be retained in the index, known as the canonical version.
In this context, the canonical version represents the page from a cluster of duplicate pages that best embodies the group based on the signals gathered about each version.”
Gary pauses to elaborate on duplicate clustering and then revisits the discussion on signals shortly after. He continues:
“Mostly, only canonical pages are displayed in search results. But how does Google ascertain which page is canonical?
Once Google acquires the content of a page, particularly the primary content or what I refer to as the ‘centerpiece of a page,’ it groups it with one or more pages containing similar content, if any. This is what we term duplicate clustering.”
Notably, Gary calls the main content the “centerpiece of a page,” which echoes the Centerpiece Annotation, a concept introduced by Google’s Martin Splitt. Gary does not elaborate on the Centerpiece Annotation itself, but his wording indicates that duplicate clustering is based on a page’s main content.
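The idea of duplicate clustering can be sketched in a few lines. The example below groups pages whose main content is identical after normalizing case and whitespace. This is a deliberate simplification and an assumption of the sketch: Gary speaks of “similar content,” and the video does not say how similarity is measured. The function names and sample pages are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(main_content: str) -> str:
    """Hash the main content after normalizing case and whitespace."""
    normalized = " ".join(main_content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages: dict) -> list:
    """Group URLs whose main content produces the same fingerprint."""
    clusters = defaultdict(list)
    for url, content in pages.items():
        clusters[fingerprint(content)].append(url)
    return list(clusters.values())


# Hypothetical pages: URL mapped to extracted main content.
pages = {
    "https://example.com/shoes": "Red running shoes, size 42.",
    "http://example.com/shoes": "red running   shoes, size 42.",
    "https://example.com/boots": "Brown leather boots.",
}
clusters = cluster_duplicates(pages)
```

Exact hashing only catches identical content. Matching near-duplicates would need a similarity measure instead, which is outside the scope of this sketch.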
In the video, Gary proceeds to discuss the nature of signals. He elaborates:
“Then, Google evaluates a range of signals it has already calculated for each page to determine a canonical version.
Signals are fragments of information that search engines collect about pages and websites, which are then utilized for further processing.
Some signals are straightforward, like site owner annotations in HTML such as rel="canonical", while others, such as the significance of an individual page on the internet, are more nuanced.”
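The rel="canonical" annotation is the one signal here that a publisher can inspect directly. As a minimal sketch, the following uses Python’s standard html.parser module to read the canonical URL a page declares. The class name, function name, and sample HTML are assumptions for illustration. The sketch only reads the annotation in the HTML and says nothing about how Google weighs it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


sample = '<html><head><link rel="canonical" href="https://example.com/shoes"></head></html>'
declared = find_canonical(sample)
```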
The Singular Canonical: Unveiling Duplicate Clusters’ Core Concept
Gary then explains that one page is chosen to represent each cluster of duplicate pages in the search results. Every cluster of duplicates therefore has exactly one canonical.
He elaborates:
“Within each duplicate cluster, a solitary version of the content is chosen as the canonical representation.
This chosen version will stand as the representative content in Search results for all other versions within the cluster.
The remaining versions within the cluster are relegated to alternate versions, potentially served in diverse contexts, such as when a user searches for a specific page within the cluster.”
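The “one canonical per cluster” rule can be illustrated with a toy scoring model. Everything specific in this sketch is an assumption: the three signals, their weights, and the sample pages are invented for illustration, and Google does not publish how it combines signals. The only part taken from the video is the outcome, where one version becomes the canonical and the rest become alternates.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    url: str
    declared_canonical: bool  # other versions point rel="canonical" at this URL
    is_https: bool
    importance: float  # hypothetical 0..1 estimate of the page's importance


# Illustrative weights only; not taken from any Google documentation.
WEIGHTS = {"declared_canonical": 2.0, "is_https": 1.0, "importance": 3.0}


def score(page: PageSignals) -> float:
    """Combine the hypothetical signals into a single number."""
    return (
        WEIGHTS["declared_canonical"] * page.declared_canonical
        + WEIGHTS["is_https"] * page.is_https
        + WEIGHTS["importance"] * page.importance
    )


def split_cluster(cluster):
    """Return (canonical, alternates): exactly one canonical per cluster."""
    canonical = max(cluster, key=score)
    alternates = [page for page in cluster if page is not canonical]
    return canonical, alternates


cluster = [
    PageSignals("http://example.com/shoes", False, False, 0.4),
    PageSignals("https://example.com/shoes", True, True, 0.4),
    PageSignals("https://example.com/shoes?sort=price", False, True, 0.1),
]
canonical, alternates = split_cluster(cluster)
```

The alternates are kept rather than discarded, mirroring Gary’s point that the other versions in a cluster may still be served in some contexts.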
Navigating Alternate Versions of Webpages: A Key Consideration for SEO
This last point shows why alternate versions of webpages deserve attention, especially when optimizing for keyword variations, which is particularly relevant in e-commerce.
Content management systems (CMS) often generate duplicate webpages to accommodate product variations, such as size or color, which can alter the product description. When a variant page closely matches a search query, Google may choose to rank that variant in search results.
This matters because it can be tempting to add noindex directives to variant webpages to keep them out of search results, usually out of concern about keyword cannibalization. However, applying noindex to variant pages can have unintended consequences: a variant page is sometimes better suited than the canonical page to rank for specific queries that mention a particular color, size, or version number.
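One practical way to act on this is to check which variant pages currently carry a noindex directive. The sketch below is a minimal audit that looks only at the robots meta tag in each page’s HTML. It does not check the X-Robots-Tag HTTP header or crawler-specific meta tags, and the function names and sample pages are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Detect a noindex directive in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in (attr.get("content") or "").split(",")]
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


def flag_noindexed_variants(variant_pages: dict) -> list:
    """Return the URLs of variant pages whose HTML declares noindex."""
    return [url for url, html in variant_pages.items() if has_noindex(html)]


# Hypothetical variant pages: URL mapped to raw HTML.
variant_pages = {
    "https://example.com/shoes?color=red": '<meta name="robots" content="noindex, follow">',
    "https://example.com/shoes?color=blue": '<meta name="robots" content="index, follow">',
}
flagged = flag_noindexed_variants(variant_pages)
```

A flagged URL is not necessarily a mistake. The list is simply a starting point for deciding whether a variant should be allowed to rank for its own queries.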
Critical Insights on Canonicals and More: Essential Points to Keep in Mind
Gary’s discussion of canonicals covers several topics, including the role of the main content. Here are seven key takeaways:
- The main content is referred to as the Centerpiece.
- Google calculates a “handful of signals” for each discovered page.
- Signals are data used for “further processing” after a webpage has been discovered.
- Some signals, such as hints and directives, are within the publisher’s control. An example of a hint is the rel=canonical link attribute.
- Other signals lie beyond the publisher’s control, such as the page’s importance in the broader context of the internet.
- Certain duplicate pages can function as alternate versions.
- Alternate versions of webpages can still rank, which makes them valuable to both Google and the publisher.