In a Google Office-hours Hangout in 2023, John Mueller addressed a common question: how to handle thin or duplicate content on e-commerce websites using Rel Canonical and the Noindex Directive. His answer points to a more nuanced approach than simply picking one method.
According to Mueller, whether Rel Canonical or Noindex works better depends on how you want Google to treat the content. Rather than calling one approach superior, he emphasized that each has its strengths and can be used strategically. He also offered an option that is rarely discussed in SEO circles: using both techniques at the same time can be a viable way to manage duplicate or thin content.
Before going deeper, let’s look at both Rel Canonical and the Noindex Directive for readers who may be unfamiliar with them.
What is Rel Canonical?
As per Google’s official documentation:
“A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site. For example, if you have URLs for the same page (example.com?dress=1234 and example.com/dresses/1234), Google chooses one as canonical. The pages don’t need to be absolutely identical; minor changes in sorting or filtering of list pages don’t make the page unique (for example, sorting by price or filtering by item color).
The canonical URL can be in a different domain than a duplicate URL.”
To break it down: Rel Canonical is not a directive but a hint that tells Google which URL you would prefer to see in search results. Because it is not a strict command, Google might still choose to display other URLs. Even so, Rel Canonical is highly valuable, especially for e-commerce websites where several pages are dedicated to the same product and differ only slightly in content or present nearly identical content.
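As a rough illustration, the hint is usually expressed as a link element in the page’s head. The sketch below builds that element for a product variant URL. It assumes, purely for the example, that variants differ only in their query string; the function names and the example.com URLs are hypothetical, and a real store would need its own mapping from variant to preferred URL.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(variant_url: str) -> str:
    """Drop the query string and fragment.

    Assumption for this sketch: they carry only variant options
    such as color or sort order, not distinct content.
    """
    parts = urlsplit(variant_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(variant_url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(variant_url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/dresses/1234?color=red&sort=price"))
```

Every variant URL that shares a path then points to the same preferred URL, which is the consolidation Rel Canonical is meant to suggest.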
What is Noindex Directive?
As per Google’s official documentation:
“You can prevent a page or other resource from appearing in Google Search by including a noindex meta tag or header in the HTTP response. When Googlebot next crawls that page and sees the tag or header, Googlebot will drop that page entirely from Google Search results, regardless of whether other sites link to it.”
In simple terms, and as the name suggests, Noindex is a directive: once Google crawls the page and sees it, Google must not index that page and drops it from the search results.
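The documentation quoted above mentions two places where noindex can appear: a meta tag in the HTML or a header in the HTTP response. The following sketch checks a page for either one. The helper names are hypothetical, and the header check is a simple substring match, so treat it as an audit aid under those assumptions rather than a reproduction of how Googlebot parses these signals.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record whether a robots (or googlebot) meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def is_noindexed(html: str, headers: dict[str, str]) -> bool:
    """Return True if the response header or the HTML carries noindex."""
    for name, value in headers.items():
        # Simplification: a substring match on the X-Robots-Tag header value.
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(is_noindexed(page, {}))
```

Running a check like this over the list of URLs you do not want indexed is one way to confirm the directive is actually being served.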
Rel Canonical or Noindex Directive, which one to choose?
The viewer who asked the question wanted to know which of the two is the better choice for an e-commerce website.
“We have a website… an ecommerce store with a lot of product variations that have thin content or duplicate content even sometimes. So …I made a list of all the URLs we want to keep or we want to have indexed… and then I made a list of all the URLs that we don’t want to have indexed.
The more I worked on it the more I asked this question to myself, canonicalization or noindexing? I don’t know what the better of those would be.”
To which John said:
“…I think the general question of should I use noindex or rel canonical for another page is something where there probably isn’t an absolute answer. So that’s kind of just offhand. It’s like if you’re struggling with that you’re not the only person who’s like, oh which one should I use?
That also usually means that both of these options can be okay. So usually what I would look at there is what your really strong preference there is. And if the strong preference is you really don’t want this content to be shown at all in search, then I would use noindex.
If your preference is, I really want everything combined in one page and if individual ones show up, like whatever, but most of them should be combined, then I would use a rel canonical. And ultimately the effect is similar in that, well, it’s likely the page that you’re looking at won’t be shown in search.
But with a noindex it’s definitely not shown. And with a rel canonical it’s more likely not shown.”
How about combining Rel Canonical and Noindex?
Although it is rarely discussed as an option, John said you can use both at the same time to deal with duplicate or thin content. He added:
“…you can also do both of them. And it’s something… if external links, for example, are pointing at this page then having both of them there kind of helps us to figure out well, you don’t want this page indexed but you also specified another one.
So maybe some of the signals we can just forward along.”
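The three options Mueller describes (Noindex only, Rel Canonical only, or both together) can be sketched as a small helper that returns the head elements for a given page. The function name, the strategy labels, and the example.com URL are assumptions made for illustration; nothing here comes from Google’s tooling.

```python
from html import escape


def head_tags(strategy: str, canonical_href: str = "") -> list[str]:
    """Return the <head> elements for 'noindex', 'canonical', or 'both'."""
    if strategy not in ("noindex", "canonical", "both"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if strategy != "noindex" and not canonical_href:
        raise ValueError("a canonical URL is required for this strategy")

    tags = []
    if strategy in ("noindex", "both"):
        # Strong preference: keep this page out of search entirely.
        tags.append('<meta name="robots" content="noindex">')
    if strategy in ("canonical", "both"):
        # Preference: consolidate this page into the named URL.
        href = escape(canonical_href, quote=True)
        tags.append(f'<link rel="canonical" href="{href}">')
    return tags


if __name__ == "__main__":
    for tag in head_tags("both", "https://example.com/dresses/1234"):
        print(tag)
```

With "both", the page carries the noindex directive and still names a preferred URL, which matches the case Mueller mentions where external links point at the page.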
Ultimately, the decision comes down to the publisher’s preference: whether the goal is to exclude a page from search results entirely or to consolidate similar pages into one result. For the full context, watch John Mueller’s response to the question, which starts at the 16:49 mark of the hangout. The rest of the discussion may offer further useful insights.
Source: Search Engine Journal