In a recently concluded Google Office-hours Hangout, John Mueller was asked whether Rel Canonical or Noindex is the better way to deal with thin or duplicate content, specifically on an ecommerce site. He shed light on both and suggested that there is no absolute answer as to which one to choose. Each can be used to your advantage, depending on how you want Google to handle the content. He also pointed out that the two can be used together to deal with duplicate or thin content, an option that is rarely discussed in SEO circles. Before we go any further, let’s put both Rel Canonical and the Noindex Directive under the microscope to get a clearer picture for the uninitiated.
What is Rel Canonical?
As per Google’s official documentation:
“A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site. For example, if you have URLs for the same page (example.com?dress=1234 and example.com/dresses/1234), Google chooses one as canonical. The pages don’t need to be absolutely identical; minor changes in sorting or filtering of list pages don’t make the page unique (for example, sorting by price or filtering by item color).
The canonical URL can be in a different domain than a duplicate URL.”
In simple terms, Rel Canonical is not a directive but a hint that tells Google which URL you would prefer to see in the search results. Because it is only a hint, Google may still choose to show a different URL from the set. It is nevertheless extremely useful if you run an ecommerce site with multiple pages for the same product that differ only slightly in content or are near duplicates.
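To make the hint concrete, here is a minimal sketch of how a canonical tag could be generated for product variant pages. The helper name `canonical_link_tag` and the variant URLs are assumptions for illustration only (the URLs are modelled on the example in Google’s documentation quoted above); the tag itself goes in the `<head>` of each duplicate page.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element for a duplicate page's <head>."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Hypothetical variant URLs that all show the same dress.
variants = [
    "https://example.com/?dress=1234",
    "https://example.com/dresses/1234?sort=price",
]
preferred = "https://example.com/dresses/1234"

for url in variants:
    # Every variant points at the same preferred URL.
    print(url, "->", canonical_link_tag(preferred))
```

Each variant carries the same tag, which is how the preference for one representative URL is expressed. Google can still pick a different URL, since this remains a hint.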
What is the Noindex Directive?
As per Google’s official documentation:
“You can prevent a page or other resource from appearing in Google Search by including a noindex meta tag or header in the HTTP response. When Googlebot next crawls that page and sees the tag or header, Googlebot will drop that page entirely from Google Search results, regardless of whether other sites link to it.”
In simple terms, and as the name suggests, Noindex is a directive: Google must not index that particular page, and it will drop the page from the search results the next time Googlebot crawls it.
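The documentation mentions two ways of sending the directive: a meta tag in the page, or a header in the HTTP response. A minimal sketch of both follows. The constant and function names are assumptions for illustration, and how the header is actually attached depends on your server or framework.

```python
# The robots meta tag form, placed inside the page's <head>.
NOINDEX_META = '<meta name="robots" content="noindex">'


def with_noindex_header(headers: dict) -> dict:
    """Return a copy of the response headers with X-Robots-Tag: noindex added.

    Useful for resources with no <head>, such as PDFs or images.
    """
    updated = dict(headers)  # copy, so the caller's dict is not modified
    updated["X-Robots-Tag"] = "noindex"
    return updated


# Hypothetical response headers for a PDF spec sheet.
response_headers = with_noindex_header({"Content-Type": "application/pdf"})
print(NOINDEX_META)
print(response_headers)
```

Either form only works if Googlebot can crawl the page and see it, which is why the documentation says the page is dropped when Googlebot next crawls it.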
Rel Canonical or Noindex Directive, which one to choose?
The viewer who asked the question wanted to know which of the two is the better approach for an ecommerce website:
“We have a website… an ecommerce store with a lot of product variations that have thin content or duplicate content even sometimes. So …I made a list of all the URLs we want to keep or we want to have indexed… and then I made a list of all the URLs that we don’t want to have indexed.
The more I worked on it the more I asked this question to myself, canonicalization or noindexing? I don’t know what the better of those would be.”
To which John said:
“…I think the general question of should I use noindex or rel canonical for another page is something where there probably isn’t an absolute answer. So that’s kind of just offhand. It’s like if you’re struggling with that you’re not the only person who’s like, oh which one should I use?
That also usually means that both of these options can be okay. So usually what I would look at there is what your really strong preference there is. And if the strong preference is you really don’t want this content to be shown at all in search, then I would use noindex.
If your preference is, I really want everything combined in one page and if individual ones show up, like whatever, but most of them should be combined, then I would use a rel canonical. And ultimately the effect is similar in that, well, it’s likely the page that you’re looking at won’t be shown in search.
But with a noindex it’s definitely not shown. And with a rel canonical it’s more likely not shown.”
How about combining Rel Canonical and Noindex?
Although it is rarely discussed as a possible solution, John said both can be used at the same time to counter the effects of duplicate or thin content. John added:
“…you can also do both of them. And it’s something… if external links, for example, are pointing at this page then having both of them there kind of helps us to figure out well, you don’t want this page indexed but you also specified another one.
So maybe some of the signals we can just forward along.”
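The three options John describes can be sketched as a small helper that builds the `<head>` tags for a thin or duplicate page. This is only an illustration under stated assumptions: the function `head_tags` and the strategy labels "noindex", "canonical" and "both" are hypothetical names, not anything Google defines.

```python
from html import escape
from typing import List, Optional


def head_tags(strategy: str, canonical_url: Optional[str] = None) -> List[str]:
    """Build <head> tags for a thin or duplicate page.

    strategy is a hypothetical label:
      "noindex"   - the page should definitely not appear in search
      "canonical" - prefer consolidation into canonical_url (a hint)
      "both"      - keep the page out of search and name a preferred URL
    """
    if strategy not in {"noindex", "canonical", "both"}:
        raise ValueError(f"unknown strategy: {strategy!r}")

    tags = []
    if strategy in {"canonical", "both"}:
        if not canonical_url:
            raise ValueError("canonical_url is required for this strategy")
        href = escape(canonical_url, quote=True)
        tags.append(f'<link rel="canonical" href="{href}">')
    if strategy in {"noindex", "both"}:
        tags.append('<meta name="robots" content="noindex">')
    return tags


# A variant page with external links pointing at it: use both signals.
print("\n".join(head_tags("both", "https://example.com/dresses/1234")))
```

With "both", the page carries the noindex directive as well as a pointer to the preferred URL, which matches the case John mentions where external links point at the page and some signals may be forwarded along.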
It all comes down to what the publisher wants: whether it is important that the page is not shown in the search results at all, or whether consolidation is the order of the day. Last but not least, do watch John Mueller answer the question at the 16:49 minute mark, and maybe stick around for the rest of the discussion for other valuable insights.
Source: Search Engine Journal