
Google Advises Websites to Block Action URLs Using Robots.txt


Gary Illyes of Google advises using robots.txt to block crawlers from “add to cart” URLs, preventing unnecessary consumption of server resources. This longstanding best practice remains crucial for minimizing the load caused by irrelevant crawler activity on action URLs.

 

Google’s Advice on Using Robots.txt for Action URLs

 

In a LinkedIn post, Gary Illyes, an Analyst at Google, reiterated essential guidance for website owners: Utilize the robots.txt file to block web crawlers from accessing URLs that initiate actions such as adding items to carts or wishlists.

Illyes emphasized the frequent issue of unnecessary crawler traffic burdening servers, often caused by search engine bots crawling URLs designed for user interactions. He stated:

“From our analysis of sites reporting issues, a significant portion of the traffic comes from action URLs like ‘add to cart’ and ‘add to wishlist.’ These URLs serve no purpose for crawlers and are typically unwanted.”

To mitigate this strain on server resources, Illyes recommended adding rules to the robots.txt file that disallow access to URLs containing such action parameters. For instance, he provided the following advice:

 

“If your site includes URLs like: 

https://example.com/product/scented-candle-v1?add_to_cart and

https://example.com/product/scented-candle-v1?add_to_wishlist

You should consider adding a disallow rule for them in your robots.txt file.”
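Following that example, the disallow rules could be sketched as below. This is an illustrative robots.txt fragment, not Illyes’ exact wording; the `*` wildcard is an extension to the original 1994 standard, but one that Google and most major crawlers honor:

```
User-agent: *
Disallow: /*?add_to_cart
Disallow: /*?add_to_wishlist
```

The leading `/*` matches any path, so any URL carrying those query parameters is excluded regardless of which product page it belongs to.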

 

While Illyes acknowledged that using the HTTP POST method can also prevent the crawling of such URLs, he cautioned that crawlers can still initiate POST requests. Therefore, he reaffirmed the effectiveness of robots.txt in managing and preventing unnecessary crawler access to action-oriented URLs.

 

Reinforcing Decades-Old Best Practices: Google’s Guidance on Robots.txt for Action URLs

 

In the discussion, Alan Perkins highlighted the enduring relevance of this advice, drawing parallels to web standards established in the 1990s.

The robots.txt standard, which defines rules that well-behaved crawlers agree to respect, emerged as a consensus solution among web stakeholders in 1994.

 

Obedience & Exceptions: Google’s Commitment to Robots.txt Rules

 

Gary Illyes affirmed Google’s strict adherence to robots.txt rules, highlighting rare documented exceptions for specific cases such as “user-triggered or contractual fetches.”

This commitment underscores Google’s longstanding respect for the robots.txt protocol as a cornerstone of its web crawling policies.

 

Why We Care: The Resurgence of a Time-Tested Best Practice

 

The renewed attention to this simple, decades-old best practice underscores that it still matters today.

Implementing the robots.txt standard allows websites to manage excessive crawler activity that consumes bandwidth without delivering value.

 

How This Can Help You: Leveraging Robots.txt for Better Website Management

 

Whether you operate a small blog or a large e-commerce platform, following Google’s recommendation to use robots.txt to block crawler access to action URLs offers several benefits:

 

  1. Reduced Server Load: Preventing crawlers from accessing URLs that trigger actions like adding items to carts or wishlists helps decrease unnecessary server requests and conserves bandwidth.
  2. Improved Crawler Efficiency: By clearly defining in your robots.txt file which URLs crawlers should avoid, you help crawlers focus on the pages and content that matter most for indexing and ranking.
  3. Enhanced User Experience: By directing server resources towards actual user interactions rather than wasted crawler hits, you can improve load times and ensure smoother functionality for your website visitors.
  4. Adherence to Standards: Implementing these guidelines aligns your site with established robots.txt protocol standards, which have been industry best practices for decades.

 

Revisiting and updating robots.txt directives represents a straightforward yet impactful strategy for websites aiming to exert greater control over crawler activity. Illyes’ message reinforces the ongoing relevance of these foundational rules in today’s web environment.
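When revisiting your directives, you can sanity-check them before deploying. One way is Python’s standard-library `urllib.robotparser`; note that it uses simple prefix matching and does not support the `*` wildcard extension that Google honors, so this sketch (with hypothetical rules and the example URLs from Illyes’ post) uses literal URL prefixes:

```python
from urllib import robotparser

# Hypothetical robots.txt rules blocking "add to cart" / "add to wishlist"
# action URLs, supplied inline as a string for quick local testing.
rules = """\
User-agent: *
Disallow: /product/scented-candle-v1?add_to_cart
Disallow: /product/scented-candle-v1?add_to_wishlist
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The action URL is blocked; the plain product page stays crawlable.
print(rp.can_fetch("Googlebot", "https://example.com/product/scented-candle-v1?add_to_cart"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/product/scented-candle-v1"))              # True
```

For wildcard rules like `Disallow: /*?add_to_cart`, test against a parser that implements Google’s matching, such as the robots.txt report in Google Search Console.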

 

If you need help, check out our monthly SEO packages and let the experts help you.

Shilpi Mathur
navyya.shilpi@gmail.com