Over the last few days, Google’s search results have been hit by a massive and seemingly unchecked spam attack. Several domains rank for hundreds of thousands of keywords each, which suggests the attack may span millions of keyword combinations.
This was recently brought to my attention by a series of posts from Bill Hartzer (LinkedIn profile), who exposed the link networks of several spam sites using a link graph produced by the Majestic backlinks tool.
In the link graph he shared, numerous websites link tightly to one another, a pattern that is typical of spammy link networks.
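One way to put a number on the tight interlinking described above is to measure how many of the possible site-to-site links inside a group actually exist. The sketch below is only an illustration: the function name `interlink_density` and the `.example` domains are hypothetical, it is not how Majestic builds its link graph, and no particular cutoff for "spammy" is implied by the source.

```python
from typing import Iterable, Tuple


def interlink_density(links: Iterable[Tuple[str, str]], domains: Iterable[str]) -> float:
    """Return the fraction of possible directed links between the given
    domains that are actually present (0.0 = none, 1.0 = every site links
    to every other site)."""
    group = set(domains)
    if len(group) < 2:
        return 0.0
    # Keep only distinct links that start and end inside the group,
    # ignoring links from a site to itself.
    present = {
        (src, dst)
        for src, dst in links
        if src in group and dst in group and src != dst
    }
    possible = len(group) * (len(group) - 1)
    return len(present) / possible


# Hypothetical example: three sites that all link to one another.
network = ["a.example", "b.example", "c.example"]
dense_links = [
    ("a.example", "b.example"), ("b.example", "a.example"),
    ("a.example", "c.example"), ("c.example", "a.example"),
    ("b.example", "c.example"), ("c.example", "b.example"),
]
print(interlink_density(dense_links, network))
```

A group of unrelated sites would normally score close to zero on this measure, while a closed network built to pass links around scores close to one.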
Uncontrolled Spam Plaguing Google SERPs
Across Google’s search engine results pages (SERPs), numerous websites are ranking for longtail phrases, which are easier to rank for because they are rarely searched and often carry a local search component. Longtail phrases, meaning uncommonly used keyword combinations, have been an established concept for nearly two decades, popularized by the 2006 book “The Long Tail: Why the Future of Business Is Selling Less of More.”
Spammers exploit these rarely searched phrases because competition for them is minimal, which makes it possible to rank for a large number of keywords in a short time. By flooding the web with millions of pages optimized for longtail phrases, they can rank for hundreds of thousands of keywords within days.
Just as companies like Amazon use the longtail concept to sell a huge number of individual products every day, these spammers use the ease of ranking for longtail phrases to manipulate search results.
The spammers also exploit a loophole in the local search algorithm, which differs from the algorithm that ranks non-local keywords. Many of the examples found so far are variations of Craigslist-related keywords, such as “Craigslist auto parts,” “Craigslist rooms to rent,” “Craigslist for sale by owner,” and an array of similar terms.
The spam extends far beyond keywords that contain the word “Craigslist,” however, which points to a widespread infiltration of Google’s SERPs.
Unveiling the Characteristics of Spam Pages
Trying to inspect the spam pages with a regular browser is futile. Attempts to view their source code or content through a browser or a website checker are automatically redirected to other domains.
The redirects persisted no matter how the spam URLs were accessed, whether through the W3C link checker or with the browser’s user agent changed to impersonate Googlebot. The spam sites appear to identify Googlebot by IP address rather than by user agent, and they serve the spam content only to IP addresses they recognize as belonging to Googlebot.
As a result, ordinary visitors are redirected to dubious content on other domains, while the spam pages themselves are shown only to requests coming from Googlebot IP addresses.
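The user agent test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the exact procedure used here: the names `Observation`, `fetch_without_following`, and `varies_by_user_agent` are hypothetical, and the browser user agent string is a shortened placeholder. The idea is to request the same URL with a browser user agent and with Googlebot's user agent string, without following redirects, and compare the results.

```python
import http.client
from dataclasses import dataclass
from typing import Iterable, Optional
from urllib.parse import urlsplit

# Shortened placeholder browser string (assumption) and Googlebot's
# published desktop user agent token.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


@dataclass(frozen=True)
class Observation:
    user_agent: str
    status: int
    location: Optional[str]  # redirect target, if the server sent one


def fetch_without_following(url: str, user_agent: str) -> Observation:
    """Request the URL once and record the status and Location header.
    http.client never follows redirects, so the raw response is kept."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        target = (parts.path or "/") + (f"?{parts.query}" if parts.query else "")
        conn.request("GET", target, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return Observation(user_agent, response.status, response.getheader("Location"))
    finally:
        conn.close()


def varies_by_user_agent(observations: Iterable[Observation]) -> bool:
    """True if at least two user agents received different responses."""
    outcomes = {(o.status, o.location) for o in observations}
    return len(outcomes) > 1
```

If `varies_by_user_agent` returns `False` for observations gathered with both `BROWSER_UA` and `GOOGLEBOT_UA`, the site is not deciding on the user agent alone. That is consistent with detection by IP address, although it does not prove it.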
Google’s Rich Results Test made it possible to see the HTML of these elusive spam pages. The tool retrieved the HTML of the spam page, a method that Bill Hartzer successfully replicated and shared on Twitter. Using the tool’s option to display the HTML, I extracted the code and saved it to a text file for further analysis.
Deciphering the Success Behind this Spam Strategy
The difference between the local and non-local search algorithms is key to understanding why this spam technique works. The local search algorithm works differently and often requires fewer links to rank than non-local search does. Instead, the emphasis is on strategically placed keywords that match the geographic area and trigger the local search algorithm.
Consider a query like “Craigslist auto parts.” Because it is a longtail phrase tied to local search, ranking for it takes little effort.
This issue has persisted for several years. In a past incident, a website managed to rank for the query “Rhinoplasty Plano, Texas” using Latin-language content with English headings. Because rhinoplasty is a longtail local search and Plano, Texas, is a relatively small city, it was remarkably easy for the Latin-language website to rank for this keyword phrase. Google has been aware of the problem for some time, as a tweet by Danny Sullivan on December 19th acknowledged.
The open question is whether Google will eventually find an effective way to counter these spam techniques, something observers have been wondering about given how long the problem has persisted.