Discover the Value of Quick Insights: Assessing the SEO Gains in 15 Minutes with Bulk Export from GSC to BigQuery
Google Search Console: A Valuable Resource for SEO Professionals, Yet Overcoming Its Limitations Wasn’t Always Simple
In the past, addressing these limitations often required specialized knowledge or developer assistance for direct API data retrieval.
However, around 2018, Google introduced a native connector from Google Search Console (GSC) to what is now known as Looker Studio, streamlining the process.
This integration empowers users to effortlessly extract data from GSC directly into Looker Studio (formerly Google Data Studio), enabling the creation of customizable reports and dashboards without needing third-party connectors or additional API configurations.
However, the plot thickened in February 2023.
Google introduced a fascinating development: the ability to automate a built-in bulk data export to BigQuery, Google’s data warehousing solution.
Let’s have an open conversation for a moment: Many of us still heavily rely on the GSC interface for various tasks.
This article will explore why bulk data export to BigQuery is a game-changer.
But a word of caution: it’s not a one-size-fits-all solution for every SEO professional’s challenges. Nevertheless, it’s a valuable tool for those who can configure and use it effectively.
Breaking Free from Data Limitations with BigQuery Bulk Exports
Initially, the bulk data export feature was designed to cater to websites that garnered traffic from tens of thousands of pages or queries.
Data Volume Options
Currently, there are three data export alternatives apart from the BigQuery bulk data export:
- Most reports in GSC permit the export of up to 1,000 rows.
- Through a Looker Studio integration, you can acquire up to 50,000 rows.
- Using the API, you can access up to 50,000 rows and retrieve additional data such as URL Inspection results, sitemaps, and site-level information.
As Daniel Waisberg, a Search Advocate at Google, explains:
“The most potent method for exporting performance data lies in the bulk data export, granting access to the most extensive data sets available.”
When you use BigQuery’s bulk export, there is no limit on the number of rows: the export delivers every row of performance data in your GSC account.
This makes BigQuery exceptionally well suited for large websites, or for SEO analyses that demand access to a comprehensive dataset.
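As a minimal sketch of what this looks like in practice, the query below rolls up daily clicks and impressions across the full export history. It assumes the default dataset name (`searchconsole`) and the standard `searchdata_site_impression` table created by the export; replace `your-project` with your own Google Cloud project ID.

```sql
-- Daily totals across the entire export history, with no row limit.
-- `searchconsole` is the default dataset name chosen at setup; adjust to yours.
SELECT
  data_date,
  SUM(clicks) AS clicks,
  SUM(impressions) AS impressions
FROM `your-project.searchconsole.searchdata_site_impression`
GROUP BY data_date
ORDER BY data_date;
```

The same query in the GSC interface or Looker Studio would be capped at 1,000 or 50,000 rows respectively; here it simply returns one row per day of history.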
Data Preservation
Google BigQuery lets you retain your data indefinitely, freeing SEO professionals from Google Search Console’s 16-month storage limit when conducting historical trend analysis. Neither Looker Studio nor the API inherently provides this; as a data storage solution, BigQuery effectively circumvents the limitation, giving you the capacity to track developments over multiple years and analyze those trends in far greater depth.
The capability to retain and access an unlimited archive of historical data represents a paradigm shift for SEO professionals for several compelling reasons:
- Comprehensive Extended Analysis: Unlimited retention lets SEO analysts run in-depth trend analyses over long periods, which is especially valuable for evaluating the long-term performance of SEO strategies.
- Seasonal and Event-Driven Insights: For websites subject to seasonal fluctuations or event-driven traffic spikes, access to longer historical data is invaluable for understanding and harnessing these patterns.
- Tailored Reporting: With all your data stored in BigQuery, you can build customized reports that answer a wide array of specific questions.
- Enhanced Troubleshooting: Tracking performance over time makes it easier to identify issues, understand their underlying causes, and implement practical fixes.
- Flexibility: Unlimited retention gives you room to fine-tune SEO strategies while keeping a complete historical record for contextual decision-making.
Data Limitations
As with most data-tracking tools, exports are not retroactive: the GSC bulk data export only begins sending daily data to BigQuery after setup is complete, so you cannot store or access data predating the setup date.
The system operates on a “from this point forward” basis, so plan ahead if you intend to use historical data in the future. Even with careful planning, data exports may take up to 48 hours after setup to begin.
While the bulk data export encompasses significant metrics like site and URL performance data, it’s worth mentioning that not all data types are included in the export.
Another critical aspect to consider is the use of partitioned tables. In BigQuery, the export data is structured into tables partitioned by date.
New data arrives daily, one date partition at a time, and it’s essential to keep this in mind when crafting your queries, particularly if you want them to run efficiently.
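To illustrate, filtering on the date column limits how many partitions (and therefore how many bytes) a query scans, which directly affects cost. This sketch assumes the export tables expose the documented `data_date` column as the partitioning date; the project and dataset names are placeholders.

```sql
-- Aggregate the last 28 days only: restricting data_date prunes partitions,
-- so BigQuery scans (and bills for) far less data than a full-table query.
SELECT
  query,
  SUM(clicks) AS clicks
FROM `your-project.searchconsole.searchdata_site_impression`
WHERE data_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY)
                    AND CURRENT_DATE()
GROUP BY query
ORDER BY clicks DESC
LIMIT 100;
```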
Why Implement This Setup?
There are several advantages to establishing BigQuery bulk exports:
- Integration of GSC Data with Other Data Sources: Storing Google Search Console data in a data warehouse allows you to harness the benefits of combining it with other data sources, whether directly within BigQuery or in your data warehouse. For example, you can merge GSC and Google Analytics 4 data to gain deeper insights into conversions and behaviors stemming from organic Google traffic.
- Conducting Complex Calculations and Operations with SQL: A platform like BigQuery empowers you to query your data, facilitating the execution of intricate calculations and operations to delve deeper into your analysis. Using SQL, you can segment, filter, and apply your unique formulas.
- Anonymized Query Handling: BigQuery handles anonymized queries differently than ETL vendors that access data via the API. It aggregates all metrics for anonymized queries on a per-site or per-URL basis per day. Instead of simply omitting those rows, this approach ensures that analysts still receive accurate totals for impressions and clicks when aggregating the data.
Are There Any Drawbacks?
Regrettably, no tool or solution is without its imperfections. This newly integrated solution does come with some downsides, including the following:
Expanded Skill Set Requirement
To fully leverage this integration, you must broaden your expertise beyond SEO. Familiarity with Google Cloud Platform, BigQuery, and SQL is necessary in addition to your GSC knowledge. Setting up a bulk data export entails tasks within both GSC and Google Cloud.
SQL-Centric Platform
BigQuery relies heavily on SQL for data access and optimization. This means you or a team member should be proficient in crafting SQL queries. Moreover, the platform has unique operational nuances, making efficient utilization a learning curve that demands time and experience.
Potential Limitations in Data Manipulation
While Looker Studio supports SQL-like data manipulation, it may not offer the full power and flexibility of SQL for intricate analyses. When working with API data, additional processing may be needed to achieve similar results.
Varied Anonymized Query Levels in URL Impressions
It’s crucial to note the variance in anonymized query volumes between the “searchdata_url_impression” table and the “searchdata_site_impression” table. As in the GSC interface, specific queries for particular URLs in specific regions may be so rare that they could reveal the searcher’s identity.
Consequently, you’ll observe a higher percentage of anonymized queries in your “searchdata_url_impression” table than in your “searchdata_site_impression” table. Source: Trevor Fox.
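You can measure this difference yourself. The sketch below computes the share of impressions attributed to anonymized queries in each table, assuming the documented `is_anonymized_query` boolean column in both export tables (project and dataset names are placeholders).

```sql
-- Share of impressions from anonymized queries, per export table.
-- Expect a higher share in the URL table than in the site table.
SELECT
  'searchdata_url_impression' AS source_table,
  SAFE_DIVIDE(
    SUM(IF(is_anonymized_query, impressions, 0)),
    SUM(impressions)
  ) AS anonymized_share
FROM `your-project.searchconsole.searchdata_url_impression`
UNION ALL
SELECT
  'searchdata_site_impression',
  SAFE_DIVIDE(
    SUM(IF(is_anonymized_query, impressions, 0)),
    SUM(impressions)
  )
FROM `your-project.searchconsole.searchdata_site_impression`;
```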
Possible Expenses
While this feature is free to set up, its cost may not remain zero indefinitely.
BigQuery charges based on the volume of data stored within a project and on the queries executed.
The service includes free-tier thresholds beyond which you begin incurring monthly expenses.
Over time, these costs can grow. Their extent depends on the amount of data exported (websites with many pages and queries will be more resource-intensive in this regard) and on the complexity of the queries used to access and process that data.
What Comes Next?
You’re now ready to dive into data querying! Here are some valuable analyses you can perform that might not be as easily achievable through other methods:
- Batch Analysis of Multiple Pages: In BigQuery, you can execute a single SQL query to obtain metrics for multiple pages (or a selected subset of pages) without the need to access each page individually.
- Traffic Seasonality Assessment: Compare performance metrics seasonally to spot trends and fine-tune campaigns accordingly.
- Bulk Analysis Across Multiple Websites: If you oversee a brand with multiple websites, this feature enables you to collectively examine clicks across all these sites.
- Click-Through Rate (CTR) Analysis by Page and Query: Instead of merely assessing the average CTR, you can calculate the CTR for each page and search query, providing more detailed insights.
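As one example, the CTR analysis in the last bullet can be sketched as a single query over the URL-level table. Column names follow the documented export schema; the impression floor of 100 is an arbitrary illustrative threshold to filter out noisy low-volume pairs.

```sql
-- CTR per URL/query pair, rather than a single site-wide average.
-- Anonymized rows carry no query text, so they are excluded here.
SELECT
  url,
  query,
  SUM(clicks) AS clicks,
  SUM(impressions) AS impressions,
  SAFE_DIVIDE(SUM(clicks), SUM(impressions)) AS ctr
FROM `your-project.searchconsole.searchdata_url_impression`
WHERE NOT is_anonymized_query
GROUP BY url, query
HAVING SUM(impressions) >= 100
ORDER BY impressions DESC;
```

`SAFE_DIVIDE` returns NULL instead of erroring when a group somehow has zero impressions, which keeps the query robust.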
To sum it up, the integrated bulk data export feature from Google Search Console to Google’s BigQuery presents a powerful solution for SEO data analytics. Nonetheless, it comes with certain constraints, including the requirement to acquire proficiency in Google Cloud and SQL and potential expenses linked to BigQuery storage and query usage.
If you find all this challenging and perplexing, consider exploring our monthly SEO packages, where industry experts can manage it on your behalf.