
How to use Google Search Console and ScreamingFrog to perform a technical audit of your domain

Posted: Wed Jan 22, 2025 6:17 am
by tongfkymm44
Using APIs to connect Google Search Console with ScreamingFrog
This is the second guide on Google Search Console (GSC). While the previous one covered keyword analysis, this article focuses on using the tool to perform a technical SEO audit of your domain.

As Olga Zarr described in her article, it is possible to audit your domain using GSC alone, working through the reports built into the tool.


However, the beauty of APIs is that they let you combine multiple pieces of data. Using GSC with a crawler like ScreamingFrog (SF) gives you a broader view of any technical issues and optimization opportunities. ScreamingFrog has an article, along with a video, explaining how to connect to the Google Search Analytics and URL Inspection APIs and pull data directly during a crawl.
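SF handles this connection for you through its interface, but it helps to know what the underlying API returns. Below is a minimal sketch of a Search Analytics query in Python, assuming you already have OAuth credentials stored in credentials.json (a hypothetical filename) and the google-api-python-client library installed:

```python
# A minimal sketch, not SF's implementation: query the Search Analytics API
# directly. "credentials.json" and the property URL are assumptions.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_authorized_user_file("credentials.json")
service = build("searchconsole", "v1", credentials=creds)

response = service.searchanalytics().query(
    siteUrl="https://www.example.com/",  # your verified GSC property
    body={
        "startDate": "2025-01-01",
        "endDate": "2025-01-31",
        "dimensions": ["page"],
        "rowLimit": 1000,
    },
).execute()

# Each row carries clicks, impressions, CTR and position per page.
for row in response.get("rows", []):
    print(row["keys"][0], row["clicks"], row["impressions"])
```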

Note: Structured Data will be covered extensively and separately in another article.

Indexed pages vs. non-indexed pages
Pages
Check the number of your indexable HTML pages in SF against the indexed pages reported in GSC. Export both lists and compare them in an Excel spreadsheet to identify which URLs GSC is not indexing; then ask yourself whether those URLs are valuable and whether you want them indexed.
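If Excel feels slow for large exports, a few lines of Python do the same comparison. This is a rough sketch: the filenames are hypothetical, and the column names ("Address", "Indexability", "URL") are assumptions based on typical SF and GSC exports, so adjust them to match your files:

```python
import pandas as pd

# Hypothetical filenames: the SF "Internal: HTML" export and the GSC
# indexed-pages export. Adjust column names to match your actual files.
sf = pd.read_csv("internal_html.csv")
gsc = pd.read_csv("indexed_pages.csv")

sf_indexable = set(sf.loc[sf["Indexability"] == "Indexable", "Address"])
gsc_indexed = set(gsc["URL"])

# Indexable according to the crawl, but missing from Google's index.
not_indexed = sorted(sf_indexable - gsc_indexed)
print(f"{len(not_indexed)} indexable URLs are not in the GSC index")
for url in not_indexed[:20]:
    print(url)
```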


Don't worry if some pages aren't indexed - every domain has non-indexed pages in GSC. What you want to ensure is that no valuable pages are left unindexed.


Also, check the list of non-indexed URLs and the reasons why they are not indexed. For URLs listed as Page with redirect and Not found (404), open your crawl in SF, go to Bulk Export > Response Codes, and export the 3xx and 4xx inlinks reports. This way, you will identify where the links pointing to your 3xx and 4xx URLs come from. Knowing where these links live tells you how the bot discovers them, so you can update or remove them.
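Once you have the export, grouping the inlinks by destination gives you a fix-list per broken URL. A sketch, assuming a hypothetical export filename and the "Source", "Destination" and "Status Code" columns that SF's inlinks reports typically contain:

```python
import pandas as pd

# Hypothetical filename for the SF 4xx inlinks export; do the same for 3xx.
inlinks = pd.read_csv("client_error_4xx_inlinks.csv")

# For every broken destination, list the pages that still link to it, so
# you know exactly which templates or articles need their links updated.
for dest, group in inlinks.groupby("Destination"):
    status = group["Status Code"].iloc[0]
    print(f"\n{dest} ({status}) is linked from:")
    for src in group["Source"].unique():
        print(f"  {src}")
```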

Pay special attention to “Crawled – currently not indexed.” You may find parameter URLs that you can exclude from crawling by editing your robots.txt file.
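For example, if that report is full of faceted or session-parameter URLs, a couple of wildcard rules can keep them out of the crawl. Google supports the * wildcard in robots.txt patterns; the parameter names below (sort, sessionid) are purely hypothetical, so swap in the ones you actually see in the report:

```
User-agent: *
# Hypothetical parameters: block crawling of sorted/session-tracked duplicates
Disallow: /*?sort=
Disallow: /*&sort=
Disallow: /*?sessionid=
```

Be careful not to block parameter URLs that carry unique content, and remember that robots.txt prevents crawling, not the indexing of URLs Google already knows about.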

However, if you have valuable URLs that should be indexed and aren't, you may need to review the quality of your content. One of the most common causes of unindexed pages is poor quality; refresh your content, and make sure internal links with descriptive anchor text point to the URL.

Remember, as John Mueller said, “It’s normal for 20% of a site to not be indexed.”

Videos

It is very common for GSC to report “Videos on pages are not indexed” with the reason “The video is not the main content of the page”.

As of December 2023, Google’s documentation has changed to say “Videos that are not the main content of the page will appear as ‘No videos indexed’ in Search Console.”

Therefore, evaluate whether the video is the main content on each of your pages; if it is not, it is likely not indexed, as Google's documentation indicates.

Sitemaps

Before crawling with SF, add your sitemap URL to the crawler's configuration. SF will help you spot errors in your sitemap, such as URLs that return a non-200 HTTP status.
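You can also spot-check a sitemap outside of SF with a short script. A minimal sketch, assuming a standard <urlset> sitemap at a hypothetical URL and the requests library:

```python
import xml.etree.ElementTree as ET
import requests

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # hypothetical
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(SITEMAP_URL, timeout=10).content)
urls = [loc.text for loc in root.findall(".//sm:loc", NS)]

for url in urls:
    # Some servers reject HEAD; fall back to GET if you see odd statuses.
    status = requests.head(url, allow_redirects=False, timeout=10).status_code
    if status != 200:
        print(status, url)  # anything non-200 does not belong in a sitemap
```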


GSC will let you know if there are any errors while retrieving the sitemap. Open and review the sitemaps in GSC, and use that report alongside SF to audit them. If you have multiple sitemaps that have accumulated over time, delete the old and unnecessary ones.

Removals
In my experience, I have never seen a notification from Google in the Removals section. You can also use this section to request the removal of your site's URLs from Google's index. ContentKing has a comprehensive article on “How to Remove URLs from Google Search Quickly.”


When auditing a domain for the first time, review this section to determine whether any removal requests have been submitted and, if so, to understand the intent behind them.

Page Experience
As Rick Viscomi of the Chrome Web Performance team explained in episode 71 of the “Search Off The Record” podcast, the main thing to understand about Core Web Vitals in GSC is that the data is field data: it describes the actual experience your users have on the domain. So when you analyze a URL's performance on your own computer, remember that your users might experience the page differently, both in perceived performance and in the Core Web Vitals values themselves.

So, as Rick Viscomi says, while your Core Web Vitals (CWV) performance alone won't cause a drop in your site's traffic, this data should still be analyzed in full, and field data from your users should be given more weight than lab data (your CWV tests on your laptop).
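If you want to inspect that same field data outside of GSC, it comes from the Chrome UX Report (CrUX). A hedged sketch of querying the CrUX API with the requests library; the API key is a placeholder and the origin is an example:

```python
import requests

API_KEY = "YOUR_CRUX_API_KEY"  # placeholder: create one in Google Cloud
resp = requests.post(
    f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}",
    json={
        "origin": "https://www.example.com",  # example origin
        "formFactor": "PHONE",
        "metrics": [
            "largest_contentful_paint",
            "interaction_to_next_paint",
            "cumulative_layout_shift",
        ],
    },
    timeout=10,
)
resp.raise_for_status()

# p75 of real-user measurements is what Core Web Vitals assessments use.
for name, data in resp.json()["record"]["metrics"].items():
    p75 = data.get("percentiles", {}).get("p75")
    if p75 is not None:
        print(name, "p75 =", p75)
```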