
Monitor the robots.txt


The most important technical file for SEO is the robots.txt in the root directory of a website. Search engines look there first to find out what they are allowed to crawl and what not. This is quite practical, because you do not always want admin areas or internal search results to show up in the Google index, for example.
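For illustration, a typical robots.txt for an online store could look like this (the paths and the sitemap URL are only examples):

User-agent: *
Disallow: /admin/
Disallow: /search
Sitemap: https://www.example.com/sitemap.xml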

But since an incorrectly filled robots.txt can cause a lot of trouble, it is important to monitor it continuously. What is the best way to do that? In the following sections we discuss two approaches.

Black-listing approach

The most damage can be done with the Disallow directive. It can prohibit all or specific search engines from crawling certain files or directories.

Black-listing, i.e. forbidding certain statements, is a very good approach here: we simply raise an alarm whenever a critical Disallow statement appears in the file. In principle, this is a very effective method. But, as always, there is a but.

Such statements can get quite complex, since you can also work with wildcards, for example. Finding every black-list expression that must never occur is therefore relatively difficult. What is certainly useful, however, is to alert on a blanket rule such as Disallow: / (or a wildcard equivalent like Disallow: *). That already covers the worst case of the entire site being blocked.
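The following Python sketch shows how such a black-listing check could look. The robots.txt URL and the list of critical patterns are assumptions and have to be adapted to your own project; the idea is simply to fetch the file and raise an alarm as soon as one of the forbidden rules appears.

import re
import urllib.request

ROBOTS_URL = "https://www.example.com/robots.txt"  # hypothetical shop URL

# Rules that should trigger an alarm, e.g. statements that block the whole site.
CRITICAL_PATTERNS = [
    r"^\s*Disallow:\s*/\s*$",   # Disallow: /  -> entire site blocked
    r"^\s*Disallow:\s*\*\s*$",  # Disallow: *  -> wildcard variant
]

def fetch_robots_txt(url: str = ROBOTS_URL) -> str:
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def find_critical_rules(content: str) -> list[str]:
    hits = []
    for line in content.splitlines():
        for pattern in CRITICAL_PATTERNS:
            if re.match(pattern, line, re.IGNORECASE):
                hits.append(line.strip())
    return hits

if __name__ == "__main__":
    violations = find_critical_rules(fetch_robots_txt())
    if violations:
        print("ALERT: critical robots.txt rules found:", violations)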

If you like this article, please subscribe to our newsletter. That way you will not miss any of our articles about monitoring and agencies.


Delta approach

As much as you can achieve with the robots.txt, its content rarely changes. Normally it is set up cleanly once and then rather neglected. If it is changed once a quarter, that is often already a lot.

That is why the delta approach is probably the best. We are informed as soon as there is a delta, i.e. a difference, between the last validated version and the current one. The alert does not have to be critical, but as a technical SEO you should at least take a look at it. If everything is OK in the new version, it is declared the last validated one and the cycle starts again.
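A minimal sketch of the delta approach, again in Python: the last validated version is kept in a local file (file name and URL are assumptions), and a notice is raised as soon as the live file differs from it. After a human has checked the new version, mark_as_validated stores it as the new reference copy.

import hashlib
import pathlib
import urllib.request

ROBOTS_URL = "https://www.example.com/robots.txt"      # hypothetical shop URL
VALIDATED_COPY = pathlib.Path("robots.validated.txt")  # last version a human approved

def fetch_robots_txt(url: str = ROBOTS_URL) -> str:
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def has_changed(current: str) -> bool:
    # A missing reference copy counts as a change, so the first run forces a review.
    if not VALIDATED_COPY.exists():
        return True
    old = VALIDATED_COPY.read_text(encoding="utf-8")
    return hashlib.sha256(old.encode()).hexdigest() != hashlib.sha256(current.encode()).hexdigest()

def mark_as_validated(current: str) -> None:
    # Call this after the new version has been reviewed and found OK.
    VALIDATED_COPY.write_text(current, encoding="utf-8")

if __name__ == "__main__":
    current = fetch_robots_txt()
    if has_changed(current):
        print("NOTICE: robots.txt differs from the last validated version.")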

If you want to implement monitoring of the robots.txt for your own online store, you should choose a combination of both approaches: filter critical violations via the black list and have all other changes reported via the delta. Or you use koality.io, where both approaches are already implemented.

It's nice that you are reading our magazine. But what would be even nicer is if you tried our service. koality.io offers extensive website monitoring especially for web projects: uptime, performance, SEO, security, content and tech.
