We’ve asked SEO experts via many channels about these examples. The results were quite surprising. Not every SEO expert seems to be as proficient in interpreting robots.txt files as we thought.
What about yourself?
Give us your opinion in the form at the bottom of this page, and get a chance to win 1 Year of URLinspector.
Given this robots.txt for a restricted area:
User-agent: *
Disallow: /secret/*
User-agent: Googlebot
Allow: /coolgooglestuff/*
User-agent: Spacebot
Allow: /*
Q1: Will Googlebot crawl the /secret/ folder? YES or NO?
Bonus: Explain why.
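If you'd like to cross-check your reasoning mechanically, here is a quick sketch using Python's standard-library `urllib.robotparser`. Keep in mind that this is exactly the kind of non-Google parser this article is about, so treat its verdict as one parser's opinion, not as the official answer.

```python
# Feed the quiz's robots.txt to Python's built-in parser.
# NOTE: this is NOT Google's robots.txt library, so its verdict
# is not necessarily what Googlebot would actually do.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /secret/*

User-agent: Googlebot
Allow: /coolgooglestuff/*

User-agent: Spacebot
Allow: /*
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Does this parser think Googlebot may fetch a URL under /secret/ ?
q1_verdict = rp.can_fetch("Googlebot", "/secret/page.html")
print("urllib.robotparser verdict for Googlebot on /secret/:", q1_verdict)
```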
Given this robots.txt:
User-agent: *
Disallow: /
Allow: /style/
Allow: /userfiles/
Q2: Is Googlebot allowed to crawl the /userfiles/ folder? YES or NO?
Bonus: Explain why.
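Again, purely as an illustration (and without spoiling the answer), here is what Python's stdlib `urllib.robotparser` makes of this file. Note that it resolves Allow/Disallow by first match in file order, whereas RFC 9309, which Google's own parser follows, gives precedence to the longest matching path, so different parsers can disagree on a file like this one.

```python
# Q2's robots.txt through Python's built-in parser -- again, NOT
# Google's library, and its precedence rules differ from RFC 9309.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /
Allow: /style/
Allow: /userfiles/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# First-match semantics: the "Disallow: /" line is checked first.
q2_verdict = rp.can_fetch("Googlebot", "/userfiles/report.pdf")
print("urllib.robotparser verdict for /userfiles/:", q2_verdict)
```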
With this robots.txt file:
# block this bot
User-agent: Somebot
Disallow: /
# don't block this bot, but slow him down
User-agent: Googlebot
Crawl-Delay: 1800
# block this bot
User-agent: Someotherbot
Disallow: /
Q3: Is Googlebot…
Bonus: Explain why.
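For what it's worth, here is how Python's stdlib `urllib.robotparser` reads this file (a sketch only; it is not Google's parser, and the bot names come from the quiz). One documented fact worth remembering for the bonus question: Google has publicly stated that Googlebot ignores the Crawl-delay directive.

```python
# Q3's robots.txt through Python's built-in parser (not Google's library).
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# block this bot
User-agent: Somebot
Disallow: /

# don't block this bot, but slow him down
User-agent: Googlebot
Crawl-Delay: 1800

# block this bot
User-agent: Someotherbot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

somebot_allowed = rp.can_fetch("Somebot", "/")
googlebot_allowed = rp.can_fetch("Googlebot", "/")
# Whether a crawler actually honors this value is up to the crawler.
delay = rp.crawl_delay("Googlebot")

print("Somebot allowed:", somebot_allowed)
print("Googlebot allowed:", googlebot_allowed)
print("Crawl-delay seen for Googlebot:", delay)
```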
Working with robots.txt seems trivial at first glance. But it’s not. There are many pitfalls and traps.
And even seasoned experts fail.
In URLinspector we use the original robots.txt library published by Google, the same code that Googlebot uses.
If you’re using software with some homemade robots.txt parser, you’re not doing yourself a favor.
Did you know? There are currently 142 robots.txt parsers on GitHub, and those are only the open-source ones.
Guess how many more are hidden in private repos, built by developers suffering from “not invented here” syndrome?
See, even Moz have their own “modern robots.txt parser”, whatever that means. No thanks folks, we’d rather go with the original by Google.
Why? Because in some cases the original Googlebot robots.txt library behaves differently from the many other robots.txt libraries out there.
Also, no JS, PHP or Node “interpretation” is needed.
Of course, interpreting robots.txt by visual inspection alone is error-prone and will lead you to wrong conclusions.
But also, using all sorts of software to “test robots.txt” can go wrong simply because there’s so much faulty code out there.
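Here is one concrete example of how that happens. Python's standard `urllib.robotparser` does not implement the `*` path wildcard (a later extension to the original robots.txt draft) and matches Disallow paths as literal prefixes. Run it against the first quiz file and it happily waves an ordinary bot into /secret/ (a sketch; the bot name is made up):

```python
# The first quiz file again, this time through Python's built-in parser,
# which does NOT implement the * path wildcard.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /secret/*

User-agent: Googlebot
Allow: /coolgooglestuff/*

User-agent: Spacebot
Allow: /*
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# "Disallow: /secret/*" only matches paths literally starting with
# "/secret/*", so a generic bot is let in -- surely not what the
# author of this file intended.
generic_verdict = rp.can_fetch("SomeRandomBot", "/secret/page.html")
print("Generic bot allowed on /secret/page.html:", generic_verdict)
```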
Don’t miss the chance to win an account for a full year of URLinspector Bronze.
Let us know what you think the correct answers are in the form below.
URLinspector uses the original robots.txt library published by Google.
That’s the same code Googlebot runs to crawl your website.
One contributor there is Gary Illyes, whom you may know.
Why settle for less?
Do you want results that vary from what Google would do? We don’t think so.
Why not give URLinspector a try?
You can set up a free 14-day trial of URLinspector and see how it works; no credit card is required.