What is Robots.txt?
The robots exclusion protocol, or robots.txt, is a text file that you can use to instruct search engine bots on how to crawl and index pages on your website.
Where is the Robots.txt File Located on my website?
The robots.txt file is normally located at the root of your site, for example: https://www.example.com/robots.txt
The robots.txt file indicates to Google which parts of your site you don’t want accessed by search engine crawlers. Site owners sometimes block things they really shouldn’t, so checking what is being blocked is an essential part of any site review.
Why is the Robots.txt file important for SEO?
Unfortunately, it is all too common for small businesses, and even larger businesses, to block parts of their site unintentionally using the robots.txt file.
By doing this you are effectively telling Googlebot: “Don’t crawl my site!” This means your pages will not be crawled or indexed in Google, so you can’t rank for your organic keywords.
How can I stop search engine bots crawling my site?
If you see the following in a robots.txt file, it instructs all robots to stay OUT of the website. From a search perspective, this is generally something you don’t want (unless you are Louis Theroux, or you are creating a test site that you don’t want Googlebot or any other bots to see).
To stop search bots crawling your site enter the following in your robots.txt file:
User-agent: *
Disallow: /

User-agent: * means the rule applies to all robots. Disallow: / instructs those robots not to visit any pages on your website.
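If you want to confirm what a given set of rules actually blocks, you can test it programmatically. Below is a minimal sketch using Python’s standard-library urllib.robotparser, checking the “block everything” rules above against a hypothetical page URL (example.com is a placeholder, not a real site being tested):

```python
from urllib.robotparser import RobotFileParser

# The "block all robots" rules discussed above, supplied as lines of text
# rather than fetched over the network.
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# With Disallow: / in place, no bot may fetch any page.
allowed = parser.can_fetch("Googlebot", "https://www.example.com/any-page")
print(allowed)  # False
```

The same can_fetch() check works for any user agent and any path, so it is a quick way to verify a robots.txt change before deploying it.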
How can I monitor search engine bots crawling my site?
By downloading the log files from your web server you can analyse how search engine bots (typically Googlebot, for SEO purposes) are interacting with your site. Using a program such as Screaming Frog’s Log File Analyser you can quickly visualise which bots are visiting your website and understand whether you need to block any of them.
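If you prefer a do-it-yourself check before reaching for a dedicated tool, a short script can count bot hits in a standard access log. This is only a rough sketch: the sample log lines, IPs, and timestamps below are made up for illustration, and real logs should be verified (e.g. by reverse DNS) since user-agent strings can be spoofed.

```python
import re
from collections import Counter

# Patterns for the bots we want to count in the user-agent field.
BOT_PATTERNS = {
    "Googlebot": re.compile(r"Googlebot", re.IGNORECASE),
    "Bingbot": re.compile(r"bingbot", re.IGNORECASE),
}

# Hypothetical lines in Apache/Nginx combined log format.
sample_log = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '157.55.39.1 - - [10/Oct/2023:13:56:01 +0000] "GET /about HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

counts = Counter()
for line in sample_log:
    for bot, pattern in BOT_PATTERNS.items():
        if pattern.search(line):
            counts[bot] += 1

print(dict(counts))  # {'Googlebot': 1, 'Bingbot': 1}
```

In practice you would read the real log file line by line instead of the sample list, but the counting logic stays the same.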