
The Complete Guide to the Robots.txt File and Its Usage

The Robots.txt file is very important for on-page SEO on any website or blog. It gives you the power to instruct search engines such as Google, Yahoo, and Bing which pages of your website they should crawl, which pages they may index, and what they can show in their SERPs (Search Engine Results Pages).


Hence, understanding how to use the Robots.txt file is one of the most important tasks for an SEO specialist. This file is considered one of the most powerful tools in Internet marketing because it provides a direct line of communication between your website and the search engines' crawlers, and by extension influences your rankings. Being able to direct crawlers on where they should go and which pages they should include in their index is a huge advantage.

We can use it to make sure that the important pages of our website, such as product pages, blog posts, the contact page and other general pages, are crawled and indexed faster, while payment pages or admin login pages are kept out of the crawl. However, before we jump into the details of how and when to use this file, we must first understand what it is and what its specific functions are.

What is a Robots.txt file?

The robots exclusion standard, more commonly known as Robots.txt, is a file that instructs web crawlers or robots such as Googlebot, Slurp, and Bingbot which pages of a website should not be crawled.

What is the use of a Robots.txt file?

The robots.txt file only contains crawling directives; it cannot control how fast a bot crawls your website, nor other aspects of bot behavior. It is simply a set of instructions telling bots which parts of your website should not be accessed.

You should also note that while most reputable bots respect the robots.txt file, some ignore it. Malicious robots may probe files on your website or even harvest information, so to block them completely you should strengthen your site security or protect private pages with a password. If you have other questions about robots.txt, check out some frequently asked questions on robots here.

How to Create a Robots.txt File?

By default, a robots.txt file would look like this:

[Image: robots.txt example]
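In its simplest, allow-everything form, the file is just a wildcard user-agent line followed by an empty Disallow rule (a minimal sketch):

User-agent: *
Disallow: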

You can create your own robots.txt file in any editor that saves plain .txt files. You can block different URLs, such as your website's /blog/categories/ or /author/ pages. Blocking pages like this helps bots prioritize the more important pages on your website. The robots.txt file is a great way of managing your crawl budget.
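For instance, a sketch of a file that keeps crawlers out of those archive-style pages could look like this (the /blog/categories/ and /author/ paths are placeholders for your own URL structure):

User-agent: *
Disallow: /blog/categories/
Disallow: /author/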

Using Wildcards Properly

In robots.txt, the wildcard, represented by the asterisk (*) symbol, stands for any sequence of characters, including an empty one.

A directive for all types of crawl bots:

User-agent: *
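A group can also target a single crawler by name rather than every bot; for example (Googlebot-Image and the /private-images/ path are used purely as an illustration), the following rules apply only to Google's image crawler:

User-agent: Googlebot-Image
Disallow: /private-images/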

 

The wildcard can also be used to disallow every URL under a parent page.

User-agent: *
Disallow: /authors/*
Disallow: /categories/*

Because the asterisk matches any sequence of characters, including an empty one, these rules block every URL under the author and category pages; for most crawlers they behave the same as Disallow: /authors/ and Disallow: /categories/, so the parent pages themselves are blocked as well.
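If you do want the parent pages themselves to remain crawlable while their subpages stay blocked, one approach (a sketch relying on the $ end-of-URL anchor, which Googlebot and Bingbot support) is to pair the wildcard Disallow with an explicit Allow for the exact parent URL:

User-agent: *
Disallow: /authors/*
Allow: /authors/$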

A good example of a robots.txt file would look like this:

User-agent: *
Disallow: /testing-page/
Disallow: /account/
Disallow: /checkout/
Disallow: /cart/
Disallow: /products/page/*
Disallow: /wp/wp-admin/
Allow: /wp/wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml

Upload Location for the File

After editing your robots.txt file, upload it to the root directory of your website (for example through cPanel's file manager). If your website is hosted as an addon domain, place the file in that domain's own root folder instead, so that when a bot arrives at your website to crawl it, it sees the robots.txt file first.
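Wherever you upload it, the file must end up directly at the root of the host, because crawlers only look for it there; for a site served at https://www.example.com (a placeholder domain), that means it should be reachable at:

https://www.example.com/robots.txt

Opening that URL in a browser is an easy way to confirm the upload worked.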
