Robots.txt Generator


The generator form lets you set a default policy for all robots, an optional Crawl-Delay, a Sitemap URL (leave blank if you don't have one), per-crawler rules for common search robots (Google, Google Image, Google Mobile, MSN Search, Yahoo, Yahoo MM, Yahoo Blogs, Ask/Teoma, GigaBlast, DMOZ Checker, Nutch, Alexa/Wayback, Baidu, Naver, MSN PicSearch), and a list of restricted directories (each path is relative to the root and must end with a trailing slash "/").

Once the text has been generated, create a robots.txt file in your site's root directory, then copy the generated text and paste it into that file.


About Robots.txt Generator


With the help of this robots.txt generator, you can quickly and easily create a robots.txt file. A robots.txt file helps search engines index your site properly. By default, this tool allows major search engines to crawl every part of your website; if there are any areas you would like to exclude, simply add them to the file and upload it to your root directory.
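To make the idea concrete, here is a minimal sketch of how a generator like this might assemble the file from the form's inputs (default policy, crawl delay, sitemap URL, restricted directories). It is an illustration only, not this tool's actual code, and the parameter names are made up:

def generate_robots_txt(allow_all=True, crawl_delay=None,
                        sitemap=None, restricted_dirs=()):
    """Build robots.txt content from a few common options (illustrative sketch)."""
    lines = ["User-agent: *"]
    # An empty Disallow value permits everything; "Disallow: /" blocks everything.
    lines.append("Disallow:" if allow_all else "Disallow: /")
    # Restricted paths are relative to the root and should end with a trailing slash.
    for path in restricted_dirs:
        if not path.startswith("/"):
            path = "/" + path
        if not path.endswith("/"):
            path += "/"
        lines.append("Disallow: " + path)
    if crawl_delay is not None:
        lines.append("Crawl-delay: " + str(crawl_delay))
    if sitemap:
        lines.append("Sitemap: " + sitemap)
    return "\n".join(lines) + "\n"

# Write the result as plain UTF-8 text, ready to upload to the site root.
content = generate_robots_txt(crawl_delay=10,
                              sitemap="https://www.example.com/sitemap.xml",
                              restricted_dirs=["/cgi-bin/", "private"])
with open("robots.txt", "w", encoding="utf-8", newline="\n") as f:
    f.write(content)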

A robots.txt file lives at the root of your site. So, for the site www.example.com, the robots.txt file is located at www.example.com/robots.txt. robots.txt is a plain text file that follows the Robots Exclusion Standard. A robots.txt file contains one or more rules. Each rule blocks (or allows) access by a given crawler to a specified file path on that website.

Here's a simple robots.txt file with two rules explained below:

# Group 1
User-agent: Googlebot
Disallow: /nogooglebot/

# Group 2
User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml

Explanation:

The crawler named "Googlebot" must not crawl the folder http://example.com/nogooglebot/ or any of its subdirectories.
All other user agents can access the entire site. (This rule could be omitted and the result would be the same, because full access is the default assumption.)
The sitemap for the site is located at http://www.example.com/sitemap.xml.
We will provide more detailed examples later.
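If you want to double-check that reading, a standards-based parser such as Python's built-in urllib.robotparser reaches the same conclusions. This is only a quick sketch against the sample file above; the bot name "SomeOtherBot" is a placeholder:

from urllib import robotparser

SAMPLE = """\
# Group 1
User-agent: Googlebot
Disallow: /nogooglebot/

# Group 2
User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(SAMPLE.splitlines())

# Googlebot matches Group 1, so /nogooglebot/ is off-limits to it.
print(rp.can_fetch("Googlebot", "http://www.example.com/nogooglebot/page.html"))  # False
# Every other crawler falls into Group 2 and may fetch anything.
print(rp.can_fetch("SomeOtherBot", "http://www.example.com/any/page.html"))       # True
# The Sitemap declaration is exposed as well (Python 3.8+).
print(rp.site_maps())  # ['http://www.example.com/sitemap.xml']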

Basic robots.txt guidelines


Here are some basic guidelines for robots.txt files. We recommend that you read the full syntax of robots.txt files, because the robots.txt syntax has some subtle behaviors that you should understand.

Format and location


You can use almost any text editor to create a robots.txt file. The text editor should be able to create standard UTF-8 text files; do not use a word processor, as word processors often save files in a proprietary format and may add unexpected characters such as curly quotes that cause problems for crawlers.

Use the robots.txt tester tool to write or edit robots.txt files for your site. This tool enables you to test syntax and behavior against your site.
Format and location rules:

The file must be named robots.txt.


Your site can have only one robots.txt file.
The robots.txt file must be located at the root of the website host to which it applies. For example, to control crawling of all URLs below http://www.example.com/, the robots.txt file must be located at http://www.example.com/robots.txt. It cannot be placed in a subdirectory (for example, at http://example.com/pages/robots.txt). If you're not sure how to access your website root, or need permission to do so, contact your web hosting service provider. If you cannot access the root of your website, use an alternative blocking method such as meta tags.
A robots.txt file can apply to subdomains (for example, http://website.example.com/robots.txt) or to non-standard ports (for example, http://example.com:8181/robots.txt).
Comments are any content after a # character.
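A practical consequence of these location rules is that a crawler can always derive the robots.txt location from any page URL: keep the scheme, host, and port, and replace the path with /robots.txt. A small illustrative sketch:

from urllib.parse import urlsplit

def robots_txt_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Scheme + host (+ port, if present) are kept; the path is always /robots.txt at the root.
    return "{}://{}/robots.txt".format(parts.scheme, parts.netloc)

print(robots_txt_url("http://www.example.com/pages/about.html"))  # http://www.example.com/robots.txt
print(robots_txt_url("http://website.example.com/shop/"))         # http://website.example.com/robots.txt
print(robots_txt_url("http://example.com:8181/anything"))         # http://example.com:8181/robots.txt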


Syntax

robots.txt must be a UTF-8 encoded text file (which includes ASCII). It is not possible to use other character sets.
A robots.txt file contains one or more groups.
Each group has multiple rules or instructions, one instruction per line.
A group gives the following information:
Who the group applies to (the user agent)
Which directories or files that agent can access, and/or
Which directories or files that agent cannot access
Groups are processed from top to bottom, and a user agent can match only one rule set: the first, most specific group that matches that user agent.
The default assumption is that a user agent can crawl any page or directory that is not blocked by a Disallow rule.
Rules are case-sensitive. For example, Disallow: /file.asp applies to http://www.example.com/file.asp, but not to http://www.example.com/FILE.asp (see the sketch below).
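The last two points are easy to verify with Python's urllib.robotparser standing in for a crawler; the file below is a made-up example that only disallows /file.asp:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /file.asp
""".splitlines())

print(rp.can_fetch("*", "http://www.example.com/file.asp"))    # False - matches the rule
print(rp.can_fetch("*", "http://www.example.com/FILE.asp"))    # True  - paths are case-sensitive
print(rp.can_fetch("*", "http://www.example.com/other.html"))  # True  - not blocked by any rule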
The following instructions are used in the robots.txt files:

User-agent: [Required, one or more per group] The name of the search engine robot (web crawler software) that the rule applies to. This is the first line of any group. Most Google user-agent names are listed in the Web Robots Database or in Google's list of user agents. The asterisk (*) matches all crawlers except the various AdsBot crawlers, which must be named explicitly. (See the list of Google crawler names.) Examples:

  • # Example 1: Block only Googlebot
    User-agent: Googlebot
    Disallow: /
    
    # Example 2: Block Googlebot and Adsbot
    User-agent: Googlebot
    User-agent: AdsBot-Google
    Disallow: /
     
    # Example 3: Block all but AdsBot crawlers
    User-agent: * 
    Disallow: /
  • Disallow: [At least one or more Disallow or Allow entries per rule] A directory or page, relative to the root domain, that should not be crawled by the user agent. If a page, it should be the full page name as shown in the browser; if a directory, it should end in a / mark.  Supports the * wildcard for a path prefix, suffix, or entire string.
  • Allow: [At least one or more Disallow or Allow entries per rule] A directory or page, relative to the root domain, that should be crawled by the user agent just mentioned. This is used to override Disallow to allow crawling of a subdirectory or page in a disallowed directory. If a page, it should be the full page name as shown in the browser; if a directory, it should end in a / mark. Supports the * wildcard for a path prefix, suffix, or entire string.
  • Sitemap: [Optional, zero or more per file] The location of a sitemap for this website. Must be a fully-qualified URL; Google doesn't assume or check http/https/www/non-www alternates. Sitemaps are a good way to indicate which content Google should crawl, as opposed to which content it can or cannot crawl. Learn more about sitemaps. Example:
    Sitemap: https://example.com/sitemap.xml
    Sitemap: http://www.example.com/sitemap.xml

Other rules are ignored.
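As a final check of how these directives interact, the sketch below uses urllib.robotparser to confirm that an Allow rule can carve a single page out of a disallowed directory, and reads back the Crawl-delay and Sitemap values. Note that this parser applies rules in file order (first match wins), which is why the Allow line appears before the Disallow line here; Google instead picks the most specific matching rule, so both interpretations agree on this file. All URLs and paths are placeholders.

from urllib import robotparser

RULES = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(RULES.splitlines())

# The Allow entry carves a single page out of the blocked /private/ directory.
print(rp.can_fetch("*", "https://example.com/private/public-page.html"))  # True
print(rp.can_fetch("*", "https://example.com/private/secret.html"))       # False
# Crawl-delay and Sitemap lines are exposed too (crawl_delay: Python 3.6+, site_maps: 3.8+).
print(rp.crawl_delay("*"))  # 10
print(rp.site_maps())       # ['https://example.com/sitemap.xml']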