Welcome to Blog – iRedlof

23 May 2009

Allow/Disallow Search Engine Bots Using Robots.txt [With Examples]

By admin. Posted in SEO


The robots.txt file is a text file containing directives that tell search engine crawlers which pages of a site they may or may not crawl and index.

For this reason, a well-behaved search engine begins its exploration of a website by looking for robots.txt at the root of the site.
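Because the file always sits at the root, a crawler can work out its address from any page URL. The snippet below is a minimal sketch of that step in Python; the function name robots_url and the example.com addresses are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page URL, used only to show the call.
    print(robots_url("http://www.example.com/blog/2009/05/post.html?x=1"))
```

Whatever page the crawler starts from, the path, query string and fragment are discarded and only the scheme and host name are kept.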

Format of robots.txt:

The robots.txt file (the name is written in lower case, with "robots" in the plural) is a plain ASCII text file that sits at the root of the site and may contain the following directives:

  • User-Agent: (value)
    Specifies the robot to which the directives that follow apply.
    (value) can be * meaning “all robots”, Googlebot for the Google search engine bot, Slurp for the Yahoo search engine bot, Msnbot for the MSN search engine bot, and so on for any other bot that follows the robots.txt standard.
  • Allow: (value)
    Specifies the pages that robots may crawl and index.
  • Disallow: (value)
    Specifies the pages to exclude from crawling and indexing. Each page or path to exclude must be on its own line and must begin with /. The value / on its own means “all pages.”
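The three directives can be tried out without a live site by using Python's standard-library parser, urllib.robotparser. This is only a sketch: the helper is_allowed, the bot name ExampleBot and the paths are invented for the example, and Python applies the first matching rule in file order, which may differ from how a particular search engine resolves conflicts between Allow and Disallow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /private/ for every robot except one page.
RULES = """\
User-agent: *
Allow: /private/public-note.html
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Parse robots_txt and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/secret.html",
                 "/private/public-note.html"):
        print(path, is_allowed(RULES, "ExampleBot", path))
```

The Allow line is placed before the broader Disallow line so that a first-match parser reaches it first.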

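Note: Do not put blank lines inside a group of directives! In the original robots.txt convention a blank line ends the current group, so use one only to separate one User-agent group from the next.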
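The effect of a stray blank line can be seen with Python's standard-library parser, which treats a blank line after a rule as the end of the group. The sketch below uses made-up paths /a/ and /b/ and a made-up bot name; other parsers may be more lenient, so take it as an illustration of the risk rather than a description of every crawler.

```python
from urllib.robotparser import RobotFileParser

# A blank line splits the group, so the second Disallow has no User-agent.
BROKEN = """\
User-agent: *
Disallow: /a/

Disallow: /b/
"""

# The same rules without the blank line.
FIXED = """\
User-agent: *
Disallow: /a/
Disallow: /b/
"""


def blocked_paths(robots_txt, paths, user_agent="ExampleBot"):
    """Return the subset of paths that user_agent may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    paths = ["/a/x.html", "/b/y.html"]
    print("broken:", blocked_paths(BROKEN, paths))
    print("fixed: ", blocked_paths(FIXED, paths))
```

With this parser the orphaned Disallow line is ignored, so /b/ is no longer protected in the broken version.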

Examples of robots.txt:

  • Exclusion of all pages:

User-Agent: *
Disallow: /

  • Exclusion of no page (equivalent to the absence of robots.txt; all pages are visited):

User-Agent: *
Disallow:

  • Authorization of a single robot, for example Googlebot:

User-Agent: Googlebot
Disallow:

User-Agent: *
Disallow: /

  • Exclusion of a single robot, for example Msnbot:

User-Agent: Msnbot
Disallow: /

User-Agent: *
Disallow:

  • Exclusion of a single page:

User-Agent: *
Disallow: /directory/path/page.html

  • Exclusion of several pages:

User-Agent: *
Disallow: /directory/path/page.html
Disallow: /wp-admin/admin/page2.html
Disallow: /wp-admin/settings/page3.html

  • Exclusion of all pages of a directory and its subfolders:

User-Agent: *
Disallow: /directory/

  • Allow only the Google, Yahoo and MSN bots and disallow all others:

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Slurp
Allow: /

User-agent: Msnbot
Allow: /

  • Compact version (I found this one on a forum, so I am not completely sure about it):

User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /
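Examples like these can be checked offline before the file is uploaded. The sketch below feeds a shortened form of the compact version to Python's urllib.robotparser and asks which bots may fetch a page; the helper is_allowed, the bot name SomeOtherBot and the path /page.html are assumptions made for the test, and a real search engine may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened compact version: three named bots share one empty Disallow,
# and every other robot is shut out.
COMPACT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Parse robots_txt and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # SomeOtherBot is a hypothetical robot that is not named in the file.
    for bot in ("Googlebot", "Slurp", "msnbot", "SomeOtherBot"):
        print(bot, is_allowed(COMPACT, bot, "/page.html"))
```

At least with this parser, several User-agent lines stacked above one rule are read as a single group, which suggests the compact form behaves like the longer version that repeats the rule for each bot.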
