Robots.txt Optimization

Written By batikbumi on 5 Jun 2013 | 05.52

For various reasons, you may not want a web crawler, whether from a search engine or some other kind of web robot, to access all or part of your website. The robots.txt file can be used for this purpose.

The robots.txt file is placed at the root of your website (for example: yourDomain.com/robots.txt). It follows a standard that has been in development since 1994, when web indexing was gaining popularity. The standard does not guarantee that web crawlers will obey it; compliance depends entirely on each crawler's willingness to respect it.
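Because the file always sits at the root, a cooperative crawler can work out where to look from any page address. Here is a minimal sketch in Python, using only the standard library; the function name robots_url and the sample page address are made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_url("http://yourDomain.com/download/browse.php?id=3"))
```

A well-behaved robot would fetch that address first and read the rules before requesting anything else from the site.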


If you want to instruct the robot to not access your website at all, just write down the following instructions in the robots.txt file:
User-agent: *
Disallow: /
The line User-agent: * means that the robots.txt instructions apply to all web robots. You can replace * with the name of a specific web robot if you want the instructions to apply only to that robot. The line Disallow: / means that the root directory and all of its contents may not be accessed by web robots.
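If you want to check how a cooperative robot would read these instructions, Python's standard urllib.robotparser module can evaluate them. The sketch below is only an illustration: the robot names ExampleBot and BadBot and the helper is_allowed are invented for this example, and a real crawler may interpret the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Blocks every robot from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Blocks only one robot; "BadBot" is a hypothetical name.
BLOCK_ONE = """\
User-agent: BadBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "http://yourDomain.com/any/page.html"
print(is_allowed(BLOCK_ALL, "ExampleBot", page))  # blocked for everyone
print(is_allowed(BLOCK_ONE, "BadBot", page))      # blocked for the named robot
print(is_allowed(BLOCK_ONE, "ExampleBot", page))  # other robots are unaffected
```

This only shows what a robot that honors the standard would decide; it does not stop a robot that ignores robots.txt.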


If you want to protect a particular directory or file, the writing is as follows:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /download/browse.php
These instructions tell web robots not to access the cgi-bin and images directories (or anything inside them), and not to access the file /download/browse.php. Other files in the /download/ directory remain accessible.
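The same rules can be tried against a few sample paths with Python's standard urllib.robotparser module. This is a rough sketch, not a definitive test: the robot name ExampleBot, the helper allowed, and the sample file names (script.pl, logo.png, file.zip, index.html) are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /download/browse.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """Report whether a robot named ExampleBot may fetch the given path."""
    return parser.can_fetch("ExampleBot", "http://yourDomain.com" + path)


# Sample paths are hypothetical; each rule matches by path prefix.
for path in ("/cgi-bin/script.pl", "/images/logo.png",
             "/download/browse.php", "/download/file.zip", "/index.html"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Each Disallow line is matched as a prefix of the requested path, which is why a directory rule covers everything inside it while the single-file rule leaves the rest of /download/ open.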
