breakingarticles.com

Search Engine Tips & Tricks: Create a Robots Text File for Your Web Site

 
Author: Sandra Waggett

Search engines index millions of web sites to generate the search results they return for keywords. They do this using spiders.

Most search engines have their own spider that crawls the web looking for web pages. Spiders are also known as robots because they are simply small programs that run automatically, finding web pages and recursively following the text links embedded in them to index further pages. Most robots look for a robots.txt file in the top-level directory of your web site, also known as the root, where your home page is located on the web server.
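To make the root location concrete, here is a minimal Python sketch of how a robot might work out where to look for the file from any page address. The function name robots_txt_url and the www.example.com address are assumptions made for illustration; they do not come from any particular search engine.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    # The leading slash makes urljoin discard the page's own path, so the
    # result always points at the top-level directory of the site.
    return urljoin(page_url, "/robots.txt")


# Hypothetical page deep inside a site; the file is still looked up at the root.
print(robots_txt_url("http://www.example.com/portfolio/weddings.htm"))
```

Whatever page a robot starts from, the file it asks for sits next to your home page, which is why a robots.txt file placed in a subdirectory is normally ignored.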

The robots.txt file is a simple text file created in a basic text editor, like Notepad. It allows you to control what the spider is allowed to access and what it is not allowed to access or index.

The format of the basic robots.txt file is pretty simple:
User-agent: [Spider Name]
Disallow: [File or Directory Path]

For example, to allow ALL robots complete access to your web site, your robots.txt file will look like this:
User-agent: *
Disallow:
The asterisk is a wildcard character that represents ALL robots. Leaving the Disallow line blank tells the robots that nothing on the site is disallowed.
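If you want to see how a robot reads these two lines, Python's standard library includes a robots.txt parser, urllib.robotparser. The sketch below is only an illustration: the helper name is_allowed, the robot names, and the www.example.com addresses are made up, and real search engine spiders use their own parsers.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, robot_name: str, url: str) -> bool:
    """Return True if the given rules let robot_name fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so nothing is downloaded here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot_name, url)


# With a blank Disallow line, any robot may fetch any page.
print(is_allowed(ALLOW_ALL, "googlebot", "http://www.example.com/index.htm"))
print(is_allowed(ALLOW_ALL, "ExampleBot", "http://www.example.com/images/logo.gif"))
```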

The next example bars all robots from the cgi-bin directory (where your scripts are typically located), the images directory, and the portfolio directory:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /portfolio/
Note: You should use a separate Disallow line for each directory or individual file.
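The same three rules can be checked with Python's standard urllib.robotparser module. This is a rough sketch rather than a picture of how any particular spider behaves; the helper is_allowed, the robot name ExampleBot, and the page addresses are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /portfolio/
"""


def is_allowed(robots_txt: str, robot_name: str, url: str) -> bool:
    """Return True if the given rules let robot_name fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot_name, url)


# Each Disallow line blocks every address that starts with that path.
for page in ("/index.htm", "/images/logo.gif", "/portfolio/enlargement1.htm"):
    url = "http://www.example.com" + page
    print(page, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Pages outside the three listed directories, such as the home page, remain open to robots.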

In this example, you may wonder why you would want to disallow a robot from indexing your portfolio directory.

If you are a photographer and you have thumbnail images on a portfolio page that link to enlargement pages launched in a pop-up window, you may not want those pop-up pages indexed. These are called dead-end or orphaned pages because only the enlarged image appears on the page with no contact info or menu links back to the main site. If the visitor entered your site on one of these pages, they would have nowhere to go and no way to contact you.

For a live example, check out www.AnJPhotography.com and look at the wedding portfolio. When you click on an image, it opens in a new window. The page in the new window is a dead-end page. A robots.txt file can keep search engines from indexing these dead-end pages so you don't leave site visitors stranded.

This example keeps googlebot (the Google spider) from getting at the private.htm file:
User-agent: googlebot
Disallow: /private.htm
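A rule aimed at one robot leaves every other robot unaffected. The sketch below checks that with Python's standard urllib.robotparser module. It assumes the path is written with a leading slash (/private.htm), because rules are matched against the path portion of the address, which always begins with a slash. The helper is_allowed, the robot name ExampleBot, and the www.example.com addresses are hypothetical.

```python
from urllib.robotparser import RobotFileParser

GOOGLEBOT_ONLY = """\
User-agent: googlebot
Disallow: /private.htm
"""


def is_allowed(robots_txt: str, robot_name: str, url: str) -> bool:
    """Return True if the given rules let robot_name fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(robot_name, url)


private = "http://www.example.com/private.htm"
print(is_allowed(GOOGLEBOT_ONLY, "googlebot", private))   # the named robot is barred
print(is_allowed(GOOGLEBOT_ONLY, "ExampleBot", private))  # no rule names this robot
print(is_allowed(GOOGLEBOT_ONLY, "googlebot", "http://www.example.com/index.htm"))
```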

When you create your robots.txt file it is extremely important that you use a basic text editor (like Notepad) and NOT a word processing application like Microsoft Word. Applications like Microsoft Word can insert hidden characters that may make your robots.txt file unreadable. After you post your robots.txt file to the web server, you can validate it to make sure it is properly formatted. There are several free validators on the web. Here is one: http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
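Before uploading, you can also look for hidden characters yourself. The following Python sketch is a deliberately conservative check that assumes a robots.txt file should contain only plain ASCII text; the function name find_hidden_characters and the sample data are made up for illustration, and the check is not a substitute for a full validator.

```python
from typing import List


def find_hidden_characters(data: bytes) -> List[str]:
    """Describe bytes that a plain ASCII robots.txt would not normally contain."""
    problems = []
    # Some editors silently prepend a UTF-8 byte order mark to the file.
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
        data = data[3:]
    for line_number, line in enumerate(data.splitlines(), start=1):
        for byte in line:
            # Allow tabs and printable ASCII; report the first other byte per line.
            if byte != 0x09 and not 0x20 <= byte <= 0x7E:
                problems.append(f"line {line_number}: unexpected byte 0x{byte:02x}")
                break
    return problems


# Hypothetical file saved with a byte order mark and curly "smart" quotes.
sample = b"\xef\xbb\xbfUser-agent: *\nDisallow: \xe2\x80\x9c/images/\xe2\x80\x9d\n"
for problem in find_hidden_characters(sample):
    print(problem)
```

To check a real file, read it in binary mode, for example with open("robots.txt", "rb"), and pass the bytes to the function. An empty list means nothing suspicious was found.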

Having a robots.txt file in your root directory has several advantages and one main disadvantage. Under the Standard for Robots Exclusion, a compliant robot requests the robots.txt file before it crawls anything else on your web site, so the file is the default entry point for robots when it is present. Compliance is voluntary, but the major search engines honor the standard. This is the primary reason the file should be there. Beyond that, it can help with your search engine rankings when used correctly, and it can keep dead-end pages on your web site from being indexed.

The primary disadvantage is that anyone on the web, including nefarious individuals, can view the robots.txt file, so never use it to try to hide sensitive pages or directories on your web site (such as those containing passwords or private information).

For more information about the robots.txt file and a complete list of robots, visit the following web site: http://www.robotstxt.org/wc/robots.html

Author Bio:

Sandra Waggett

Sandra Waggett is the founder of MSW Interactive Designs LLC (MSW-ID) and the principal designer of its major products and websites. MSW-ID provides custom website design, hosting, ecommerce and online marketing solutions to nearly 400 small business clients nationwide. MSW-ID helps small business professionals achieve an effective Internet presence.

Prior to founding MSW Interactive Designs LLC, she spent nearly 5 years working as a Senior Engineer for BAE Systems on the Lockheed Martin Mission Systems Team in Colorado Springs, CO. While with BAE, she was the training lead for the proposal phase of the Integrated Space Command and Control (ISC2) program. In this role, she authored the 10-year training plan for the proposal and developed web-based training prototypes for presentation to the Government decision makers. Sandy earned her Master of Arts degree from the University of Colorado, Colorado Springs, in Curriculum and Instruction, Corporate Track. Her specialties include web design, interface design, instructional design, and computer-based training development.

Sandy grew up in Las Vegas, NV and now resides on Capitol Hill in Washington, DC.

Copyright © 2008 www.breakingarticles.com All Rights Reserved.