UNDERSTANDING SEARCH ENGINES AND DIRECTORIES
A search engine actually has three main components.
The first component is a spider, also called a crawler, robot, or worm. A spider
"crawls" the web by finding pages, reading their content, and following links. As it
goes, the spider collects information such as web pages, documents, files, and images.
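The find, read, and follow loop can be sketched in a few lines of Python. This is only an illustration: the "web" here is a made-up in-memory dictionary (TOY_WEB and its example URLs are assumptions, not real sites), where a real spider would fetch pages over the network.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real spider would download each page instead of looking it up here.
TOY_WEB = {
    "a.example/": ("welcome to the home page", ["a.example/about", "b.example/"]),
    "a.example/about": ("about this site", ["a.example/"]),
    "b.example/": ("another site about search", []),
}

def crawl(start_url, web):
    """Breadth-first crawl: find a page, read it, follow its links."""
    collected = {}              # URL -> text gathered so far
    queue = deque([start_url])  # pages discovered but not yet read
    while queue:
        url = queue.popleft()
        if url in collected or url not in web:
            continue            # skip pages already read and dead links
        text, links = web[url]
        collected[url] = text   # reading content
        queue.extend(links)     # following links
    return collected

pages = crawl("a.example/", TOY_WEB)
```

Starting from a single page, the sketch reaches all three toy pages because each one is linked from a page already visited.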
All of this is stored in the second component: the database, or index.
When you use a search engine, you are actually searching this database, not the live internet.
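One common way to picture such a database is an inverted index, which maps each word to the pages that contain it. The sketch below assumes that structure for illustration; the page data and the build_index name are invented, and real engines store far more than this.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> text.
pages = {
    "a.example/": "welcome to the home page",
    "a.example/about": "about this site",
    "b.example/": "another site about search",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

index = build_index(pages)
# A lookup consults only the stored index, never the live pages.
hits = sorted(index.get("about", set()))
```

Because the lookup touches only the stored index, a page that changed after the spider's last visit would still be matched against its old content.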
The third component is the query software, which lets users query the
index and returns results ranked by relevance. Relevance is determined
by each engine’s own formula, or algorithm.
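As a rough illustration of ranked results, the sketch below scores each page by how many times the query words appear on it. That scoring rule is an assumption chosen for simplicity; each real engine uses its own formula. The INDEX data and the query function are hypothetical.

```python
from collections import Counter

# Hypothetical index: word -> {URL: number of occurrences on that page}.
INDEX = {
    "search": {"a.example/": 1, "b.example/": 3},
    "engine": {"b.example/": 1, "c.example/": 2},
}

def query(index, terms):
    """Return URLs matching any query word, highest score first."""
    scores = Counter()
    for term in terms.lower().split():
        for url, count in index.get(term, {}).items():
            scores[url] += count  # toy relevance: total word occurrences
    # Sort by score descending, then by URL so ties have a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

results = query(INDEX, "search engine")
```

Changing the scoring line changes the order of the results, which is why two engines can rank the same pages differently.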
Meta-search engines search multiple databases simultaneously via a single user interface.
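A meta-search front end can be sketched as a function that sends one query to several engines and merges their ranked lists. The two engines below are stand-in functions returning fixed results, and the merge rule (summing 1/rank across engines) is an assumption for illustration, not how any particular meta-search site works. For brevity the engines are called one after another rather than simultaneously.

```python
def meta_search(query, engines):
    """Merge ranked result lists from several engines into one list."""
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            # A result ranked higher by an engine contributes more.
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Best combined score first; URL breaks ties.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical engines: each returns its own ranked list of URLs.
def engine_one(query):
    return ["x.example/", "y.example/"]

def engine_two(query):
    return ["y.example/", "z.example/"]

merged = meta_search("search engines", [engine_one, engine_two])
```

A page returned by both engines appears only once in the merged list and rises toward the top.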
Examples of search engines include Google, Teoma, Yahoo!, and MSN.
A directory is quite different.
A directory, or catalog, is similar to a huge online library hierarchically arranged by
subject from broad to specific. Websites must be submitted for inclusion, and are reviewed and
categorized by people (editors). Relevance within a category is determined by these editors based on
website content and the information submitted.
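The broad-to-specific arrangement can be pictured as a tree of categories with site listings at the leaves. The category names, sites, and helper functions in this sketch are invented for illustration and do not reflect any real directory's layout.

```python
# Hypothetical directory: nested categories, with lists of sites at the leaves.
DIRECTORY = {
    "Computers": {
        "Internet": {
            "Searching": ["search-tips.example"],
        },
    },
    "Recreation": {
        "Travel": ["trips.example"],
    },
}

def sites_in(directory, path):
    """Walk from broad to specific categories and return what is filed there."""
    node = directory
    for category in path.split("/"):
        node = node[category]  # raises KeyError for an unknown category
    return node

def submit(directory, path, site):
    """Stand-in for an editor filing an accepted site under a leaf category."""
    sites_in(directory, path).append(site)

submit(DIRECTORY, "Computers/Internet/Searching", "engine-guide.example")
listing = sites_in(DIRECTORY, "Computers/Internet/Searching")
```

Unlike the crawler-built index, nothing enters this structure unless someone files it under a category by hand.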
Examples of directories include the Open Directory Project (DMOZ.org),
About.com, and the Yahoo! directory.
Portals and search sites.
Many sites that we think of as search engines actually employ a combination approach.
They have a crawler-based search engine AND a human-powered directory.
Examples of sites providing both types of results: Yahoo! and MSN.