How Search Engines Work
A basic understanding of how search engines work is essential if you want to optimise your site and have it successfully listed, or "ranked", in search engine results.
Search engines can be broadly categorised into two groups: those that are manually maintained by human editors (sometimes called directories), and those that generate results algorithmically using automated software. Today, the second type is by far the most common and is how the most popular search engines (Google, Yahoo!, etc.) work.
These automated search engines rely on web spiders, or web crawlers, which surf the web and make copies of all the content they come across. They do this in much the same way as a human browsing a website does, by reading the text and following any links. An important difference, however, is that the spider can only see textual content; it has no concept of the aesthetics of the site, or of any information stored in images. This has important SEO implications (See: What Search Engines See).
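To make the spider's point of view concrete, the following is a minimal sketch, not how any real search engine is implemented. It uses Python's standard `html.parser` module to pull out the two things described above: readable text and links to follow. The class name `SpiderView`, the helper `spider_view` and the sample page are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects what a text-only spider could read: text and link targets."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # visible text fragments, in document order
        self.links = []        # href values of <a> tags, to be visited next
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        # <img> tags are deliberately ignored: this sketch assumes the
        # spider cannot read anything stored in the image itself.

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def spider_view(html):
    """Return (text, links) as seen by the simplified spider."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


# Hypothetical sample page: a heading, a banner image and one link.
page = """
<html><body>
  <h1>Fresh Bread Daily</h1>
  <img src="banner.png">
  <p>Visit our <a href="/menu.html">menu</a> page.</p>
</body></html>
"""

text, links = spider_view(page)
print(text)
print(links)
```

In this sketch the banner image contributes nothing to the extracted text, while the link to `/menu.html` is recorded so that the spider could fetch that page next.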
All this information is then indexed in large databases that the search engine provider maintains. In addition to the content extracted from sites, other information about each site is also stored. This part of the process, although it requires significant infrastructure in terms of hardware and bandwidth, is actually quite straightforward. It is turning this information into useful search engine results that is challenging, and this is what distinguishes the quality of one search engine from another.
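One common way to organise such an index is an "inverted index", which maps each word to the pages that contain it. The sketch below assumes that structure purely for illustration; the text above does not say how any particular search engine stores its data, and the function names `build_index` and `lookup` and the example URLs are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return dict(index)


def lookup(index, word):
    """Return the set of pages containing the word (empty set if none)."""
    return index.get(word.lower(), set())


# Hypothetical crawled content: URL -> extracted text.
pages = {
    "bakery.example/home": "Fresh bread baked daily",
    "bakery.example/menu": "Bread, rolls and cakes",
    "garden.example/tips": "Grow fresh tomatoes",
}

index = build_index(pages)
print(sorted(lookup(index, "bread")))
print(sorted(lookup(index, "fresh")))
```

Because the index is built once, ahead of time, answering a query becomes a dictionary lookup rather than a scan of every stored page.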
When you enter a keyword into a search engine, a search algorithm is applied and a list of results is produced. For example, the simplest form of search algorithm could match any site that contained the keyword, perhaps ranking the results by the frequency with which the keyword appeared within the site or page. Clearly this example is far too simplistic to provide meaningful results in a web of billions of pages, and it would be open to abuse by sites adding keywords that are not relevant to their content (known as "keyword stuffing").
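The simplistic algorithm described above can be sketched in a few lines, which also shows why it is so easy to abuse. The function name `naive_search` and the sample pages are hypothetical; this illustrates the naive approach only, not what any real search engine does.

```python
import re


def naive_search(pages, keyword):
    """Rank pages by how many times the keyword appears in their text."""
    keyword = keyword.lower()
    scores = {}
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        count = words.count(keyword)
        if count:  # pages that never mention the keyword are not matched
            scores[url] = count
    # Highest count first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages: a genuine bakery, a keyword-stuffed page and an
# unrelated gardening page.
pages = {
    "bakery.example/home": "We bake bread every morning and sell bread by the loaf.",
    "spam.example/deals": "bread bread bread bread cheap watches bread",
    "garden.example/tips": "How to grow tomatoes at home.",
}

results = naive_search(pages, "bread")
print(results)
```

Here the stuffed page repeats "bread" five times and so outranks the genuine bakery, which mentions it only twice.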
Search algorithms are highly complex and are the closely guarded intellectual property of the various search engine companies. It is reported, for example, that Google uses over 100 different metrics to determine the rank of a site in the search results. These can include everything from the physical location of the IP address the site is running from to the density of certain words when compared to similar sites in the same industry. The most important factor in determining a site's rank, however, is the measure of external sites linking to it (See: Inbound Links).
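As a rough illustration of the inbound-link idea, the sketch below simply counts how many distinct external sites link to each site. This is a deliberate simplification and an assumption made for the example: real ranking algorithms are not public and weigh links in far more sophisticated ways. The function name `inbound_link_counts` and the site names are hypothetical.

```python
from collections import defaultdict


def inbound_link_counts(outbound):
    """Count the distinct external sites linking to each site.

    `outbound` maps a site to the list of sites it links to.
    Links from a site to itself are ignored because they are not external.
    """
    linkers = defaultdict(set)
    for source, targets in outbound.items():
        for target in targets:
            if target != source:
                linkers[target].add(source)
    return {site: len(sources) for site, sources in linkers.items()}


# Hypothetical link graph: site -> sites it links to.
outbound = {
    "blog.example": ["bakery.example", "garden.example"],
    "news.example": ["bakery.example"],
    "forum.example": ["bakery.example", "garden.example"],
    "bakery.example": ["bakery.example"],  # self-link, not counted
}

counts = inbound_link_counts(outbound)
ranked = sorted(counts, key=lambda site: (-counts[site], site))
print(counts)
print(ranked)
```

Under this simplified measure, the site with the most external sites pointing at it ranks first, regardless of how often it links to itself.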