From the Oct. 2007 Issue
As a point of information for those whose eyes have been closed to Internet technology over the past few years, “pushing” information is losing ground to “pulling” information. In short, companies used to deploy e-mail newsletters and traditional websites that pushed information out to as many people as possible in hopes of reaching their intended audiences. Or people would simply visit their list of favorite websites every day to see if anything had changed or been updated.
Information pushing still goes on too frequently, of course — spam and other unsolicited commercial e-mail (UCE) will probably never stop. But savvy Internet marketers and information providers have been moving toward technologies like RSS, which let users pull updated content from websites only on the subjects they want. If you need more information on RSS and newsfeed syndication, ask your local 10-year-old. Or you can go online and search for the answer, because as much as these feeds have helped us, they cannot replace the bread and butter of the Internet: resolving an immediate question or query by searching.
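For readers who want to peek behind the curtain, here is a minimal sketch in Python of what “pulling” looks like in practice: your software asks a site’s feed for its latest items rather than the site broadcasting to an e-mail list. This is an illustration only, not any vendor’s actual reader, and the feed address shown is a placeholder.

```python
# A minimal sketch of "pulling" content: fetch one RSS feed and print the
# latest headlines. The feed URL below is a placeholder; any RSS 2.0 feed works.

import xml.etree.ElementTree as ET
from urllib.request import urlopen

FEED_URL = "https://example.com/feed.xml"   # placeholder feed address

with urlopen(FEED_URL, timeout=5) as response:
    tree = ET.parse(response)

# In RSS 2.0, each story is an <item> with a <title> and a <link>.
for item in tree.getroot().iter("item"):
    title = item.findtext("title", default="(no title)")
    link = item.findtext("link", default="")
    print(f"{title} -> {link}")
```

Run against a real feed, this prints the current headlines and their links each time it is pulled, which is exactly the behavior newsreaders automate on a schedule.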
Soon after the first commercial websites started appearing in 1994, two smart guys from Stanford realized that a directory of sorts would be beneficial, so they started Yahoo!, which went public in 1996, two-and-a-half years before Google would be incorporated. Many other search engines and portals soon appeared on the Internet, including MSN, HotBot, Excite and AltaVista. AOL, of course, tried launching its own kind of proprietary Internet, which eventually failed, and then tried to rebrand as a search portal, which is failing, too. Currently, Google has about 64 percent of the search market, with Yahoo! at 22 percent, MSN’s Live Search at 7 percent and Ask.com at about 3 percent. (Information for July 2007 from Hitwise, an Experian company.)
How Search Engines Work
Just as with any product, each search engine brand has its own strong, loyal following. And even though there are some differences in how search engines compile and sort through data (their algorithms), they all provide close to the same general search functions and generally return close to the same results. But the results are not exactly the same. One reason is that, even though the engines work in much the same way, each one gathers and updates its data on its own schedule.
A user goes to a search engine because he or she expects it to know where everything on the Internet is so that it can point the searcher in the right direction. But how do the search engines find the information in the first place? Well, all major search portals rely on two primary methods for finding data: humans and spiders. People and businesses can submit their websites to the major search engines, which will then scan the site’s content, list it in appropriate directories and make it available in search results.
But the most common method by which search engines get their information is the use of “spiders.” A spider is basically a little program that goes out and visits all of the websites it knows about, then follows the links on those websites, and the links on those websites, almost ad infinitum. Along the way, the spider documents the content of each website, the prominence of certain words and phrases, various information contained in the site’s background code (meta tags, etc.), the source of the website and other information. This data is compiled into an index, which is used to weigh how likely each page is to match a query when a search is performed.
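For the curious, the following is a toy spider written in Python. It is only a sketch of the idea described above, not anything a commercial engine actually runs: it follows links from a starting page, records which words appear on which pages, and builds a tiny index. The starting URL and the 10-page limit are arbitrary placeholders.

```python
# A toy "spider": starting from one page, it follows links for a limited number
# of pages, records the words on each page, and builds a simple index mapping
# each word to the pages that contain it. Real crawlers add politeness rules
# (robots.txt), scheduling, and far richer ranking signals.

from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the visible words and the outgoing links of one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(w.lower() for w in data.split() if w.isalpha())


def crawl(start_url, max_pages=10):
    """Visit up to max_pages pages reachable from start_url and index them."""
    index = defaultdict(set)            # word -> set of URLs containing it
    to_visit, seen = [start_url], set()

    while to_visit and len(seen) < max_pages:
        url = to_visit.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue                    # skip pages that cannot be fetched
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index[word].add(url)        # record that this page mentions the word
        for link in parser.links:
            to_visit.append(urljoin(url, link))   # follow links to new pages

    return index


if __name__ == "__main__":
    idx = crawl("https://example.com")              # placeholder starting page
    print(sorted(idx.get("example", set())))        # pages containing "example"
```

Even at this toy scale, the two steps the article describes are visible: the spider wanders from link to link, and the index it leaves behind is what actually answers the query.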