Literature Review

The Internet is a massive and dynamic organism. New search engines constantly appear, and old ones constantly change to keep up with the growing complexity of the Net, the growing demand for sophistication by Net surfers, and the competition. As a result, some search engines are poorly represented in the literature, if at all, while others are described with outdated information. Such is the case with WebCrawler, one of the first full-text search engines. For example, in the 1996 publication Internet Agents by Fah-Chun Cheong, the coverage of WebCrawler and its comparison to other search engines do not include its new Boolean search capabilities. In the book's appendix, where new search engines are very briefly discussed, Excite is not to be found.

There are several articles detailing comparisons of web search engines, but they usually compare Yahoo, Lycos, and InfoSeek, with some including WebCrawler and very few including Excite. Others simply used several search engines to retrieve as many unique hits as possible on a specific topic, in order to discuss overall web or Net coverage of that topic rather than to compare the efficiency of the search engines involved.

A recent comparison of seven search engines, including Excite and WebCrawler, appeared in the May 1996 issue of Internet World magazine. The article, "Search Engine Showdown" by Venditto, can be found on the WWW at http://www.iw.com/1996/05/showdown.html. IW used several different search strings to test each search engine for relevancy, timeliness, and ability to parse complex query statements. Their comments on WebCrawler are already outdated: at the time of their study (Feb 1996), WebCrawler did not support Boolean operators, nor did it show summaries.

An online "In-Depth" Analysis of various search engines by Linda Barlow can be found at http://www.monash.com/spidap3.html. Excite is given a B rating while WebCrawler is given a B+ rating. Some notes on ease of use and on good and bad points are given, but there is no indication of what search strings were used in the analysis. Because search engines may perform differently under different search string types (single term, multiple term, phrases, concepts, advanced Boolean), it is important to know what search string context to place the analysis in. Also, the descriptions of the search engines are out of date (for instance, the analysis says that Excite does not report how many total hits have been returned, which is no longer true of Excite).

An article dealing with tracking Net information from a user's PC via a program called PC-Meter Sweeps does not compare WebCrawler's results with Excite's, but it does show that, in this study, WebCrawler was the 2nd most accessed site, with Excite as the 19th most accessed site, beating out Lycos (Frook).

I have found most of the literature, then, to consist of basic discussions of how the various search engines index the web, with much less on how individuals use these engines to search that index, or on how one search engine stacks up against another in searching their respective indices. Studies that do compare search engines usually use no more than one or two search strings. A better study would use a number of search strings, to help show that any differences are due to an overall weakness in an engine's coverage or retrieval rather than to a weakness specific to one search string. Also, many studies compare searches using a single search term, or search strings representing concepts like "French poet woman." Such studies, while useful, ignore searching by multiple proper nouns or by phrase, like a search for "Marie de France."

