Methodology

The same list of six proper noun phrase search strings was used to compare Excite and WebCrawler. The Boolean operator "AND" was used, as both search engines offered it as an advanced search option. Each search was performed on both search engines on the same day. All searches occurred during November 1996. I used phrases that I was familiar with: phrases dealing with subjects I kept web pages on (and thus had a known item in the search to help test indexing and retrieval even further) or had much interest in. This made it easier to determine the relevancy of the hits returned in order to examine precision and uniqueness.

The proper noun phrases.

  1. Dancing Dragon
    (This is my favorite mail order catalog dealing with things dragon.)
  2. Marie de France
    (I did many papers and much research on Marie de France in my English studies.)
  3. Midori Ito
    (I created and maintain a Midori Ito web site.)
  4. Memorial Day
    (I created and maintain a Memorial Day web site.)
  5. Alan Parsons Project
    (My favorite musical group.)
  6. Southeastern Transportation Center
    (I created and maintain the Southeastern Transportation Center web site.)

    The Steps.

    1. A keyword search using the word phrase search string and the Boolean AND operator was performed using the Excite search engine. The first fifty returns were examined.

    2. The same keyword search using the word phrase search string and the Boolean AND operator was repeated using the WebCrawler search engine. The first fifty returns were examined.

    Hits returned from each search engine that were judged relevant were compared for overlap/uniqueness.

    This was repeated for all six search strings.

    Data Analysis

    Each returned hit was examined for relevant content. Any document whose URL was "broken" could not be examined and thus was not counted as a relevant hit. For those documents that could be examined, merely mentioning the search string was not enough to count as relevant (e.g. a document containing the statement "my favorite skater is Midori Ito", or merely linking to another site devoted to or having information on Midori Ito, was not considered relevant). Otherwise, the particulars for determining relevancy for each search string were as follows:
    1. Dancing Dragon: Relevancy here was very narrow: only the official Dancing Dragon site (a mail order company dealing with dragon-related items) was relevant.
    2. Marie de France: Any information about Marie, her works, or availability of her works.
    3. Midori Ito: Any biographical or competition history information.
    4. Memorial Day: Relevancy here was narrow: only sites dealing with Memorial Day as the only topic, and only those that treated or discussed Memorial Day as a day of remembrance for those who died in service to our country. Sites that discussed the morality of war, were merely event calendars, were just about parades and picnics, or were dedicated to just one particular war or battle memorial were not judged relevant. Sites that were a compilation/collection of various links to WWW sites relevant to Memorial Day were judged relevant.
    5. Alan Parsons Project: Any information about the Alan Parsons Project: history, biographical, reviews, or availability of various recordings.
    6. Southeastern Transportation Center: Any information about the STC.

    Precision was calculated as the number of hits judged relevant divided by the total number of hits examined, which was fifty at most (some searches returned fewer than fifty total hits). Uniqueness was judged, for each search string, as the number of relevant hits returned by one engine that were not returned by the other engine.
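    The two measures defined above can be illustrated with a minimal Python sketch. It is only an illustration of the definitions, not the procedure actually used in this study (the counts were made by hand). The function names, the MAX_EXAMINED constant, and the example URLs are all hypothetical, and the sketch assumes that two hits are "the same" when their URLs match exactly.

```python
MAX_EXAMINED = 50  # only the first fifty returns were examined


def precision(examined, relevant):
    """Hits judged relevant divided by hits examined (fifty at most)."""
    examined = examined[:MAX_EXAMINED]
    if not examined:
        return 0.0  # assumption: a search with no hits gets precision 0
    judged_relevant = [url for url in examined if url in relevant]
    return len(judged_relevant) / len(examined)


def uniqueness(relevant_here, relevant_other):
    """Number of relevant hits from one engine not returned by the other."""
    return len(set(relevant_here) - set(relevant_other))


# Made-up results for one search string (not data from the study).
engine_a_examined = ["a.example/1", "a.example/2", "shared.example/x", "a.example/3"]
engine_a_relevant = {"a.example/1", "shared.example/x"}
engine_b_examined = ["shared.example/x", "b.example/1"]
engine_b_relevant = {"shared.example/x", "b.example/1"}

precision_a = precision(engine_a_examined, engine_a_relevant)
precision_b = precision(engine_b_examined, engine_b_relevant)
unique_a = uniqueness(engine_a_relevant, engine_b_relevant)
unique_b = uniqueness(engine_b_relevant, engine_a_relevant)

print(precision_a, precision_b, unique_a, unique_b)
```

    In this made-up case, the first engine has two relevant hits out of four examined and the second has two out of two; each engine has one relevant hit that the other did not return.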

      Relevancy ranking of documents found by both search engines was compared as well.

