Computer Science Technical Reports
CS at VT

FAST-INV: A Fast Algorithm for building large inverted files

Fox, Edward A. and Lee, Whay C. (1991) FAST-INV: A Fast Algorithm for building large inverted files. Technical Report TR-91-10, Computer Science, Virginia Polytechnic Institute and State University.

Full text available as:
TR-91-10.pdf (1333054)

Abstract

Inverted files are widely used in building bibliographic and other types of retrieval systems. To investigate the utility of advanced information retrieval methods for improving access to large online library catalogs, it was necessary to extend the SMART system in a variety of ways. One particular problem was to develop a fast method to produce an inverted file from hundreds of thousands of (partial) MARC records. The FAST-INV software was developed in 1986, taking advantage of the large primary memories available on modern computers and the order inherent in the input data. Using the new algorithm, processing in primary memory for N basic data elements has time complexity O(N), and processing of files that will not fit in primary memory can be accomplished in a fixed number of passes. Performance studies show this approach to be at least an order of magnitude faster than commonly used techniques. It is hoped that these findings will be of interest to database providers and will help them reduce the costs of building inverted files, as we have been doing for the last five years.
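
The O(N) in-memory claim can be illustrated with a small counting-based inversion. This is only a sketch under stated assumptions, not the code from the report: it assumes the input is a sequence of (document id, term id) pairs already ordered by document id, that term ids are dense integers in the range 0 to num_terms - 1, and that everything fits in primary memory. The names invert and postings_for are hypothetical.

```python
from typing import List, Sequence, Tuple


def invert(pairs: Sequence[Tuple[int, int]],
           num_terms: int) -> Tuple[List[int], List[int]]:
    """Invert (doc_id, term_id) pairs in two linear passes, with no sorting.

    Assumption (not from the report): term ids are dense integers in
    [0, num_terms) and the pairs arrive ordered by doc_id.
    """
    # Pass 1: count how many postings each term will have.
    counts = [0] * num_terms
    for _, term in pairs:
        counts[term] += 1

    # Prefix sums: offsets[t] is where term t's postings list starts,
    # and offsets[num_terms] equals the total number of postings.
    offsets = [0] * (num_terms + 1)
    for t in range(num_terms):
        offsets[t + 1] = offsets[t] + counts[t]

    # Pass 2: drop each document id into the next free slot of its term.
    # Because the input is ordered by doc_id, every postings list comes
    # out ordered by doc_id as well.
    next_slot = offsets[:-1]  # slicing makes a copy
    postings = [0] * len(pairs)
    for doc, term in pairs:
        postings[next_slot[term]] = doc
        next_slot[term] += 1

    return offsets, postings


def postings_for(term: int, offsets: List[int], postings: List[int]) -> List[int]:
    """Return the document ids recorded for one term."""
    return postings[offsets[term]:offsets[term + 1]]


if __name__ == "__main__":
    # Five (doc_id, term_id) pairs, ordered by doc_id.
    sample = [(0, 2), (0, 0), (1, 2), (2, 1), (2, 2)]
    offsets, postings = invert(sample, num_terms=3)
    for term in range(3):
        print(term, postings_for(term, offsets, postings))
```

Each pass touches every pair once and the prefix-sum step touches every term once, so the work is linear in the number of pairs plus the number of terms. For input that does not fit in memory, one plausible extension is to run the same routine over successive ranges of term ids, which keeps the number of passes over the input fixed; whether this matches the report's exact scheme is an assumption here.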

Item Type: Departmental Technical Report
Subjects: Computer Science > Historical Collection (Till Dec 2001)
ID Code: 256
Deposited By: User autouser
Deposited On: 05 December 2001