Computer Science Technical Reports
CS at VT

Tracking Text in Mixed Mode Documents

Bixler, J. Patrick (1988) Tracking Text in Mixed Mode Documents. Technical Report TR-88-19, Computer Science, Virginia Polytechnic Institute and State University.

This paper describes a method for extracting arbitrarily oriented text from documents containing both text and graphics. The technique presented is inspired by the tracking algorithms frequently found in raster-to-vector conversion systems. By identifying text components in the document, reducing the resolution of the image by the size of the characters, and then tracking the centers of the character components, all text strings can be removed and subsequently reoriented to the horizontal. They can then be presented for automated character recognition. A by-product of the method is that characters are automatically grouped together to form words and/or phrases. We give a detailed description of the algorithm, discuss its strengths and weaknesses, and present some sample results obtained from a typical city street map.
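
The tracking idea summarized in the abstract can be illustrated with a small sketch. This is not the report's algorithm, which is given only in the full text; it is a simplified greedy nearest-neighbour chaining written under stated assumptions: the character components have already been found and are given as a list of center coordinates, "reducing the resolution" is approximated by expressing coordinates in units of an assumed character size, and the names group_centers, orientation_degrees, char_size and max_gap are hypothetical.

```python
import math


def group_centers(centers, char_size, max_gap=1.5):
    """Chain character centers into strings (illustrative sketch only).

    centers   -- list of (x, y) centers of character components (assumed given)
    char_size -- assumed nominal character size in pixels
    max_gap   -- assumed largest center-to-center step, in character sizes
    """
    # Approximate the resolution reduction: measure in character-size units.
    coarse = [(x / char_size, y / char_size) for x, y in centers]
    unvisited = set(range(len(coarse)))
    strings = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        chain = [start]
        # Track outward from the tail of the chain first, then from its head.
        for end in (-1, 0):
            while True:
                cx, cy = coarse[chain[end]]
                best, best_d = None, math.inf
                for j in sorted(unvisited):
                    d = math.hypot(coarse[j][0] - cx, coarse[j][1] - cy)
                    if d <= max_gap and d < best_d:
                        best, best_d = j, d
                if best is None:
                    break
                unvisited.remove(best)
                if end == -1:
                    chain.append(best)
                else:
                    chain.insert(0, best)
        strings.append([centers[i] for i in chain])
    return strings


def orientation_degrees(string):
    """Angle of the line from the first to the last center, in degrees."""
    (x0, y0), (x1, y1) = string[0], string[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


if __name__ == "__main__":
    # Two made-up strings: one horizontal, one running diagonally.
    pts = [(0, 0), (10, 0), (20, 0), (100, 100), (107, 107), (114, 114)]
    for s in group_centers(pts, char_size=10):
        print(s, orientation_degrees(s))
```

Each returned group is an ordered run of centers, which mirrors the grouping of characters into words or phrases mentioned above, and the estimated angle is what a later step would use to rotate the string to the horizontal. The greedy chaining does not check that successive steps keep a consistent direction, so it is only a starting point for experimentation.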

Item Type: Departmental Technical Report
Subjects: Computer Science > Historical Collection (Till Dec 2001)
ID Code: 104
Deposited By: User autouser
Deposited On: 05 December 2001