Tracking Text in Mixed Mode Documents
(1988) Tracking Text in Mixed Mode Documents. Technical Report TR-88-19, Computer Science, Virginia Polytechnic Institute and State University.
This paper describes a method for extracting arbitrarily oriented text in documents containing both text and graphics. The technique presented is inspired by the tracking algorithms frequently found in raster-to-vector conversion systems. By identifying text components in the document, reducing the resolution of the image by the size of the characters, and then tracking the centers of the character components, all text strings can be removed and subsequently reoriented to the horizontal. They can then be presented for automated character recognition. A by-product of the method is that characters are automatically grouped together to form words and/or phrases. We give a detailed description of the algorithm, discuss its strengths and weaknesses, and present some sample results obtained from a typical city street map.
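The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the report's algorithm: it assumes the character components have already been found and reduced to center points, it approximates "reducing the resolution by the size of the characters" by snapping each center to a coarse grid whose cell size equals an assumed character size, and it stands in for tracking with a simple 8-connected walk over occupied cells. The names `group_characters`, `orientation_degrees`, and `char_size` are hypothetical.

```python
import math
from collections import deque


def group_characters(centers, char_size):
    """Group character centers into strings (illustrative assumption,
    not the report's method): snap each center to a coarse grid cell,
    then walk 8-connected occupied cells to collect one string."""
    # Reduce resolution: one coarse cell per character-sized block.
    cell_of = {}
    for idx, (x, y) in enumerate(centers):
        cell = (int(x // char_size), int(y // char_size))
        cell_of.setdefault(cell, []).append(idx)

    seen = set()
    groups = []
    for start in sorted(cell_of):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            cx, cy = queue.popleft()
            members.extend(cell_of[(cx, cy)])
            # Visit the eight neighbouring cells that hold a character.
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cell_of and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        groups.append(sorted(members))
    return groups


def orientation_degrees(centers, group):
    """Crude orientation estimate: angle of the line joining the first
    and last centers of a group after sorting by (x, y). Rotating the
    string by the negative of this angle would bring it to horizontal."""
    pts = sorted(centers[i] for i in group)
    if len(pts) < 2:
        return 0.0
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


if __name__ == "__main__":
    # Two made-up strings: one horizontal, one along a diagonal.
    centers = [(5, 5), (15, 5), (25, 5),
               (105, 100), (112, 107), (119, 114)]
    for group in group_characters(centers, char_size=10):
        print(group, orientation_degrees(centers, group))
```

In this sketch, characters that fall into adjacent coarse cells end up in the same group, which mirrors the by-product noted in the abstract: characters are collected into words or phrases as a side effect of following them across the reduced image. The endpoint-based angle is only a placeholder for whatever orientation estimate a real system would use.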
|Item Type:||Departmental Technical Report|
|Subjects:||Computer Science > Historical Collection (Till Dec 2001)|
|Deposited By:||User autouser|
|Deposited On:||05 December 2001|