Segmentation of text/image documents using texture approaches
and John Håkon Husøy
Rogaland University Center
Department of Electrical and Computer Engineering
P.O. Box 2557 Ullandhaug, N-4004 Stavanger, Norway
In Proc. NOBIM-konferansen-94, Asker (Norway), June 1994, pp. 60-67
The digital computer and computer networks have made it possible
to search for and retrieve electronically stored documents in seconds,
no matter where in the world they are stored.
This is far from the reality for documents stored as paper copies.
Therefore there is considerable interest in digitizing paper documents.
When digitizing existing paper documents, it is important to separate
the text from the graphics, so that the text can be made searchable
and stored more efficiently.
In this paper we present an approach to segmentation of text and
graphics in scanned documents, based on
the assumption that the text in a document may be viewed as one
texture, while the graphics form a different texture.
Using this assumption, we segment the documents with a texture segmentation
scheme using filter banks as the feature extractors.
While most traditional text-graphics segmentation schemes assume some
a priori knowledge of the input, our approach is independent of document
layout, typeface, font size, scanning resolution, etc.
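
The idea of using filter bank outputs as texture features can be illustrated with a minimal sketch. It is not the scheme evaluated in this paper: for brevity it assumes an undecimated one-level separable Haar filter bank rather than a critically sampled one, a fixed smoothing window, and a plain two-class k-means as the classifier. The function names `subband_features` and `two_class_kmeans` and the parameter `window` are hypothetical.

```python
import numpy as np
from scipy import ndimage


def subband_features(image, window=15):
    """Local energy in the four subbands of a one-level separable Haar bank.

    Assumption: undecimated filtering, so every pixel keeps one feature vector.
    """
    image = np.asarray(image, dtype=float)
    low = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Haar lowpass
    high = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar highpass
    feats = []
    for row_filter in (low, high):
        for col_filter in (low, high):
            band = ndimage.convolve1d(image, row_filter, axis=1, mode="reflect")
            band = ndimage.convolve1d(band, col_filter, axis=0, mode="reflect")
            # Squaring followed by local averaging gives a local energy estimate.
            feats.append(ndimage.uniform_filter(band ** 2, size=window,
                                                mode="reflect"))
    return np.stack(feats, axis=-1)


def two_class_kmeans(features, iters=20):
    """Cluster per-pixel feature vectors into two classes (text vs. graphics)."""
    x = features.reshape(-1, features.shape[-1])
    # Deterministic start: the pixels with the smallest and largest total energy.
    total = x.sum(axis=1)
    centers = np.stack([x[total.argmin()], x[total.argmax()]]).astype(float)
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels.reshape(features.shape[:-1])
```

Calling `two_class_kmeans(subband_features(page))` on a grayscale page array returns a label map of the same size. Which of the two labels corresponds to text is not determined by the clustering itself and would have to be decided separately.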
Another approach to texture segmentation of documents for text-graphics
segmentation has been presented by Jain and Bhattacharjee, using the
Gabor filter as the feature extractor.
In this paper we show that equally good results may be obtained using
much more computationally efficient critically sampled perfect reconstruction