Reduction of bleedthrough in scanned manuscript documents
projectsofme Active In SP Posts: 1,124 Joined: Jun 2010 
24-11-2010, 02:54 PM
By: Eric Dubois and Anita Pathak
School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada K1N 6N5

Abstract

Many old manuscript documents were written on both sides of the paper, and the bleedthrough from one side of the document to the other increases the difficulty of reading or deciphering the information on the page. This paper presents digital image processing techniques for reducing such bleedthrough distortion. Both sides of the document are scanned, maintaining full spatial and amplitude resolution (8 bits/sample). The bleedthrough is reduced by processing both sides of the document simultaneously. First the verso side is flipped from left to right, and then the recto and flipped verso images are registered. This registration is necessary since it is impossible to perfectly align the front and back when scanning the document, and the scanner may not be perfectly uniform. We used a six-parameter affine transformation to register the two sides, determining the parameters using an optimization method. Once the two sides have been registered, areas consisting primarily of bleedthrough are identified and replaced by the background color or intensity. The method has been tested on a number of documents, including documents we generated under controlled conditions and some original manuscripts; the readability of documents with heavy bleedthrough has been greatly improved by this method.

1. Introduction

Many documents written or printed on both sides of the page suffer from bleedthrough, which can significantly impair the readability of the document. Fig. 1 shows an extract of the corresponding portions of the front (recto) and rear (verso) of a typical eighteenth-century manuscript document, where the bleedthrough clearly makes the task of reading the document more challenging and fatiguing.
There is thus great interest in removing this bleedthrough using digital image processing techniques. Since the darkness of some of the bleedthrough is comparable to the darkness of some of the desired writing, a simple thresholding operation will not be successful in removing the bleedthrough. However, by processing both sides of the document together, it is possible to identify regions of the image that are due to bleedthrough and replace them with an estimate of the background. Techniques of this type are reported in [1, 2] for reducing showthrough in scanned documents; the basic idea is presented in [1] and a restoration technique using adaptive filtering is presented in [2]. In order to adequately remove bleedthrough, the recto and left-right flipped verso images must be registered; this did not receive much attention in [1, 2]. This paper presents a method to carry out this registration along with a proposed method to reduce the bleedthrough. Section 2 presents the general formulation of the problem and describes the registration and bleedthrough removal algorithms. Section 3 gives experimental results with a test document.

2. Bleedthrough Removal Algorithm

2.1. Assumptions

In this paper, we assume that the original document consists of some type of paper on which ink has been applied to both sides, either through writing or printing. Ink may simply show through from one side to the other, or it may have actually "bled" through to the other side. Both sides of the document are digitized in order to apply the bleedthrough removal algorithm. The sampled recto and verso images are denoted $f_r(x, y)$ and $f_v(x, y)$ respectively, where the sample points $(x, y)$ lie on a two-dimensional rectangular sampling structure $L$. In this article, we assume that 8-bit grayscale versions of the image of size $p_w$ by $p_h$ are acquired, which are normalized such that $0 \le f_r(x, y) \le 1$ and $0 \le f_v(x, y) \le 1$, with gray level 0 corresponding to white and 1 corresponding to black.
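The normalization convention and the left-right flip of the verso can be sketched in a few lines of plain Python. This is only an illustration: it assumes the scanner reports white as 255 (the text does not say), it stores an image as a list of rows, and the names `to_darkness` and `flip_left_right` are hypothetical.

```python
def to_darkness(scan, white_level=255):
    """Map 8-bit samples to [0, 1] with 0 = white and 1 = black.

    Assumption (not stated in the text): the scanner reports white as 255.
    """
    return [[(white_level - s) / white_level for s in row] for row in scan]


def flip_left_right(image):
    """Mirror every row so the verso has the same orientation as the recto."""
    return [row[::-1] for row in image]


# Tiny one-row "scans": 255 is blank paper, 0 is solid ink.
recto_scan = [[255, 200, 0]]
verso_scan = [[255, 255, 51]]

f_r = to_darkness(recto_scan)
f_v_flipped = flip_left_right(to_darkness(verso_scan))
```

After this step both images use the same darkness scale and the same orientation, but they are not yet registered; the flip alone does not correct offset, rotation or skew between the two scans.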
Color information may be helpful and will be addressed in future work.

We assume that there exist ideal recto and verso images representing the writing applied to the front and the back of the paper, denoted $f_{wr}(x, y)$ and $f_{wv}(x, y)$ respectively; these are zero where there is no writing. Similarly, we assume that there is an ideal background, $f_{br}(x, y)$ and $f_{bv}(x, y)$, corresponding to the image of the paper without writing. The measured recto image combines the back [...]

The measured verso image is obtained in a similar fashion. However, since the two sides are scanned in separate operations, the two scanning rasters will not be aligned; they will differ by some offset, rotation and possible skew. The ideal measured verso image with perfect registration is given by

$$f_v^I(x, y) = C(f_{bv}(x, y), f_{wv}(x, y), R f_{wr}(x, y)) \tag{5}$$

where the coordinate systems of the recto and verso are perfectly aligned. However, the actual measured verso image is

$$f_v(x, y) = A_p f_v^I(x, y) \tag{6}$$

where $A_p$ is a linear operator that models the geometric distortion between the two scanning lattices. In our work, we have assumed that $A_p$ is an affine transformation specified by six parameters $p = (p_{11}, p_{12}, p_{13}, p_{21}, p_{22}, p_{23})$, defined by $g = A_p f$:

$$g(x, y) = f(p_{11} x + p_{12} y + p_{13},\; p_{21} x + p_{22} y + p_{23}). \tag{7}$$

2.2. Problem formulation

With these assumptions, the bleedthrough removal problem can be stated as follows: given the sampled recto and verso images $f_r$ and $f_v$, estimate the restored images

$$\hat{f}_r(x, y) = C(f_{br}(x, y), f_{wr}(x, y), 0) \tag{8}$$

$$\hat{f}_v(x, y) = C(f_{bv}(x, y), f_{wv}(x, y), 0). \tag{9}$$

The problem can be broken down into two steps:

1. Estimate $f_v^I$ using $f_v$ and $f_r$ (registration).
2. Estimate $\hat{f}_r$ and $\hat{f}_v$ from $f_r$ and $\hat{f}_v^I$ (restoration).

For more information about this topic, please follow the link: site.uottawa.ca/~edubois/documents/dubo01pics.pdf
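To make the affine model of Eq. (7) and the registration step more concrete, here is a minimal pure-Python sketch. Several things in it are assumptions rather than the authors' method: bilinear interpolation, filling out-of-range samples with 0 (white), the mean squared difference as the matching cost, and an exhaustive search over integer translations only. The excerpt says only that the six parameters are found "using an optimization method", so the search below is a stand-in, and the function names are hypothetical.

```python
import math


def apply_affine(f, p, fill=0.0):
    """Return g with g(x, y) = f(p11*x + p12*y + p13, p21*x + p22*y + p23).

    f is a list of rows indexed as f[y][x]; p = (p11, p12, p13, p21, p22, p23).
    Assumptions of this sketch: bilinear interpolation, and `fill`
    (0.0 = white) wherever the source point falls outside f.
    """
    p11, p12, p13, p21, p22, p23 = p
    height, width = len(f), len(f[0])
    g = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            u = p11 * x + p12 * y + p13  # source x coordinate
            v = p21 * x + p22 * y + p23  # source y coordinate
            if not (0 <= u <= width - 1 and 0 <= v <= height - 1):
                continue
            x0, y0 = math.floor(u), math.floor(v)
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            ax, ay = u - x0, v - y0
            top = (1 - ax) * f[y0][x0] + ax * f[y0][x1]
            bottom = (1 - ax) * f[y1][x0] + ax * f[y1][x1]
            g[y][x] = (1 - ay) * top + ay * bottom
    return g


def mean_squared_difference(a, b):
    """Average squared sample difference between two equal-size images."""
    total = sum((pa - pb) ** 2
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def register_translation(f_r, f_v, max_shift=2):
    """Stand-in for registration: best integer (dx, dy) by exhaustive search.

    Only p13 and p23 are varied; the other four parameters are held at the
    identity values. The cost function is an assumption of this sketch.
    """
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            warped = apply_affine(f_v, (1, 0, dx, 0, 1, dy))
            cost = mean_squared_difference(f_r, warped)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


# Toy example: one dark sample, displaced by one column in the second image.
f_r = [[0.0] * 5 for _ in range(5)]
f_v = [[0.0] * 5 for _ in range(5)]
f_r[2][2] = 1.0
f_v[2][3] = 1.0
print(register_translation(f_r, f_v))  # (1, 0)
```

On real scans the recto and the flipped verso carry different writing, so a plain squared difference is a crude criterion, and rotation and skew would require searching all six parameters with a numerical optimizer.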


