Adding Linguistic Constraints to Document Image Decoding: Comparing the Iterated Complete Path and Stack Algorithms

dc.citation.bibtexName: inproceedings [en_US]
dc.citation.conferenceName: Proceedings of IS&T/SPIE Electronic Imaging [en_US]
dc.contributor.author: Popat, Kris [en_US]
dc.contributor.author: Greene, Dan [en_US]
dc.contributor.author: Romberg, Justin [en_US]
dc.contributor.author: Bloomberg, Dan [en_US]
dc.date.accessioned: 2007-10-31T00:58:06Z
dc.date.available: 2007-10-31T00:58:06Z
dc.date.issued: 2001-01-20 [en]
dc.date.modified: 2002-07-10 [en_US]
dc.date.note: 2002-07-10 [en_US]
dc.date.submitted: 2001-01-20 [en_US]
dc.description: Conference paper [en_US]
dc.description.abstract: Beginning with an observed document image and a model of how the image has been degraded, Document Image Decoding recognizes printed text by attempting to find a most probable path through a hypothesized Markov source. The incorporation of linguistic constraints, which are expressed by a sequential predictive probabilistic language model, can improve recognition accuracy significantly in the case of moderately to severely corrupted documents. Two methods of incorporating linguistic constraints in the best-path search are described, analyzed and compared. The first, called the iterated complete path algorithm, involves iteratively rescoring complete paths using conditional language model probability distributions of increasing order, expanding state only as necessary with each iteration. A property of this approach is that it results in a solution that is exactly optimal with respect to the specified source, degradation, and language models; no approximation is necessary. The second approach considered is the Stack algorithm, which is often used in speech recognition and in the decoding of convolutional codes. Experimental results are presented in which text line images that have been corrupted in a known way are recognized using both the ICP and Stack algorithms. This controlled experimental setting preserves many of the essential features and challenges of real text line decoding, while highlighting the important algorithmic issues. [en_US]
dc.identifier.citation: K. Popat, D. Greene, J. Romberg and D. Bloomberg, "Adding Linguistic Constraints to Document Image Decoding: Comparing the Iterated Complete Path and Stack Algorithms," 2001.
dc.identifier.uri: https://hdl.handle.net/1911/20201
dc.language.iso: eng
dc.subject: document image decoding [*]
dc.subject: optical character recognition [*]
dc.subject: convolutional decoding [*]
dc.subject.keyword: document image decoding [en_US]
dc.subject.keyword: optical character recognition [en_US]
dc.subject.keyword: convolutional decoding [en_US]
dc.title: Adding Linguistic Constraints to Document Image Decoding: Comparing the Iterated Complete Path and Stack Algorithms [en_US]
dc.type: Conference paper
dc.type.dcmi: Text
Files
Original bundle
Name: Pop2001Jan5AddingLing.PDF
Size: 144.3 KB
Format: Adobe Portable Document Format

Name: Pop2001Jan5AddingLing.PS
Size: 285.06 KB
Format: Postscript Files