Present-day OCR is good, but it still has an error rate of 1-2%. A simple solution would be to embed an error-correcting code on each page as a small, faint image that does not distract from the main content. As long as the number of OCR errors on a page stays within what the code can correct, this would make 100% correct OCR of the text possible.
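To make the idea concrete, here is a minimal sketch of the correction step only. It is not a real page format. It assumes the text fits in Latin-1, that OCR only substitutes characters (no insertions or deletions), and that there is at most one wrong character per block. The names `P`, `BLOCK`, `checksums`, `encode` and `correct` are made up for illustration. A practical scheme would more likely use a stronger code such as Reed-Solomon, and would still have to render the check values as the faint image.

```python
P = 257     # prime just above the largest byte value (255); assumption: Latin-1 text
BLOCK = 16  # characters per block; must stay below P so positions are distinct mod P


def checksums(block: bytes) -> tuple[int, int]:
    """Return (plain sum, position-weighted sum) of a block, both mod P."""
    s0 = sum(block) % P
    s1 = sum((i + 1) * b for i, b in enumerate(block)) % P
    return s0, s1


def encode(text: str) -> list[tuple[int, int]]:
    """Compute the check values that would be embedded on the page."""
    data = text.encode("latin-1")
    return [checksums(data[k:k + BLOCK]) for k in range(0, len(data), BLOCK)]


def correct(ocr_text: str, code: list[tuple[int, int]]) -> str:
    """Fix at most one substituted character per block using the check values."""
    data = bytearray(ocr_text.encode("latin-1"))
    for n, (s0, s1) in enumerate(code):
        start = n * BLOCK
        block = bytes(data[start:start + BLOCK])
        r0, r1 = checksums(block)
        d0 = (r0 - s0) % P  # error magnitude e (mod P)
        d1 = (r1 - s1) % P  # (position + 1) * e (mod P)
        if d0 == 0:
            continue  # nothing detected, or more errors than this sketch can handle
        # Dividing d1 by d0 modulo P recovers the 1-based position of the error.
        pos = d1 * pow(d0, -1, P) % P - 1
        if 0 <= pos < len(block):
            fixed = (block[pos] - d0) % P
            if fixed < 256:
                data[start + pos] = fixed
    return data.decode("latin-1")


original = "Present day OCR is good"
code = encode(original)
damaged = original.replace("O", "0")  # simulate a typical OCR confusion
print(correct(damaged, code))
```

The two sums per block play the role of the embedded code: the first reveals how far the wrong character is from the right one, and the second reveals where it sits. The overhead here is two small numbers per 16 characters. Tolerating more errors per block would require more check values.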
One may ask: why bother, if the text is prepared electronically in the first place? Counter-argument: in many cases we do not have access to the electronic original, which is why we need OCR at all.
Update: this turned out not to be a new idea; see the comments.