Making badly scanned public domain books legible with OCR

0 votes
0 answers
175 views
I've obtained soft copies of some very old public domain books. The illustrations are clear enough, but the text is somewhat blurry. I've experimented with Tesseract OCR, and it recognizes a surprising number of the words (with some errors), but it writes them to a separate file as a jumbled mess, with none of the original layout.

**Questions:**

1. Is there a way to have Tesseract or another OCR tool recognize the text and then place it over the original blurry text without changing other elements such as lines and illustrations?
2. If this is possible, is it also possible to have Tesseract or another OCR tool mimic the varying sizes, fonts, and colors of the original text?

Thank you!
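For context on the "jumbled mess" output: besides plain text, Tesseract can also emit a TSV file (for example via `tesseract page.png out tsv`) in which each recognized word carries a pixel bounding box and a confidence value. The following is only a minimal sketch of reading that file, assuming the standard TSV column names; `WordBox`, `parse_tesseract_tsv`, and the sample rows are hypothetical names and made-up data for illustration, not part of Tesseract. It does not redraw anything on the page. It only recovers where each word sits, which is the information an overlay step would need.

```python
import csv
import io
from typing import List, NamedTuple


class WordBox(NamedTuple):
    """One recognized word and its pixel box (hypothetical helper type)."""
    text: str
    left: int
    top: int
    width: int
    height: int
    conf: float


def parse_tesseract_tsv(tsv_text: str, min_conf: float = 0.0) -> List[WordBox]:
    """Collect word-level rows (level 5) from Tesseract TSV output."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    boxes = []
    for row in reader:
        text = (row.get("text") or "").strip()
        # Levels 1-4 describe pages, blocks, paragraphs and lines; skip them.
        if row.get("level") != "5" or not text:
            continue
        conf = float(row["conf"])
        if conf < min_conf:
            continue
        boxes.append(
            WordBox(
                text,
                int(row["left"]),
                int(row["top"]),
                int(row["width"]),
                int(row["height"]),
                conf,
            )
        )
    return boxes


# Made-up sample rows in the TSV layout, for illustration only.
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t800\t1200\t-1\t",
    "5\t1\t1\t1\t1\t1\t100\t50\t120\t40\t91.5\tChapter",
    "5\t1\t1\t1\t1\t2\t230\t50\t30\t40\t42.0\tI",
])

if __name__ == "__main__":
    for box in parse_tesseract_tsv(SAMPLE_TSV, min_conf=60):
        print(box)
```

The box height gives a rough hint of the original type size, but the TSV columns carry no font or color information, so question 2 is not addressed by this sketch.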
Asked by YQ002lc2 (145 rep)
Jul 24, 2023, 06:11 AM