Debian Buster: Tesseract not supporting URL as argument
3 votes, 0 answers, 345 views
I'm trying to parse text from a hosted image, but it looks like I've misconfigured Tesseract.
I'm using Debian Buster with `tesseract-ocr`, `libtesseract-dev` and a Ruby wrapper installed.
    $ tesseract -v
    tesseract 4.0.0
     leptonica-1.76.0
      libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX2
     Found AVX
     Found SSE
In a terminal, running `tesseract` with the image URL as input and `output` as the output base returns `Error, cannot read input : No such file or directory`. The same error message is raised when using the Ruby gem.
Did I miss something after installing the packages?
The docs mention manually placing the traineddata directory on Ubuntu. Should that also be done on Debian?
> The traineddata is currently not shipped with the snap package and must be placed manually to ~/snap/tesseract/current.
I can get it working by downloading the image with `curl` and passing the local path as the argument, but Tesseract should support a URL as the argument.
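A minimal sketch of that download-then-OCR workaround, written in Ruby against the `tesseract` command line rather than any particular gem. The method names (`remote_image?`, `tesseract_argv`, `run_tesseract`, `ocr`) are hypothetical and only for illustration; the sketch assumes `tesseract` is on the `PATH` and that the image is reachable over plain HTTP(S).

```ruby
require "open-uri"
require "open3"
require "tempfile"
require "uri"

# True when the argument parses as an http(s) URL with a host,
# false for local paths or strings that are not valid URIs.
def remote_image?(source)
  uri = URI.parse(source)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Argument vector for the tesseract CLI. Using "stdout" as the output
# base makes tesseract print the recognized text to standard output.
def tesseract_argv(image_path)
  ["tesseract", image_path, "stdout"]
end

# Runs tesseract on a local file and returns the recognized text.
def run_tesseract(image_path)
  text, error, status = Open3.capture3(*tesseract_argv(image_path))
  raise "tesseract failed: #{error}" unless status.success?

  text
end

# Downloads the image to a temporary file when given a URL,
# then hands tesseract a local path in either case.
def ocr(source)
  return run_tesseract(source) unless remote_image?(source)

  extension = File.extname(URI.parse(source).path)
  Tempfile.create(["ocr-input", extension]) do |file|
    file.binmode
    URI.open(source) { |remote| IO.copy_stream(remote, file) }
    file.flush
    run_tesseract(file.path)
  end
end
```

Calling `ocr` with either a URL or a local path should then return the recognized text, since Tesseract only ever sees a local file and its own URL support is not involved.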
Thanks
**EDIT**
I've tested both v4.1.1 and v5.0.0 by following these instructions and setting up the tessdata directory. Both explicitly report that they don't support URLs:
    Tesseract Open Source OCR Engine v5.0.0-alpha-647-g4a00 with Leptonica
    Error, this tesseract has no URL support
    Error during processing.
I'm obviously missing something, because the release notes say URLs have been supported since 4.1.1.
Asked by Sumak
(273 rep)
May 6, 2020, 05:05 PM
Last activity: May 6, 2020, 06:40 PM