Convert PDF to text using OCR
Make sure imagemagick and tesseract are installed:
$ sudo apt install imagemagick tesseract-ocr
It's a two-step process:
  1. Convert the PDF to .tiff using convert from imagemagick
     (-density 300 renders the pages at 300 DPI, -depth 8 sets 8 bits per channel):
    $ convert -density 300 input.pdf -depth 8 output.tiff
  2. Convert the .tiff to text using tesseract.
     This generates out.txt (tesseract appends .txt to the output base name you give it):
    $ tesseract output.tiff out
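
The two steps above can be wrapped in a small shell function so they run as one command. This is only a minimal sketch: the function name pdf_to_text and the choice to name the intermediate file after the output base are assumptions for illustration, not part of imagemagick or tesseract.

```shell
#!/usr/bin/env bash
# Sketch: OCR a PDF into a text file using the two steps above.
# pdf_to_text is a hypothetical helper name, not something either tool provides.
pdf_to_text() {
    local pdf="$1"           # input PDF
    local base="${2:-out}"   # output base name; tesseract writes "$base.txt"
    local tiff="$base.tiff"  # intermediate image (assumed naming, kept on disk)

    if [ -z "$pdf" ]; then
        echo "usage: pdf_to_text input.pdf [output-base]" >&2
        return 2
    fi

    # Step 1: render the PDF at 300 DPI, 8 bits per channel.
    # Step 2: run OCR on the image, but only if step 1 succeeded.
    convert -density 300 "$pdf" -depth 8 "$tiff" && tesseract "$tiff" "$base"
}
```

After sourcing the file, `pdf_to_text input.pdf out` should leave out.tiff and out.txt in the current directory. The .tiff can be deleted once the text looks right.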