# Convert PDF to text using OCR

Make sure ImageMagick and Tesseract are installed:

```
$ sudo apt install imagemagick tesseract-ocr
```

It's a two-step process:

1. Convert the PDF to .tiff using `convert` from ImageMagick

   ```
   $ convert -density 300 input.pdf -depth 8 output.tiff
   ```
2. Convert the .tiff to text using `tesseract`

   This generates `out.txt` (`tesseract` appends `.txt` to the output base name `out`):

   ```
   $ tesseract output.tiff out
   ```
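
The two steps can be wrapped in a small shell function so the intermediate .tiff does not have to be managed by hand. This is only a sketch: the function name `pdf_to_text`, its argument order, and the temporary file name `page.tiff` are assumptions made for illustration, not part of either tool. It uses the same `convert` and `tesseract` invocations as above.

```shell
# Hypothetical helper: OCR a PDF into <output-base>.txt.
# Usage: pdf_to_text input.pdf [output-base]
pdf_to_text() {
    local pdf="$1"          # input PDF
    local base="${2:-out}"  # output base name; tesseract appends .txt
    local tmpdir tiff status=0

    if [ -z "$pdf" ]; then
        echo "usage: pdf_to_text input.pdf [output-base]" >&2
        return 2
    fi

    # Keep the intermediate TIFF in a temporary directory.
    tmpdir="$(mktemp -d)" || return 1
    tiff="$tmpdir/page.tiff"

    # Step 1: render the PDF to TIFF (same options as above).
    convert -density 300 "$pdf" -depth 8 "$tiff" || status=$?

    # Step 2: OCR the TIFF, but only if step 1 succeeded.
    if [ "$status" -eq 0 ]; then
        tesseract "$tiff" "$base" || status=$?
    fi

    # Clean up the intermediate file and report the first failure, if any.
    rm -rf "$tmpdir"
    return "$status"
}
```

After defining the function in your shell (or sourcing a file that contains it), `pdf_to_text input.pdf out` should leave `out.txt` in the current directory, assuming both tools are installed and succeed.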
