Install OCR on Nextcloud 11


This installation guide was tested with Nextcloud 11 on Ubuntu 16.04 x64.

Nextcloud OCR (optical character recognition) adds OCR processing for images and PDFs to your Nextcloud server. More than 100 languages are currently supported.

  • For a PDF, a copy is saved with an extra layer containing the recognized text, so that you can search in it.
  • For an image (PNG, JPG, TIFF), the result of the OCR processing is saved as a .txt file next to the image (in the same folder).
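To get a feel for the two cases, here is a rough sketch of how the same results could be produced by hand with the tools installed in this guide. It is not the app's own code. The function name ocr_command and the "_ocr" suffix for the PDF copy are made up for illustration, and the function only prints the command instead of running it.

```shell
# Print (do not run) the command that would OCR one file by hand.
# Usage: ocr_command FILE [LANG]   (LANG defaults to eng)
ocr_command() {
    file=$1
    lang=${2:-eng}
    base=${file%.*}          # file name without its extension
    case "$file" in
        *.pdf)
            # the "_ocr" suffix is a made-up name for the searchable copy
            printf 'ocrmypdf -l %s "%s" "%s_ocr.pdf"\n' "$lang" "$file" "$base"
            ;;
        *.png|*.jpg|*.tiff)
            # tesseract appends ".txt" to the output base on its own
            printf 'tesseract "%s" "%s" -l %s\n' "$file" "$base" "$lang"
            ;;
        *)
            echo "unsupported file type: $file" >&2
            return 1
            ;;
    esac
}

ocr_command scans/invoice.pdf deu
ocr_command scans/photo.png
```

Copy a printed command into the terminal to try it on a test file before letting the app process a whole folder.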

Install prerequisites

apt-get install python3-pip
pip3 install --upgrade pip
apt-get install libffi-dev
pip3 install ocrmypdf

apt-get install tesseract-ocr tesseract-ocr-deu tesseract-ocr-deu-frak
apt-get install tesseract-ocr-eng tesseract-ocr-equ tesseract-ocr-osd
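Before moving on, it may be worth checking that the tools really ended up on the PATH. The following is a minimal sketch. The helper name missing_tools is made up for this example, and the list of commands is an assumption based on the packages installed above.

```shell
# Print each of the given commands that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools python3 pip3 tesseract ocrmypdf)
if [ -n "$missing" ]; then
    # unquoted on purpose so the names end up on one line
    echo "Still missing:" $missing
else
    echo "All prerequisites found."
fi
```

If everything is found, tesseract --list-langs shows which language packs are installed.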

Now you are ready to install OCR from the admin console of Nextcloud.

Enable OCR

Run the OCRWorker

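vi /usr/local/bin/OCRWorker.sh

Put the following lines into the file:

#!/bin/sh
sudo -u www-data nohup php /var/www/nextcloud/apps/ocr/worker/OCRWorker.php > /dev/null 2>&1 &

Make the script executable and start the worker:

chmod +x /usr/local/bin/OCRWorker.sh
OCRWorker.sh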

Run OCRWorker script at startup

crontab -e

Add this line to the crontab:

@reboot /usr/local/bin/OCRWorker.sh
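If you prefer not to open an editor, the entry can also be added from a script. The sketch below is one possible way to do it. The helper name add_entry is made up for this example. It reads an existing crontab on standard input and prints it back with the @reboot line appended only if it is not there yet, so running it twice does not create a duplicate.

```shell
ENTRY='@reboot /usr/local/bin/OCRWorker.sh'

# Read a crontab on stdin and print it with ENTRY appended once if absent.
add_entry() {
    current=$(cat)
    if printf '%s\n' "$current" | grep -qxF "$ENTRY"; then
        # entry already present: pass the crontab through unchanged
        printf '%s\n' "$current"
    else
        if [ -n "$current" ]; then
            printf '%s\n' "$current"
        fi
        printf '%s\n' "$ENTRY"
    fi
}

# To apply it, run as the same user that would run "crontab -e":
#   crontab -l 2>/dev/null | add_entry | crontab -
```

The last line is left as a comment so that nothing is changed until you run it yourself. Afterwards, crontab -l should list the new entry.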

Congratulations! You’ve successfully installed OCR on your Nextcloud server.