Use automatic OCR for PDF files
If you have PDF files and need to extract the text with optical character recognition (OCR), follow these steps:
Step-by-step guide
- Log in to Nextcloud - see Access shared files remotely
- Upload your PDF files to your Nextcloud home folder. DO NOT upload them to the Shared Drive or Private Drive - OCR will not work from these locations.
  If you're not sure where your Nextcloud home is, look for a folder called Documents and upload files there.
- Click the "..." next to each file you uploaded, or right-click the file name, and choose Details from the menu
- Click "..." in the upper right corner of the Details pane and click Tags
- Click the box labeled Collaborative tags and select needs-ocr
- Wait 5-10 minutes for OCR to run
- Check your OCR folder
  If you do not already have an OCR folder in your Nextcloud home folder, one will be created automatically.
- Review the processed PDF files for accuracy
  Each processed file has the same name as the original file, with "-ocr" appended to the file name.
- Copy processed PDF files to the Shared Drive or Private Drive for archiving
- Remove the needs-ocr tag from the original file, or just delete the original file from your Nextcloud home folder
You can see all of your files tagged with needs-ocr by selecting Tags from the left-hand menu in Nextcloud and typing needs-ocr in the box labeled Select tags...
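
If you process many files at once, the naming rule from the steps above (same name plus "-ocr", placed in the OCR folder) can be sketched in a few lines of Python to work out which results to look for. This is only an illustration: the function names are hypothetical, and it assumes the suffix is inserted before the .pdf extension, which you should confirm against a file in your own OCR folder.

```python
from pathlib import PurePosixPath

OCR_FOLDER = "OCR"   # folder name taken from the steps above
OCR_SUFFIX = "-ocr"  # suffix taken from the steps above


def expected_ocr_path(original: str) -> str:
    """Return where the processed copy is expected, relative to the Nextcloud home.

    Assumption: the suffix goes before the extension,
    so scan.pdf becomes OCR/scan-ocr.pdf.
    """
    p = PurePosixPath(original)
    return str(PurePosixPath(OCR_FOLDER) / f"{p.stem}{OCR_SUFFIX}{p.suffix}")


def missing_ocr_outputs(originals, ocr_folder_listing):
    """Return the originals whose processed copy is not yet in the OCR folder.

    `ocr_folder_listing` is a plain list of file names you see in the OCR folder.
    """
    present = set(ocr_folder_listing)
    return [
        o for o in originals
        if PurePosixPath(expected_ocr_path(o)).name not in present
    ]
```

For example, `missing_ocr_outputs(["a.pdf", "b.pdf"], ["a-ocr.pdf"])` reports that `b.pdf` has not been processed yet, so it may simply need a few more minutes.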