OCR With PDFpen
Summary: Uses optical character recognition to add a text layer to a scanned PDF.
Requires: EagleFiler, PDFpen or PDFpen Pro
Install Location: ~/Library/Scripts/Applications/EagleFiler/
Last Modified: 2021-06-13
Description
This script uses PDFpen to perform optical character recognition (OCR) on a scanned PDF file, which makes the contents of the PDF searchable in EagleFiler. (Note that Smile Software sold PDFpen, and recent versions might no longer work with this script.) Initially, the PDF has only an image layer; after running the script, it has an image layer and an invisible text layer. When you run the script on PDFs selected in EagleFiler, it also removes the “NeedsOCR” tag (as applied by the Tag PDFs that Need OCR script) once OCR has been applied.
There are several ways to use this script:
- Run the script by itself to operate on the selected PDFs in EagleFiler.
- Save the script as an application and drop PDF files onto it to OCR them and then import them into EagleFiler.
- Save the script as an application and set it as the target of your scanner’s software. For example, go to the Application tab of the ScanSnap Manager’s settings, click “Add or Remove,” and choose the script application.
- Attach the script to a folder as a folder action and save files into that folder.
See also the Import From Scanner script.
Script
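-- Run directly: OCR the PDFs selected in the frontmost EagleFiler browser window.
on run
	tell application "EagleFiler"
		set _records to selected records of browser window 1
		repeat with _record in _records
			set _file to _record's file
			my ocr(_file)
			-- OCR rewrote the file on disk, so refresh the stored checksum.
			tell _record to update checksum
			my removeTag(_record, "NeedsOCR")
		end repeat
	end tell
end run

-- Droplet: files dropped onto the saved application.
on open _files
	my ocrAndImport(_files)
end open

-- Folder action: files saved into the attached folder.
on adding folder items to _folder after receiving _files
	my ocrAndImport(_files)
end adding folder items to

-- OCR each file in place, then import the whole batch into EagleFiler.
on ocrAndImport(_files)
	repeat with _file in _files
		my ocr(_file)
	end repeat
	tell application "EagleFiler"
		import files _files
	end tell
end ocrAndImport

-- Open the file in PDFpen, run OCR, wait for it to finish, then save and close.
on ocr(_file)
	tell application "PDFpen"
		open _file as alias
		tell document 1
			ocr
			-- Poll once per second until PDFpen reports that OCR is done.
			repeat while performing ocr
				delay 1
			end repeat
			-- Pause briefly before saving.
			delay 1
			close with saving
		end tell
	end tell
end ocr

-- Rebuild the record's tag list without the named tag and assign it back.
on removeTag(_record, _tagName)
	tell application "EagleFiler"
		set _tags to _record's assigned tags
		set _newTags to {}
		repeat with _tag in _tags
			if _tag's name is not _tagName then
				copy _tag to end of _newTags
			end if
		end repeat
		set _record's assigned tags to _newTags
	end tell
end removeTag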
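The removeTag handler follows a filter-and-reassign pattern: build a new list that omits one entry, then assign that list back. The sketch below illustrates the same pattern on a plain list of strings, so it can be tried in Script Editor without EagleFiler or PDFpen. The handler name namesWithout and the sample tag names are hypothetical and not part of the script above. It uses "contents of" because a repeat loop variable is a reference to the list item, and comparing that reference directly to a string does not behave like comparing the string itself. The original handler avoids this by comparing the tag's name property instead.

```applescript
-- Hypothetical helper: return a copy of _names with every occurrence of _nameToRemove omitted.
on namesWithout(_names, _nameToRemove)
	set _kept to {}
	repeat with _name in _names
		-- _name is a reference to the list item; dereference it before comparing.
		if contents of _name is not _nameToRemove then
			set end of _kept to contents of _name
		end if
	end repeat
	return _kept
end namesWithout

-- Sample data only; these tag names are made up for illustration.
namesWithout({"NeedsOCR", "receipts", "2021"}, "NeedsOCR")
```

With the sample list above, the result pane should show the list without the "NeedsOCR" entry.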