OCR With PDFpen

Summary: Uses optical character recognition to add a text layer to a scanned PDF.
Requires: EagleFiler, PDFpen or PDFpen Pro
Install Location: ~/Library/Scripts/Applications/EagleFiler/
Last Modified: 2021-06-13

Description

This script uses PDFpen to perform optical character recognition on a scanned PDF file, which makes the contents of the PDF searchable in EagleFiler. Initially, the PDF has only an image layer; after the script runs, it has an image layer and an invisible text layer. If the PDF file has the “NeedsOCR” tag because you used the “Tag PDFs that Need OCR” script, the tag is removed after OCR has been applied.
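To illustrate how the “NeedsOCR” tag can be used to pick out files that still lack a text layer, here is a minimal sketch that counts the selected records carrying that tag. It relies only on the EagleFiler terms that appear in the script on this page (selected records, assigned tags, a tag's name, a record's file). The handler name hasTag, the variable _pending, and the window index 1 are assumptions made for this example.

```applescript
-- Sketch: collect the files of the selected records tagged "NeedsOCR".
-- hasTag is a hypothetical helper, not part of the script on this page.
on hasTag(_record, _tagName)
	tell application "EagleFiler"
		repeat with _tag in _record's assigned tags
			if _tag's name is _tagName then return true
		end repeat
	end tell
	return false
end hasTag

on run
	set _pending to {}
	tell application "EagleFiler"
		-- "browser window 1" (the frontmost window) is an assumption
		repeat with _record in selected records of browser window 1
			if my hasTag(_record, "NeedsOCR") then
				set _file to _record's file
				set end of _pending to _file
			end if
		end repeat
	end tell
	-- The result is the number of selected files still waiting for OCR
	return (count of _pending)
end run
```

Run from Script Editor with some records selected in EagleFiler, the result pane should show how many of them are still tagged.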

There are several ways to use this script:

Run the script by itself to operate on the selected PDFs in EagleFiler.
Save the script as an application and drop PDF files onto it to OCR them and then import them into EagleFiler.
Save the script as an application and set it as the target of your scanner’s software. For example, go to the Application tab of the ScanSnap Manager’s settings, click “Add or Remove,” and choose the script application.
Attach the script to a folder as a folder action and save files into that folder.

See also the Import From Scanner script.


Script

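-- Run directly: OCR the PDFs selected in the frontmost EagleFiler window
on run
    tell application "EagleFiler"
        set _records to selected records of browser window 1
        repeat with _record in _records
            set _file to _record's file
            my ocr(_file)
            -- The file changed on disk, so refresh its stored checksum
            tell _record to update checksum
            my removeTag(_record, "NeedsOCR")
        end repeat
    end tell
end run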

on open _files
    my ocrAndImport(_files)
end open

on adding folder items to _folder after receiving _files
    my ocrAndImport(_files)
end adding folder items to

on ocrAndImport(_files)
    repeat with _file in _files
        my ocr(_file)
    end repeat
    tell application "EagleFiler"
        import files _files
    end tell
end ocrAndImport

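on ocr(_file)
    tell application "PDFpen"
        open _file as alias
        tell document 1
            ocr
            -- Poll until PDFpen reports that OCR has finished
            repeat while performing ocr
                delay 1 -- one-second interval (assumed)
            end repeat
            delay 1 -- short pause before saving (length assumed)
            close with saving
        end tell
    end tell
end ocr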

on removeTag(_record, _tagName)
    tell application "EagleFiler"
        set _tags to _record's assigned tags
        set _newTags to {}
        repeat with _tag in _tags
            if _tag's name is not _tagName then
                copy _tag to end of _newTags
            end if
        end repeat
        set _record's assigned tags to _newTags
    end tell
end removeTag
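
The polling loop in the ocr handler waits for as long as PDFpen reports that OCR is in progress. If that is a concern for unattended use (for example, as a folder action), one possible variation is to give up after a fixed number of seconds. The sketch below is not part of the original script: the handler name ocrWithTimeout and its _maxSeconds parameter are hypothetical, and it uses only the PDFpen commands already shown above (open, ocr, performing ocr, close with saving).

```applescript
-- Hypothetical variant of the ocr handler that stops waiting after
-- roughly _maxSeconds seconds and raises an error instead of looping forever.
on ocrWithTimeout(_file, _maxSeconds)
	tell application "PDFpen"
		open _file as alias
		tell document 1
			ocr
			set _waited to 0
			repeat while performing ocr
				if _waited ≥ _maxSeconds then
					-- The document is left open so it can be inspected
					error ("OCR did not finish within " & _maxSeconds & " seconds.")
				end if
				delay 1
				set _waited to _waited + 1
			end repeat
			delay 1
			close with saving
		end tell
	end tell
end ocrWithTimeout
```

A call such as my ocrWithTimeout(_file, 600) would then replace my ocr(_file) in the handlers above. The limit of 600 seconds is only an illustrative value.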