OCR With PDFpen
Summary: Uses optical character recognition to add a text layer to a scanned PDF.
Requires: EagleFiler, PDFpen or PDFpen Pro
Install Location: ~/Library/Scripts/Applications/EagleFiler/
Last Modified: 2021-06-13
Description
This script uses PDFpen to perform optical character recognition (OCR) on a scanned PDF file, which makes the contents of the PDF searchable in EagleFiler. Initially, the PDF has only an image layer; after the script runs, it has an image layer and an invisible text layer. If the PDF has the “NeedsOCR” tag, for example because you used the Tag PDFs that Need OCR script, the script removes the tag once OCR has been applied.
There are several ways to use this script:
- Run the script by itself to operate on the selected PDFs in EagleFiler.
- Save the script as an application and drop PDF files onto it to OCR them and then import them into EagleFiler.
- Save the script as an application and set it as the target of your scanner’s software. For example, go to the Application tab of the ScanSnap Manager’s settings, click “Add or Remove,” and choose the script application.
- Attach the script to a folder as a folder action and save files into that folder.
See also the Import From Scanner script.
Script
-- Run directly: OCR the PDFs selected in EagleFiler's front browser window.
on run
	tell application "EagleFiler"
		set _records to selected records of browser window 1
		repeat with _record in _records
			set _file to _record's file
			my ocr(_file)
			-- The file changed on disk, so refresh EagleFiler's checksum.
			tell _record to update checksum
			my removeTag(_record, "NeedsOCR")
		end repeat
	end tell
end run

-- Saved as an application: OCR the dropped files, then import them.
on open _files
	my ocrAndImport(_files)
end open

-- Attached as a folder action: OCR files added to the folder, then import them.
on adding folder items to _folder after receiving _files
	my ocrAndImport(_files)
end adding folder items to

on ocrAndImport(_files)
	repeat with _file in _files
		my ocr(_file)
	end repeat
	tell application "EagleFiler"
		import files _files
	end tell
end ocrAndImport

-- Open the file in PDFpen, run OCR, wait for it to finish, then save and close.
on ocr(_file)
	tell application "PDFpen"
		open _file as alias
		tell document 1
			ocr
			repeat while performing ocr
				delay 1
			end repeat
			delay 1
			close with saving
		end tell
	end tell
end ocr

-- Reassign the record's tags, leaving out the tag named _tagName.
on removeTag(_record, _tagName)
	tell application "EagleFiler"
		set _tags to _record's assigned tags
		set _newTags to {}
		repeat with _tag in _tags
			if _tag's name is not _tagName then
				copy _tag to end of _newTags
			end if
		end repeat
		set _record's assigned tags to _newTags
	end tell
end removeTag
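
The ocr handler above polls PDFpen's "performing ocr" property once per second with no upper bound, so a stalled OCR job would keep the script waiting indefinitely. As a minimal sketch, assuming you would rather have the script fail than wait forever, the variant below adds a timeout. The handler name ocrWithTimeout and the _maxSeconds parameter are hypothetical and not part of the original script; the sketch uses only the PDFpen commands that already appear in the script.

```applescript
-- Hypothetical variant of the ocr handler; not part of the original script.
-- _maxSeconds is an assumed parameter: roughly how many seconds to wait for OCR.
on ocrWithTimeout(_file, _maxSeconds)
	tell application "PDFpen"
		open _file as alias
		tell document 1
			ocr
			set _elapsed to 0
			repeat while performing ocr
				if _elapsed >= _maxSeconds then
					-- Give up. The document is left open and unsaved in PDFpen.
					error "OCR did not finish within " & _maxSeconds & " seconds."
				end if
				delay 1
				set _elapsed to _elapsed + 1
			end repeat
			delay 1
			close with saving
		end tell
	end tell
end ocrWithTimeout
```

To try it, a call such as my ocr(_file) could be replaced with my ocrWithTimeout(_file, 300). The elapsed count is approximate, because each check of "performing ocr" takes some time in addition to the one-second delay.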