OCR With UNPDF
Summary: Uses optical character recognition to extract the text of a scanned PDF into a Word file, then imports both files into EagleFiler.
Requires: EagleFiler, UNPDF
Install Location: ~/Library/Scripts/Applications/EagleFiler/
Last Modified: 2019-10-02
Description
This script uses UNPDF to perform optical character recognition on a scanned PDF file. It creates a Microsoft Word file with the text of the PDF and then imports both the PDF and the Word file into EagleFiler.
There are several ways to use this script:
- Save the script as an application and drop PDF files onto it to OCR them and then import them into EagleFiler.
- Save the script as an application and set it as the target of your scanner’s software. For example, go to the Application tab of the ScanSnap Manager’s settings, click “Add or Remove,” and choose the script application.
- Attach the script to a folder as a folder action and save files into that folder.
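A folder action is handed every item added to the folder, not only PDFs. If that matters for your setup, a filter along the following lines could be applied before the OCR step. This is only a sketch: `pdfPathsIn` is a hypothetical helper that is not part of the script, and the example paths are made up.

```applescript
-- Hypothetical helper (not in the original script): keep only paths ending in ".pdf".
on pdfPathsIn(_paths)
	set _result to {}
	repeat with _item in _paths
		set _text to contents of _item
		-- AppleScript text comparison ignores case by default, so ".PDF" also matches.
		if _text ends with ".pdf" then set end of _result to _text
	end repeat
	return _result
end pdfPathsIn

pdfPathsIn({"/Users/me/Scans/invoice.pdf", "/Users/me/Scans/invoice.doc", "/Users/me/Scans/RECEIPT.PDF"})
-- result: {"/Users/me/Scans/invoice.pdf", "/Users/me/Scans/RECEIPT.PDF"}
```

The helper works on POSIX path strings, so it can be tried in Script Editor without any real files.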
Script
property _format : "doc"

on open _files
	my ocrAndImport(_files)
end open

on adding folder items to _folder after receiving _files
	my ocrAndImport(_files)
end adding folder items to

on ocrAndImport(_files)
	repeat with _file in _files
		set _sourcePath to _file's POSIX path
		set _destPath to my ocr(_sourcePath, _format)
		-- Import inside the loop so that every PDF (not just the last one)
		-- is imported together with its Word file.
		set _importFiles to {contents of _file, POSIX file _destPath}
		tell application "EagleFiler"
			import files _importFiles
		end tell
	end repeat
end ocrAndImport

on ocr(_sourcePath, _format)
	set _basePath to my removeExtension(_sourcePath, "pdf")
	set _destPath to _basePath & _format
	my unpdf(_sourcePath, _destPath, _format)
	return _destPath
end ocr

on removeExtension(_path, _extension)
	if _path ends with _extension then
		set _end to (length of _extension) + 1
		set _path to characters 1 thru -_end of _path as Unicode text
	end if
	return _path
end removeExtension

on unpdf(_sourcePath, _destPath, _format)
	set _unpdf to "/Applications/deskUNPDF for Mac/Command Line Scripts/deskUNPDF"
	set _script to _unpdf's quoted form & " "
	set _script to _script & "-convert -silent -closeOnExit -autolaunch false "
	set _script to _script & "-outfile " & _destPath's quoted form & " "
	set _script to _script & "-outputType " & _format's quoted form & " "
	set _script to _script & _sourcePath's quoted form
	do shell script _script
end unpdf
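
To see what the script hands to the shell, it can help to trace one file through the path arithmetic of removeExtension and the string building of unpdf. The sketch below repeats those two steps for a made-up path (`/Users/me/Scans/invoice.pdf` is an assumption for illustration) and returns the command instead of running it, so deskUNPDF does not need to be installed.

```applescript
-- Sketch: trace one hypothetical path through the script without running deskUNPDF.
set _format to "doc"
set _sourcePath to "/Users/me/Scans/invoice.pdf" -- hypothetical example path

-- Same arithmetic as removeExtension: drop "pdf" but keep the trailing dot.
set _end to (length of "pdf") + 1
set _basePath to text 1 thru -_end of _sourcePath
set _destPath to _basePath & _format
-- _destPath is "/Users/me/Scans/invoice.doc"

-- Same concatenation as unpdf, minus the final do shell script.
set _unpdf to "/Applications/deskUNPDF for Mac/Command Line Scripts/deskUNPDF"
set _script to (quoted form of _unpdf) & " "
set _script to _script & "-convert -silent -closeOnExit -autolaunch false "
set _script to _script & "-outfile " & (quoted form of _destPath) & " "
set _script to _script & "-outputType " & (quoted form of _format) & " "
set _script to _script & (quoted form of _sourcePath)
return _script
-- The returned text is:
-- '/Applications/deskUNPDF for Mac/Command Line Scripts/deskUNPDF' -convert -silent -closeOnExit -autolaunch false -outfile '/Users/me/Scans/invoice.doc' -outputType 'doc' '/Users/me/Scans/invoice.pdf'
```

The quoted form matters here because the deskUNPDF path contains spaces. Note also that removeExtension in the script leaves a path unchanged when it does not end in "pdf", so a file such as photo.jpg would get the destination name photo.jpgdoc.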