Summary: Uses optical character recognition to extract the text of a scanned PDF into a Word file.
Requires: EagleFiler, deskUNPDF
Install Location: ~/Library/Scripts/Applications/EagleFiler/
Last Modified: 2010-11-27


This script uses deskUNPDF to perform optical character recognition on scanned PDF files. For each PDF, it creates a Microsoft Word file containing the recognized text and then imports both the PDF and the Word file into EagleFiler.

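There are several ways to use this script:

- Save it as an application (droplet) and drag PDF files onto its icon; the open handler processes them.
- Attach it to a folder as a folder action; the adding folder items to handler processes the files added to that folder.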



property _format : "doc"

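-- Droplet entry point: runs when files are dropped on the saved application.
on open _files
    my ocrAndImport(_files)
end open

-- Folder action entry point: runs when files are added to the attached folder.
on adding folder items to _folder after receiving _files
    my ocrAndImport(_files)
end adding folder items to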

on ocrAndImport(_files)
    repeat with _file in _files
        set _sourcePath to _file's POSIX path
        set _destPath to my ocr(_sourcePath, _format)
        -- Import each PDF together with its converted file.
        set _importFiles to {contents of _file, POSIX file _destPath}
        tell application "EagleFiler"
            import files _importFiles
        end tell
    end repeat
end ocrAndImport

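on ocr(_sourcePath, _format)
    -- Only "pdf" is stripped, so _basePath keeps its trailing dot
    -- and appending the format yields, for example, "name.doc".
    set _basePath to my removeExtension(_sourcePath, "pdf")
    set _destPath to _basePath & _format
    my unpdf(_sourcePath, _destPath, _format)
    return _destPath
end ocr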

on removeExtension(_path, _extension)
    if _path ends with _extension then
        set _end to (length of _extension) + 1
        -- Take a text range directly so the result does not depend on text item delimiters.
        set _path to text 1 thru -_end of _path
    end if
    return _path
end removeExtension

on unpdf(_sourcePath, _destPath, _format)
    set _unpdf to "/Applications/deskUNPDF for Mac/Command Line Scripts/deskUNPDF"
    -- quoted form protects spaces and shell metacharacters in each argument.
    set _script to _unpdf's quoted form & " "
    set _script to _script & "-convert -silent -closeOnExit -autolaunch false "
    set _script to _script & "-outfile " & _destPath's quoted form & " "
    set _script to _script & "-outputType " & _format's quoted form & " "
    set _script to _script & _sourcePath's quoted form
    do shell script _script
end unpdf
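
To check what the script would do before running it on real files, the path and command steps can be traced by hand. The following is a minimal dry-run sketch, not part of the script itself: the sample path and the buildCommand handler name are hypothetical, and the tool path and flags are simply copied from the unpdf handler. It converts and imports nothing; it only returns the command text.

```applescript
-- Dry run: trace one hypothetical file through the path and command steps.
-- Nothing is converted or imported; the script only returns the command text.

on removeExtension(_path, _extension)
	if _path ends with _extension then
		-- Negative index of the last character to keep.
		set _end to -((length of _extension) + 1)
		set _path to text 1 thru _end of _path
	end if
	return _path
end removeExtension

-- Hypothetical helper: same string building as the unpdf handler,
-- but it returns the command instead of passing it to do shell script.
on buildCommand(_sourcePath, _destPath, _format)
	set _unpdf to "/Applications/deskUNPDF for Mac/Command Line Scripts/deskUNPDF"
	set _script to _unpdf's quoted form & " "
	set _script to _script & "-convert -silent -closeOnExit -autolaunch false "
	set _script to _script & "-outfile " & _destPath's quoted form & " "
	set _script to _script & "-outputType " & _format's quoted form & " "
	set _script to _script & _sourcePath's quoted form
	return _script
end buildCommand

set _format to "doc"
set _sourcePath to "/Users/me/Scans/Tax Form.pdf" -- hypothetical sample path
-- Stripping "pdf" leaves the trailing dot, so appending "doc" gives ".../Tax Form.doc".
set _destPath to my removeExtension(_sourcePath, "pdf") & _format
return my buildCommand(_sourcePath, _destPath, _format)
```

Run in Script Editor, the result pane shows the full command line, with the tool path, the output file, the format, and the source file each wrapped in single quotes so that the spaces in the paths reach the shell as part of a single argument.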