I’m trying to save Safari web pages as PDF rather than in the .webarchive format, because PDF seems more portable and I don’t end up with the occasional dependency on being online that some random JavaScript causes. I don’t want to use the “Print to PDF” function in Safari because that PDF rendering is horrible. Rather, I’d like to use something like Paparazzi because it creates a pretty good PDF rendering of the web page.
My script thus far is:
set _filePath to (path to home folder from user domain as text) & "Dropbox:Application Data:EagleFiler:General:To Import (General)"
tell application "Safari"
set _url to URL in document 1
set _name to ((name of window 1) as string)
end tell
tell application "Paparazzi!"
capture _url
repeat while busy
end repeat
set _fileName to _filePath & ":" & _name & ".pdf"
save as PDF in _fileName
end tell
This seems to work; however, I have two questions:
Is it possible to somehow set the URL when I’m using the “To Import” folder method of importing into EagleFiler? Perhaps by setting metadata on the file (maybe an OpenMeta tag)?
If this works well, I may opt to use this means by default. Is it possible to overwrite the default capture script? I could always assign a different shortcut via FastScripts but was more curious than anything else on this point.
Go easy on my AppleScript. Its attempts to be a user-friendly language creep out the developer in me.
No; however, you could save the PDF elsewhere and then use the import files command to import the file. It will return the EagleFiler record, and then you could set its source URL property.
Yes, you can override the built-in capture script by saving your script in the folder:
Thanks, Michael. Just for reference, I tweaked the script per your suggestions and it works great. Combined with a FastScripts keyboard shortcut in Safari, it’s a great way to clip/preserve pages for future reference.
tell application "Safari"
set _url to URL in document 1
set _name to ((name of window 1) as string)
end tell
-- Make the _name "file name" safe
set _name to replace_chars(_name, "/", "--")
-- Create a temp directory
set _tempDir to do shell script "mktemp -d -t 'SYC-script-temp'"
set _pdfFile to _tempDir & "/" & _name & ".pdf"
tell application "Paparazzi!"
capture _url
repeat while busy
end repeat
save as PDF in _pdfFile
quit
end tell
tell application "EagleFiler"
set _recordList to import files _pdfFile with deleting afterwards
set (source URL of item 1 of _recordList) to _url
end tell
on replace_chars(this_text, search_string, replacement_string)
set AppleScript's text item delimiters to the search_string
set the item_list to every text item of this_text
set AppleScript's text item delimiters to the replacement_string
set this_text to the item_list as string
set AppleScript's text item delimiters to ""
return this_text
end replace_chars
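One possible refinement, offered only as a sketch: the script above replaces "/" in the page title, but a ":" in the title can also cause trouble, since it is the separator in the colon-delimited paths used by my first script. The sanitize_name handler below is hypothetical (it is not part of the script above), and this version of replace_chars restores the previous text item delimiters instead of assuming they were "".

```applescript
-- Hypothetical helper (not in the script above): replace both "/" and ":"
-- so the page title is safe to use as a file name.
on sanitize_name(_title)
	set _safe to _title
	repeat with _badChar in {"/", ":"}
		set _safe to my replace_chars(_safe, contents of _badChar, "--")
	end repeat
	return _safe
end sanitize_name

-- Same technique as the handler above, except that it puts back whatever
-- text item delimiters were in effect before the call.
on replace_chars(this_text, search_string, replacement_string)
	set _oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to search_string
	set _items to every text item of this_text
	set AppleScript's text item delimiters to replacement_string
	set this_text to _items as string
	set AppleScript's text item delimiters to _oldDelims
	return this_text
end replace_chars

sanitize_name("News: A/B Testing")
-- result: "News-- A--B Testing"
```

To try it in the script above, the line that sets _name would call sanitize_name(_name) in place of the single replace_chars call.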
My only complaint with it is that the Safari PDF “print” is horrible-looking. Paparazzi’s PDF is remarkably close to the actual web page formatting and layout.
Yes, that’s exactly it. I don’t know if you ended up providing the single-page PDF, but I have to say that I am very happy with this script. I love the size, portability and searchability of a PDF, with close to the same rendering as in Safari. I’m not sure what Paparazzi does, but the PDF renders look quite good.