Saving Safari Web Pages as PDF

Hi all,

I’m trying to save Safari web pages as PDFs rather than in the .webarchive format, because PDF seems more portable and I don’t end up with the occasional dependency on being online that some random JavaScript introduces. I don’t want to use the “Print to PDF” function in Safari because its PDF rendering is horrible. Rather, I’d like to use something like Paparazzi, because it creates a pretty good PDF rendering of the web page.

My script thus far is:

set _filePath to (path to home folder from user domain as text) & "Dropbox:Application Data:EagleFiler:General:To Import (General)"

tell application "Safari"
	set _url to URL in document 1
	set _name to ((name of window 1) as string)
end tell

tell application "Paparazzi!"
	capture _url
	-- Wait for the capture to finish, polling rather than spinning in an empty loop
	repeat while busy
		delay 0.5
	end repeat
	set _fileName to _filePath & ":" & _name & ".pdf"
	save as PDF in _fileName
end tell

This seems to work; however, I have two questions:

  1. Is it possible to somehow set the URL when I’m using the “To Import” folder method of importing into EagleFiler? Perhaps by setting metadata on the file (maybe an OpenMeta tag)?

  2. If this works well, I may opt to use this means by default. Is it possible to overwrite the default capture script? I could always assign a different shortcut via FastScripts but was more curious than anything else on this point.

Go easy on my AppleScript. Its attempts to be a user-friendly language creep out the developer in me.

No; however, you could save the PDF elsewhere and then use the import files command to import the file. It will return the EagleFiler record, and then you could set its source URL property.

Yes, you can override the built-in capture script by saving your script in the folder:

/Users/<username>/Library/Application Support/EagleFiler/Capture Scripts/

For Safari, the filename should be com.apple.Safari.scpt.
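As a rough sketch of the installation step, the following compiles a script's source text into that folder under the Safari file name. The source location (`SafariToPDF.applescript` on the Desktop) is a hypothetical placeholder, and this covers only where the file goes; whether the script itself needs changes to behave as a capture script is not addressed in this thread.

```applescript
-- Hypothetical location of your script's source text; change it to your own file.
set _sourcePath to (POSIX path of (path to desktop folder from user domain)) & "SafariToPDF.applescript"

-- Resolves to ~/Library/Application Support/EagleFiler/Capture Scripts/
set _captureDir to (POSIX path of (path to library folder from user domain)) & "Application Support/EagleFiler/Capture Scripts/"

-- Create the folder if it does not exist yet
do shell script "mkdir -p " & quoted form of _captureDir

-- Compile the source into the file name given above for Safari
do shell script "osacompile -o " & quoted form of (_captureDir & "com.apple.Safari.scpt") & space & quoted form of _sourcePath
```

Saving the script from Script Editor directly into that folder under the same file name should achieve the same result.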

Just for completeness, I should mention that EagleFiler has a built-in option to import in PDF format. It looks much like when you print from Safari.

Thanks, Michael. Just for reference, I tweaked the script per your suggestions and it works great. Combined with a FastScripts keyboard shortcut in Safari, it’s a great way to clip/preserve pages for future reference.

tell application "Safari"
	set _url to URL in document 1
	set _name to ((name of window 1) as string)
end tell

-- Make the _name "file name" safe
set _name to replace_chars(_name, "/", "--")

-- Create a temp directory
set _tempDir to do shell script "mktemp -d -t 'SYC-script-temp'"
set _pdfFile to _tempDir & "/" & _name & ".pdf"

tell application "Paparazzi!"
	capture _url
	-- Wait for the capture to finish, polling rather than spinning in an empty loop
	repeat while busy
		delay 0.5
	end repeat
	save as PDF in _pdfFile
	quit
end tell

tell application "EagleFiler"
	set _recordList to import files _pdfFile with deleting afterwards
	set (source URL of item 1 of _recordList) to _url
end tell

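-- Replace every occurrence of search_string in this_text with replacement_string
on replace_chars(this_text, search_string, replacement_string)
	-- Remember the current delimiters so they can be restored afterwards
	set saved_delimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to the search_string
	set the item_list to every text item of this_text
	set AppleScript's text item delimiters to the replacement_string
	set this_text to the item_list as string
	set AppleScript's text item delimiters to saved_delimiters
	return this_text
end replace_chars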
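Page titles can contain other awkward characters besides "/". A colon, for example, is the separator in HFS-style paths like the one built in the first script. A minimal sketch of a variant that handles several characters in one pass, assuming a list of text item delimiters is supported (Mac OS X 10.6 or later); the handler name `sanitize_name` and the "-" replacement are illustrative choices, not part of the original script:

```applescript
-- Hypothetical helper: replace "/" and ":" in a page title with "-"
on sanitize_name(_text)
	-- Save the current delimiters so the caller's setting is preserved
	set _saved to AppleScript's text item delimiters
	-- Split on either character (a list of delimiters needs 10.6 or later)
	set AppleScript's text item delimiters to {"/", ":"}
	set _parts to text items of _text
	-- Rejoin the pieces with the replacement character
	set AppleScript's text item delimiters to "-"
	set _clean to _parts as text
	set AppleScript's text item delimiters to _saved
	return _clean
end sanitize_name

sanitize_name("News: A/B test") -- returns "News- A-B test"
```

In the script above, this could stand in for the replace_chars call on _name.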

My only complaint with it is that the Safari PDF “print” is horrible-looking. Paparazzi’s PDF is remarkably close to the actual web page formatting and layout.

Are you looking for a more screen-like rendering, rather than a print-style PDF? There’s some discussion about that in this thread.

Yes, that’s exactly it. I don’t know if you ended up providing the single-page PDF, but I have to say that I am very happy with this script. I love the size, portability, and searchability of a PDF, combined with a rendering close to what Safari shows. I’m not sure what Paparazzi does, but its PDF renders look quite good.