Tools/scripts for anonymization of e-mail

I’m looking for ways to anonymize e-mail that I archive in EagleFiler. Scripts, eml editing tools or the like.

Due to GDPR and the corresponding new e-mail policies at my workplace, I’m looking into using EagleFiler to archive the e-mails I would like to keep for reference. But in most cases the personal information (sender, recipients, names or identifiers in the text of the e-mail, etc.) is not what is important to keep, and in that case I’m not allowed to keep it. Thus I would like to be able to scrub this information from the e-mails before or right after archiving.

I realize I could store only the contents, by selecting the text and importing that instead of the complete mail. But then I lose the subject and also the date, both of which are often important for context. So my hope is that someone has some workflow ideas or pointers to useful tools that would allow me to store anonymized copies of e-mail in EagleFiler.

Unfortunately this isn’t an easy task. EagleFiler itself lets me change the metadata of archived .eml files, so I can use that to clear out the “From” column and remove names or other personal details from the title (and file name). But the content of the message is not editable inside EagleFiler, nor is the sender/subject actually changed in the .eml file. I can open the file in an external editor, but that works well only when the mail is in plain text. For HTML multipart e-mail it is harder, and for base64-encoded content it is even worse.

I started sketching a workflow where I use Keyboard Maestro to tell Mail to forward the message so a new draft is created, then convert the draft to plain text and pass it through a script that makes some crude attempts at automatically anonymizing the complete mail based on the sender and recipients. This draft could then be edited further and finally imported to EagleFiler, but it would get the date of when the draft was created, not when the original e-mail was received …

Any ideas on a better workflow?

I think the easiest way to do this would be to take advantage of the fact that EagleFiler converts mail to the standard mbox format when importing. Thus, you could use pre-existing tools for processing mail: import into EagleFiler to get a mailbox file, process the mailbox to create a new mailbox, import the new mailbox into EagleFiler, and delete the old one.
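As a rough illustration of the "process the mailbox to create a new mailbox" step, here is a minimal Python sketch using the standard library's mailbox module. The function name anonymize_mbox, the header list, and the placeholder address are assumptions for illustration, not anything EagleFiler provides. It only touches headers, keeps Subject and Date, and leaves the message body (and headers such as Message-ID) alone.

```python
import mailbox

# Assumption: these are the headers considered personal; adjust to your policy.
PERSONAL_HEADERS = ("From", "To", "Cc", "Bcc", "Reply-To", "Sender",
                    "Return-Path", "Delivered-To", "Received")
PLACEHOLDER = "anonymous@example.invalid"  # hypothetical placeholder address


def anonymize_mbox(src_path, dst_path):
    """Copy src_path to dst_path, stripping personal headers from each message.

    Subject, Date, and the body are left untouched. Returns the message count.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    count = 0
    try:
        for msg in src:
            for name in PERSONAL_HEADERS:
                del msg[name]  # removes every occurrence; no error if absent
            msg["From"] = PLACEHOLDER
            # The mbox "From " separator line also carries the envelope sender;
            # keep its timestamp but replace the address.
            parts = msg.get_from().split(" ", 1)
            timestamp = parts[1] if len(parts) > 1 else ""
            msg.set_from(f"{PLACEHOLDER} {timestamp}".strip())
            dst.add(msg)
            count += 1
    finally:
        dst.close()  # close() flushes pending changes to disk
        src.close()
    return count
```

Something like anonymize_mbox("Inbox.mbox", "Inbox-anon.mbox") would then produce the new mailbox to import; the file names here are just examples.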

One tool you could use is called formail. It’s built into EagleFiler at:

EagleFiler.app/Contents/Frameworks/WashFramework.framework/Versions/A/formail

formail can remove/replace the values of certain headers in each message or even let you process the entire message with a script.
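If you go the "process the entire message with a script" route, the script could look something like the following Python sketch. It parses one raw message, collects the names and addresses from the address headers, and replaces them wherever they occur in the Subject and in the text parts. Because it uses Python's email package, multipart and base64-encoded text parts are decoded before scrubbing. The function names, the placeholder strings, and the header list are assumptions for illustration, and the matching is crude: it only finds identifiers that appear literally in the headers.

```python
import re
from email import message_from_bytes, policy
from email.utils import getaddresses

ADDRESS_HEADERS = ("From", "To", "Cc", "Bcc", "Reply-To", "Sender")
ANON_ADDRESS = "anonymous@example.invalid"  # hypothetical placeholder
REDACTED = "[redacted]"                     # hypothetical placeholder


def collect_identifiers(msg):
    """Return display names and addresses from the address headers, longest first."""
    found = set()
    for header in ADDRESS_HEADERS:
        values = [str(v) for v in msg.get_all(header, [])]
        for name, addr in getaddresses(values):
            if name:
                found.add(name)
            if addr:
                found.add(addr)
    return sorted(found, key=len, reverse=True)


def scrub_message(raw_bytes):
    """Return an anonymized copy of one raw RFC 5322 message, as bytes."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    identifiers = collect_identifiers(msg)
    pattern = None
    if identifiers:
        pattern = re.compile("|".join(re.escape(s) for s in identifiers),
                             re.IGNORECASE)

    # Scrub the decoded text of every non-attachment text part.
    for part in msg.walk():
        if part.get_content_maintype() != "text" or part.is_attachment():
            continue
        text = part.get_content()  # decodes base64 / quoted-printable
        if pattern:
            text = pattern.sub(REDACTED, text)
        part.set_content(text, subtype=part.get_content_subtype(),
                         charset="utf-8")

    # Scrub the Subject in place; the Date header is never touched.
    if pattern and msg["Subject"] is not None:
        msg.replace_header("Subject", pattern.sub(REDACTED, str(msg["Subject"])))

    # Replace the address headers with a single placeholder From.
    for header in ADDRESS_HEADERS:
        del msg[header]
    msg["From"] = ANON_ADDRESS

    # If an mbox "From " separator line is present, replace its sender too.
    unixfrom = msg.get_unixfrom()
    if unixfrom is not None:
        parts = unixfrom.split(" ", 2)
        if len(parts) == 3:
            msg.set_unixfrom(f"From {ANON_ADDRESS} {parts[2]}")
    return msg.as_bytes(unixfrom=unixfrom is not None)

# To use this as a stdin/stdout filter, a wrapper would do roughly:
#   sys.stdout.buffer.write(scrub_message(sys.stdin.buffer.read()))
```

How you hook such a filter into formail depends on its options (I believe -s splits a mailbox and pipes each message to a program, but check the man page for the bundled version). Names mentioned only in the body, and attachments, would still need a manual pass.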

Thanks, Michael, for the pointer to formail. It definitely looks like something I could use.

After writing here yesterday I realized I could adjust the creation date in EagleFiler through AppleScript, so after some adjustments this morning I now have a somewhat working Keyboard Maestro workflow for storing a new draft copy instead of the original. Of course, since it has quite a few steps across two different applications it is a bit brittle and not especially fast, and it operates on a single selected message at a time. It would probably be more efficient to import more messages at once to an mbox and work in batches with something like formail, but I’ll try this out for now.

Not being very proficient in AppleScript, one thing I had some trouble with was getting the most recently captured e-mail. It seems I would have needed to write or import some sort of sorting routine to do this. After some unsuccessful experimentation I gave up and settled for setting a temporary tag during capture and always removing it afterwards. As long as I don’t accidentally let any record linger with this tag set, I can always assume that the first record with the tag is the newly imported one. Is there a better way to do this?

Yes, you can adjust the creation date through AppleScript, although that only changes the date in EagleFiler, not in the actual mail file.

The temporary tag approach should work. The direct way to do this would be to use the “import” AppleScript command because it will return the new record to the script:

tell application "EagleFiler"
    tell library document 1
        set {_record} to import URLs {"https://c-command.com"}
        -- do something with _record
    end tell
end tell

You could combine this pattern with the Import From Apple Mail script.