Why am I getting the same 'Trained as Good' emails in my Spam Folder all the time?

kmoreau893 · March 14, 2012, 2:19pm

I recently did a complete reset of the corpus, at Michael’s recommendation, because SpamSieve was not catching many spam messages. This helped a lot, and now very little spam gets through. I also set SpamSieve one notch toward ‘Aggressive’ from the middle.

However, emails from some senders are now landing in my Spam folder. OK, I accept this and train them as Good; they then move out of Spam and into my inbox. This is good.

The bad thing is that mail from the same senders keeps going to my Spam folder over and over, as if “Train as Good” is not working. Two senders that I have trained, probably 50 times each since the reset, are Amazon.com and Restaurant.com. A lot of their messages still wind up in Spam.

Please tell me how to troubleshoot this, thanks!

Michael_Tsai · March 14, 2012, 3:43pm

First you should check SpamSieve’s log. The log will show whether SpamSieve predicted the messages to be spam (and, if so, why). If it didn’t, then that would tell you that the problem is elsewhere in your mail program (or perhaps server) configuration.

kmoreau893 · March 14, 2012, 4:03pm

OK, checked the most recent log:

Predicted: Spam (96)
Subject: Your Order with Amazon.com
From: auto-confirm@amazon.com
Identifier: sU5MC54HMVW0oMtEXfnsqQ==
Reason: P(spam)=1.000[1.000], bias=0.498, scalp(1.000), scalp(1.000), inhibitor(1.000), inhibitor(1.000), stain(1.000), stain(1.000), remover(1.000), remover(1.000), palmetto(0.999), palmetto(0.999), emu(0.999), emu(0.999), exfoliating(0.999), exfoliating(0.999), ketoconazole(0.999)
Date: 2012-03-14 02:08:45 +0000

OK, that makes sense, since there is a lot of spam about hair products. However, I have trained this message as good, and here is the record of the training:

=====================================================================
Mistake: False Positive
Subject: Your Order with Amazon.com
Identifier: sU5MC54HMVW0oMtEXfnsqQ==
Classifier: Bayesian
Score: 96
Date: 2012-03-14 21:20:08 +0000

Here is another one:

=====================================================================
Predicted: Spam (96)
Subject: $4 - The shot clock’s ticking…
From: restaurant_com@emailrestaurant.com
Identifier: w4RFA4+WAGQreGJGtRq4MQ==
Reason: P(spam)=1.000[1.000], bias=0.498, U:382f(0.999), U:92ef(0.999), U:52d(0.999), U:52d(0.999), U:216f(0.999), U:75cf(0.999), U:857b(0.999), U:008(0.999), U:655e(0.999), U:254b(0.999), U:254b(0.999), U:694(0.999), U:694(0.999), U:900a(0.999), U:8ada(0.999)
Date: 2012-03-13 10:26:15 +0000

and my train as good record:

=====================================================================
Trained: Good (Manual)
Subject: $4 - The shot clock’s ticking…
Identifier: w4RFA4+WAGQreGJGtRq4MQ==
Actions: added rule <From (address) Is Equal to "reply-fe5c1579756703797311-2058194_HTML-811564236-77657-1449@emailrestaurant.com"> to SpamSieve whitelist, added rule <List-Unsubscribe Is Equal to “<mailto:leave-fc7e1670706c0574717b28313958-fe2b1c7174610279771776-fe5c1579756703797311-feef1377736103-ff2d1574716d@leave.emailrestaurant.com>”> to SpamSieve whitelist, added to Good corpus (1402)
Date: 2012-03-13 18:46:48 +0000

However…

I have trained messages from these same senders as Good over and over again. I could go to older records to prove it, but I don’t really want to take the time right now. Now that you have more information, can you tell me what is going wrong?

Thanks much!

Michael_Tsai · March 14, 2012, 5:39pm

Normally, when you train a message as good, SpamSieve adds the sender address to the whitelist, so you should never get a false positive from that address again.

In the case of Amazon, it may be that you had received spam from auto-confirm@amazon.com in the past, and so when you trained those (earlier) messages as spam SpamSieve disabled the whitelist rule. Training the new messages as good does not re-enable the disabled rule because it’s already been proven that you get spam from that address. If you want to override the whitelist and tell SpamSieve to treat all the messages as good, you could add the address to your address book.
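To make the sequence concrete, here is a toy Python model of the behavior described above. It is only a sketch of the rules as stated in this post, not SpamSieve's actual code; the class name `ToyWhitelist`, its methods, and the `is_treated_as_good` helper are all hypothetical.

```python
class ToyWhitelist:
    """Toy model: address -> whether its whitelist rule is enabled."""

    def __init__(self):
        self.rules = {}

    def train_good(self, address):
        # Adds a rule for a new address. An existing rule is left as-is,
        # so a rule that was disabled stays disabled.
        self.rules.setdefault(address, True)

    def train_spam(self, address):
        # Spam from a whitelisted address disables (unchecks) its rule.
        if address in self.rules:
            self.rules[address] = False

    def allows(self, address):
        return self.rules.get(address, False)


def is_treated_as_good(address, whitelist, address_book):
    # Assumption from the post: an address book entry overrides the whitelist.
    return address in address_book or whitelist.allows(address)


wl = ToyWhitelist()
sender = "auto-confirm@amazon.com"
wl.train_good(sender)   # rule added and enabled
wl.train_spam(sender)   # rule disabled after spam from this address
wl.train_good(sender)   # training as good again does not re-enable it
print(wl.allows(sender))                              # False
print(is_treated_as_good(sender, wl, {sender}))       # True
```

In this model, the only way back to "always good" for that sender is the address book entry (or re-checking the rule by hand).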

The emailrestaurant.com messages seem to come from a unique address every time. I guess they use this for tracking purposes when you reply. You could create your own whitelist rule for when the “From (address)” ends with “emailrestaurant.com”.
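The difference between the automatically added "Is Equal to" rule and an "ends with" rule can be shown with a short Python sketch. This only illustrates the matching logic, not how SpamSieve implements its rules; the function names and the second sender address are made up for the example.

```python
def matches_equal(sender, rule_value):
    """Exact-match rule, like the one added by training as good."""
    return sender.lower() == rule_value.lower()


def matches_ends_with(sender, suffix):
    """Suffix rule, like a hand-made 'From (address) ends with' rule."""
    return sender.lower().endswith(suffix.lower())


# Address taken from the training record quoted earlier in the thread.
trained = ("reply-fe5c1579756703797311-2058194_HTML-811564236-77657-1449"
           "@emailrestaurant.com")
# Hypothetical address for the next mailing (a different tracking string).
next_msg = ("reply-0000000000000000000-0000000_HTML-000000000-00000-0000"
            "@emailrestaurant.com")

print(matches_equal(next_msg, trained))                    # False
print(matches_ends_with(next_msg, "emailrestaurant.com"))  # True
```

Note that a bare suffix also matches any other domain that happens to end in the same text; using "@emailrestaurant.com" is stricter, at the cost of no longer matching subdomains.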

atknapp · July 15, 2012, 1:09am

Having read, checked and done all the above, what do you do when SS persistently junks a message that a) has a whitelist record already, and b) has the sender already in my address book?

I’m having the same problem as the OP, and neither training as good nor adding the contact to the address book seems to be doing any good.

Alexander.

Michael_Tsai · July 15, 2012, 8:21am

You’re either running into a bug that no one else has seen in the last 10 years, or something isn’t set up properly on your Mac. My guesses would be either:

1. SpamSieve is not actually classifying these messages as spam. Perhaps your mail server’s junk mail filter or another rule is doing so. If this is the case, there will be no “Predicted: Spam” entries for those messages in the log.
2. The whitelist rule is failing because it’s unchecked (because you’ve received spam from that address). The Address Book entry is failing because Use Mac OS X Address Book is unchecked or the e-mail address that you entered has a typo or is not entered in the e-mail field. (It’s common to accidentally enter the address in the phone number field.) If this is the case, the log will show why SpamSieve thought these messages were spam.

So, either way, the next step is to check the log.
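If the log is long, a small script can help with that check. The following Python sketch assumes log text copied from SpamSieve's log and pasted into a string, in the same shape as the entries quoted earlier in this thread: "Field: value" lines, with entries separated by lines of "=" characters or blank lines. The function names are hypothetical, and the sample is abridged from the entries above.

```python
import re

SEPARATOR = re.compile(r"^=+\s*$")


def parse_entries(log_text):
    """Split pasted log text into entries (dicts of field name -> value)."""
    entries, current = [], {}
    for line in log_text.splitlines():
        # A separator line or blank line closes the current entry.
        if SEPARATOR.match(line) or not line.strip():
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(": ")
        if sep:
            current[key] = value
    if current:
        entries.append(current)
    return entries


def spam_predictions(entries, sender_fragment):
    """Entries that were predicted spam and whose From contains the fragment."""
    return [e for e in entries
            if e.get("Predicted", "").startswith("Spam")
            and sender_fragment.lower() in e.get("From", "").lower()]


SAMPLE = """\
Predicted: Spam (96)
Subject: Your Order with Amazon.com
From: auto-confirm@amazon.com
Identifier: sU5MC54HMVW0oMtEXfnsqQ==
Date: 2012-03-14 02:08:45 +0000
=====================================================================
Mistake: False Positive
Subject: Your Order with Amazon.com
Identifier: sU5MC54HMVW0oMtEXfnsqQ==
Classifier: Bayesian
Score: 96
Date: 2012-03-14 21:20:08 +0000
"""

entries = parse_entries(SAMPLE)
for entry in spam_predictions(entries, "amazon.com"):
    print(entry["Date"], entry["Subject"])
```

An empty result for a sender whose mail keeps landing in the Spam folder points to the first guess (something other than SpamSieve is moving the messages); a non-empty result points to the second, and the Reason field of those entries is the place to look.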
