The subject will be something like “Registration Details” or “Membership Support”.
The “From” will be exactly what it says in the message, such as “Pet World” or “Entertaining Pros” or “Cat Lovers” or “Joke-A-Day”.
Obviously the spammer’s intention is to get the unwise recipient to visit the “link” as given, whereupon nasty things can be surreptitiously snuck into their computer.
Nothing new in this. But what I can’t understand is why SpamSieve isn’t recognising them.
I keep Control-Command-S’ing them, to train SpamSieve, but I’m still getting 20 or more a day.
I’m just curious as to how these are evading SpamSieve, which traps an average of 500 other spams for me each day.
I’ve looked at the log, and it seems that each of these spam messages has been added to my whitelist, so future arrivals from the same place or with the same subjects are getting through!
Trained: Good (Auto)
Subject: Registration Details
Identifier: /49HFVk2IPwtnhZueEtmQA==
Actions: added rule <From (address) Is Equal to "m.coll3@btfinancialgroup.com"> to SpamSieve whitelist, added rule <From (name) Is Equal to “Pet World”> to SpamSieve whitelist, added to Good corpus (1128)
Date: 2007-08-21 19:27:23 +0100
Now, the question is: how are they getting added to my whitelist, when all I’ve done since they started arriving is mark each one as “spam” with Control-Command-S?
It’s normal for addresses to get automatically added to the whitelist when SpamSieve thinks the message is good. However, this is not a problem because when you train the message as spam the corresponding whitelist rules will be disabled.
The log excerpt that you’ve quoted does not support this conclusion.
Basically this means that words like “please,” “welcome,” “link,” “password,” “change,” and “greetings” have historically appeared in your good mail but not in the spam. Therefore, their presence makes SpamSieve think that this message is good. Training it with more messages like this will eventually teach it otherwise, or you could speed things up by either (a) going to the corpus window and deleting these words, or (b) resetting SpamSieve’s corpus and re-training it with some recent messages.
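To illustrate the reasoning above, here is a minimal Python sketch of how a word's history can pull a message toward “good”. This is not SpamSieve's actual algorithm; the function name, the simple ratio formula (in the general style of Bayesian filters), and all the counts are hypothetical and chosen only for illustration.

```python
def word_spam_probability(spam_count, good_count, n_spam, n_good):
    """Estimate how spammy a word is from how often it appeared in each corpus.

    Hypothetical formula for illustration only; a real filter will differ.
    """
    spam_rate = spam_count / n_spam  # fraction of spam messages containing the word
    good_rate = good_count / n_good  # fraction of good messages containing the word
    if spam_rate + good_rate == 0:
        return 0.5  # never seen before: no evidence either way
    return spam_rate / (spam_rate + good_rate)


# Made-up counts: "password" seen in 300 of 1000 good messages
# but in only 5 of 1000 spam messages.
p_password = word_spam_probability(spam_count=5, good_count=300,
                                   n_spam=1000, n_good=1000)
print(f"password: {p_password:.3f}")  # close to 0, so the word looks "good"
```

With counts like these, a message made up mostly of such words would score as good until enough spam containing them has been trained.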
If you click on “send a report” from my previous post, it will take you to the Web page that tells how and where to send a report.
Aha! So Bruce is seeing this, too. Bruce, I take it you mean that 10 or so a day are getting into your Inbox after having been accepted as good by SpamSieve?
Are any other forum members seeing a similar pattern?
So far no one has followed the instructions to send me a report about this. I’m much more interested in stopping these spams than in knowing how many people who read the forums are seeing them.
I was in the process of doing this but wanted to wait to see if one of these messages came in this morning. One finally did and the log and false negatives should be at the requested address by the time you read this.
I didn’t mean to criticize his program – it’s worked wonderfully for me since I started using it.
I was just surprised that this spam (say 10 or more messages a day) was being passed through by SpamSieve, and kept coming even after I had trained SpamSieve on dozens of them.
Excuse my ignorance of programming etc., but isn’t the problem that if I tell SpamSieve that emails containing words such as “please,” “welcome,” “link,” “password,” “change,” and “greetings” are spam, I’m also at risk of losing genuine emails when I sign up for something?
Looking forward to Michael’s response, and observations from anyone else in the Forum who’s seeing similar types of spam getting through.
Thanks again, and all the best in our joint fight against those pesky spammers!
No problem. Criticism is fine—I just like it to be something that I can act on.
I’ve now looked at several log files that people sent, and it seems to be the same pattern. There’s nothing unusual about the structure of the message. SpamSieve is processing it normally. But it happens that it contains a bunch of words that (for these users) have historically appeared mostly in good messages. The other factor is that in all these cases SpamSieve’s corpus was a bit larger than normal, so it takes longer for it to respond to training. If spams like this are bothering you, the quickest way to better accuracy is probably to reset the corpus and then re-train SpamSieve with a smaller number of recent messages.
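The effect of corpus size on how quickly training takes hold can be sketched roughly as follows. This assumes a simple per-word frequency comparison, which is only a stand-in for whatever SpamSieve really does; the function name and every number are hypothetical.

```python
def spam_trainings_needed(good_count, n_good, n_spam):
    """Count how many new spam messages containing a word must be trained
    before the word's spam rate exceeds its good rate.

    Assumes the word has so far appeared only in good mail and that
    good_count < n_good. Illustrative model only.
    """
    good_rate = good_count / n_good
    spam_count = 0
    added = 0
    while spam_count / n_spam <= good_rate:
        # Each training adds one spam message that contains the word.
        spam_count += 1
        n_spam += 1
        added += 1
    return added


# Same proportions (the word is in 30% of good mail), different corpus sizes.
small = spam_trainings_needed(good_count=30, n_good=100, n_spam=100)
large = spam_trainings_needed(good_count=300, n_good=1000, n_spam=1000)
print(small, large)
```

In this toy model the larger corpus needs roughly ten times as many trainings before the word tips over, which is the sense in which a big corpus is slower to respond.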
At the simplest level, yes, if you tell it that messages containing “please” are spammy then, all things being equal, future messages with that word will be predicted to be more spammy.
But the idea is that, if you train SpamSieve with both good and spam messages that contain these words, it will learn to not treat them as a strong indicator in either direction. It will use other words in the messages to make its determination. So I don’t think this is cause for concern.
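A small sketch of why a word seen equally in good and spam mail stops mattering. It assumes per-word probabilities are combined with the textbook naive Bayes product rule, which may not be how SpamSieve combines them; the function name and the probabilities are made up.

```python
from math import prod


def combine(probabilities):
    """Combine per-word spam probabilities into one message score
    using the naive Bayes product rule (an assumption for illustration)."""
    spam = prod(probabilities)
    good = prod(1 - p for p in probabilities)
    return spam / (spam + good)


# "please" has been trained in both good and spam mail, so it sits at 0.5.
with_please = combine([0.5, 0.9])   # "please" plus one spammy word
without_please = combine([0.9])     # the spammy word on its own
print(with_please, without_please)
```

Under this rule a word at 0.5 contributes the same factor to both sides, so the other words in the message decide the outcome.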
Does this mean there’s not a “halfway house” to ‘thin down’ the corpus in some other way – that I either leave things as they are, or I have to reset the corpus and start the training over again, in which case these words are perhaps going to get a heavier weight as ‘spam’ than in my current corpus?
Seems to me that some spammer somewhere has decided to be very clever and go for the “lean, mean and simple” approach. Trying to slip “under the wire” instead of breaking through it or flying over it!!
I wonder how other “anti-spam” approaches are dealing with this. Do you know if this is causing problems in the outside world?
Maybe I’m misunderstanding the role of the corpus here but if I reset it, don’t I then have to retrain SpamSieve? Don’t I lose the accuracy built up over the last several months? While that training is going on, I’ll be getting a lot of spam which is now unrecognized as such.
Are the whitelists/blacklists part of the corpus, as well? (Excuse me if that’s explained in the help or other documentation.)
SpamSieve letting too much trained junk mail pass through
I’ve used SpamSieve for over 1.5 years with no problems. In the last few weeks, I’ve been getting very obvious spam: words with sexual connotations, garbage words, and bogus greeting cards too. It’s very baffling. I use Entourage on Mac OS X 10.4.10.
I don’t know where to reset it to train it again, nor why I should. It seemed to get worse with the last one or two updates from SpamSieve. I clicked on some links in this topic, but I’m clueless about what to do (like resetting anything) or how to do it.
I removed SpamSieve from my Dock (changed that code from 1 to 0 or whatever it is; every time I update it I have to go into a file so it doesn’t appear on my already-crowded Dock).
Sorry if this seems rambling, but I work 12 hours a day, 7 days a week and really don’t have the focus to figure this out. I was soooo happy with the program and am not happy with the stupid, obvious junk mail I’m receiving. Of course I always click Train-Spam on each of these, but they keep getting through to my inbox. Help please and thanks!
Please look at the log to see whether these messages are getting through SpamSieve or if there is a setup problem. If you don’t understand how to read the log, you can send it to me and I’ll recommend what you should do.
It’s normal for the Dock preference to be reset when updating to a new version of SpamSieve. A future version of SpamSieve will address this.
I DON’T KNOW HOW TO OPEN THE LOG!!! I read that info online about the log. I wrote that I am totally confused!!! Entourage is only showing the scripts logo.
Where is the log???
I am BEGGING you to walk me through this, please!!! Don’t send me to a manual page that does not say where the log is!!!
jenny