Thread: moving and training are different?

  1. #1

    moving and training are different?

    Could someone please confirm a basic concept for me?

    With auto-training off ... just because SpamSieve's rule is active and is moving messages into a spam folder doesn't mean that it's being trained. Right? SpamSieve will move suspected spam without actually adding anything to the corpus? To train, to add to the corpus, I have to select messages and "train as good/spam"?

    Do I have this straight?

    Thanks in advance.

  2. #2

    Quote: Originally Posted by paulingraham
    With auto-training off ... just because SpamSieve's rule is active and is moving messages into a spam folder doesn't mean that it's being trained. Right? SpamSieve will move suspected spam without actually adding anything to the corpus? To train, to add to the corpus, I have to select messages and "train as good/spam"?
    Correct. Generally you should only train it with messages that it didn’t automatically move to the correct mailbox.

  3. #3

    Quote: Originally Posted by Michael Tsai
    Generally you should only train it with messages that it didn’t automatically move to the correct mailbox.
    Thanks for the rapid response, Michael. Obviously you make a point of being a presence on your forums. I wish all developers were so conscientious!

    You surprised me with your reply, though. While I figured you (or someone) would confirm the basic concept I was asking about, I have been training it with all messages, because I assumed that if it wasn’t set to train automatically, then I needed to manually say, “Yes, indeedy, all these spammy messages in the Spam Folder are indeed spam.”

    So let me do some more confirming ...

    I should only “Train as Spam” when I’ve got spam that SpamSieve failed to move to my spam folder?

    I should only “Train as Good” when I’ve got good mail that SpamSieve incorrectly moved to the spam folder?

    Shouldn’t I “Train as Good” to improve the ratio of good mail to spam in my corpus? Right now my corpus only has about 14% good mail to work with, but it will take quite a long time to increase that if I only “Train as Good” when I get a false positive!

    Given that I was training one way or the other with every message ... um, should I start over?

    That’s a lot of questions, but hopefully easy to answer! Thanks again.

  4. #4

    Quote: Originally Posted by paulingraham
    While I figured you (or someone) would confirm the basic concept I was asking about, I have been training it with all messages, because I assumed that if it wasn’t set to train automatically, then I needed to manually say, “Yes, indeedy, all these spammy messages in the Spam Folder are indeed spam.”
    In all cases you should start out by doing an initial training with the recommended numbers of spam and good messages. After that, I recommend training SpamSieve with any messages that it didn’t classify correctly, and only with those messages.

    Additionally, I recommend that most people use auto-training (it’s on by default), although there are a few cases where you wouldn’t want to do that.

    Quote: Originally Posted by paulingraham
    I should only “Train as Spam” when I’ve got spam that SpamSieve failed to move to my spam folder?

    I should only “Train as Good” when I’ve got good mail that SpamSieve incorrectly moved to the spam folder?
    Correct.

    Quote: Originally Posted by paulingraham
    Shouldn’t I “Train as Good” to improve the ratio of good mail to spam in my corpus? Right now my corpus only has about 14% good mail to work with, but it will take quite a long time to increase that if I only “Train as Good” when I get a false positive!
    If you follow the guidelines above, you probably won’t have to worry about the ratio. You’d start out with the proper ratio, and SpamSieve’s auto-training would maintain it. If you had auto-training off and were only training it with mistakes, there wouldn’t be many of them, so the ratio would stay pretty much the same.
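
    To make the ratio argument concrete, here is a minimal sketch in Python. It is not SpamSieve's code or API: the `Corpus` class, the `train_on_errors` helper, and the starting counts are all hypothetical, chosen only to show that training on mistakes alone barely moves the good/spam ratio set by the initial training.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    """Toy stand-in for a filter's training corpus (message counts only)."""
    good: int = 0
    spam: int = 0

    def train(self, is_spam: bool) -> None:
        # Add one message to the matching side of the corpus.
        if is_spam:
            self.spam += 1
        else:
            self.good += 1

    def good_fraction(self) -> float:
        total = self.good + self.spam
        return self.good / total if total else 0.0


def train_on_errors(corpus: Corpus, messages) -> int:
    """Train only on misclassified messages.

    `messages` is an iterable of (predicted_is_spam, actual_is_spam) pairs.
    Returns the number of messages that were added to the corpus.
    """
    trained = 0
    for predicted_is_spam, actual_is_spam in messages:
        if predicted_is_spam != actual_is_spam:
            corpus.train(actual_is_spam)
            trained += 1
    return trained


if __name__ == "__main__":
    # Hypothetical initial training counts (not SpamSieve's recommended numbers).
    corpus = Corpus(good=350, spam=650)
    before = corpus.good_fraction()

    # 100 incoming messages: 98 classified correctly, 2 mistakes.
    incoming = (
        [(True, True)] * 88       # spam, caught
        + [(False, False)] * 10   # good mail, left alone
        + [(False, True)]         # spam that was missed
        + [(True, False)]         # good mail wrongly moved to the spam folder
    )
    added = train_on_errors(corpus, incoming)
    print(f"trained on {added} messages; "
          f"good fraction {before:.4f} -> {corpus.good_fraction():.4f}")
```

    With only the two mistakes added, the corpus grows from 1000 to 1002 messages and the good-mail fraction is essentially unchanged. Training on all 100 messages instead would shift the ratio toward whatever mix happened to arrive.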

    Quote: Originally Posted by paulingraham
    Given that I was training one way or the other with every message ... um, should I start over?
    Yes.

  5. #5

    I understand. Thank you very much for clarifying.

    Quote: Originally Posted by Michael Tsai
    In all cases you should start out by doing an initial training with the recommended numbers of spam and good messages ...
    For the record (and possibly other confused users), this is how I got confused and onto the wrong track.

    You see, I had a Mail.app kerblooie just a couple days ago and decided I needed to start fresh with a completely new mail folder (which did indeed help my many Mail problems).

    But that also meant that I didn’t have a good range of messages to train SpamSieve with. I already had an adequate supply of weekend spam... but I don’t get all that much legit mail on the weekend. Getting the right ratio would have involved a very small sample of data. I decided I’d try to manually manage the ratio as new mail came in, and ... well, got kinda confused. ;-)

    However, good mail is starting to come in now that it’s Monday, so I should have an adequate sample to do initial training with soon.

    Thanks again.
