“Underwhelmed with Spam, but Impressed with Spammers”

From: Seth Dillingham
In Response To: 3807 SpamAssassin is Running
Date Posted: Thursday, May 6, 2004 8:23:50 AM
Replies: 0
   
Since installing SpamAssassin, I've been obsessively writing (and tweaking) custom rules, adjusting scores, and training the Bayesian classifier.

Before it was installed, I would get at least 100 UCE/UBE (unsolicited commercial or bulk email) messages every night, and at least that many again throughout the day. SpamSieve did (and does) a nearly perfect job of catching it all, but I really hated having to download that much useless mail in the first place. Also, Macrobyte's mail server hosts over 100 email accounts, and nearly all of them were getting some amount of spam: some a lot more than my account, some a lot less.

Last night, only five spam messages made it past SpamAssassin and into my mailbox. SpamSieve caught every one of them.

I have to admit, though, that I'm really impressed with the spammers' determination. There are some publicly available custom rulesets for SA that target certain types of spam. It wasn't long before the spammers started writing their messages to get around those rules. So, the rules were updated. The spammers clearly study the rules, because a little while later they started getting through again.

If you hadn't noticed, there are three tricks they've been using lately, quite effectively. First, they obfuscate words by putting punctuation be'tw_e!en the letters, or by replacing letters with similar-looking characters. Tests for specific words then fail, so the rules have to be rewritten to ignore punctuation and to treat \/ (a backslash followed by a slash) the same as 'v', '!' the same as 'i', etc. Second, they've been filling the spam with random words (in HTML email they color those words the same as the background, so the reader doesn't see them but SpamAssassin does) or, even worse, lots of famous quotes. This confuses the Bayesian classifier, which sees a few spam words but lots of generic or "good" words. Third, some of them use almost no text at all, and instead just put their message in an image.

Weasels.
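
To make the first trick concrete, here's a rough sketch of what "ignore punctuation and treat look-alikes as letters" might mean in practice. It's written in Python and is not an actual SpamAssassin rule; the LOOKALIKES table, the SEPARATOR pattern, and the obfuscation_pattern() helper are all made up for illustration, and a real ruleset would need far more substitutions than these.

```python
import re

# Hypothetical look-alike table (an assumption, not taken from any real ruleset):
# each letter maps to a regex fragment matching the letter or its stand-ins.
LOOKALIKES = {
    "v": r"(?:v|\\/)",  # 'v', or a backslash followed by a slash
    "i": r"[i!1|]",
    "o": r"[o0]",
    "a": r"[a@]",
    "s": r"[s$5]",
}

# Any run of characters that are neither letters, digits, nor whitespace.
# Whitespace is left out on purpose so separate words don't get glued together.
SEPARATOR = r"[^a-zA-Z0-9\s]*"


def obfuscation_pattern(word):
    """Build a regex that matches `word` even with junk between its letters."""
    parts = [LOOKALIKES.get(ch, re.escape(ch)) for ch in word.lower()]
    return re.compile(SEPARATOR.join(parts), re.IGNORECASE)


between = obfuscation_pattern("between")
visit = obfuscation_pattern("visit")

print(bool(between.search("punctuation be'tw_e!en the letters")))  # True
print(bool(visit.search("\\/!s!t our site")))                      # True
print(bool(between.search("bet we even")))                         # False
```

Notice that '!' has to work both as filler punctuation and as a stand-in for 'i', so the pattern allows either reading and lets the regex engine backtrack to find one that fits. That flexibility is exactly where the extra CPU cycles go, and the looser the pattern gets, the more likely it is to match something innocent.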

The filter-writers can work around these things, but it's not easy, and every additional test requires a few more CPU cycles. It's worth it, in the end, because fewer messages being delivered means the spammers make less money. Eventually, they'll stop trying.

That's the idea, anyway.



TruerWords
is Seth Dillingham's
personal web site.
Truer words were never spoken.