SpamAssassin is soooo leet.
If you're running Debian, try "apt-get install spamassassin procmail razor".
This installs it with support for Vipul's Razor, which is a distributed spam signature database.
workbench users can add:
:0fw
| spamassassin
to their .procmailrc file and it will just "work".
I haven't installed Razor on workbench yet. I'm not sure whether I'll bother, because it will be a lot of trouble. 🙂
http://spamassassin.org/
(I love the logo, it has a copious amount of ninjas in it. w00t! )
I think SpamAssassin is far more flexible. It's score based: things that indicate spam add to a message's score and things that look legit take away from it, so one spammy string in an otherwise legit mail isn't enough to get it flagged.
Thus if I sent a message from a legit mail agent, with a real name in the "From" field and a valid-looking Message-ID, but wrote "GET RICH QUICK!!!! NO RISK!" in the body, it wouldn't flag it as spam.
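To make the scoring idea concrete, here's a rough sketch in Python. This is not SpamAssassin's actual code, just a toy illustration that assumes each matched rule contributes a fixed score and the total is compared against the 5.0 threshold. The rule names and scores are copied from the example reports further down; `classify` is a made-up helper, and the "at or above the threshold" comparison is my assumption.

```python
# Illustrative only: a toy version of score-based filtering,
# not SpamAssassin's real implementation.

REQUIRED_HITS = 5.0  # threshold shown in the reports ("5 required")

# Scores copied from the example reports in this post.
SCORES = {
    "RICH": 1.7,
    "SPAM_PHRASE_00_01": 0.6,
    "RISK_FREE": 2.8,
    "HGH": 4.0,
    "LINES_OF_YELLING_2": -0.7,  # some rules carry negative scores
}


def classify(matched_rules, required=REQUIRED_HITS):
    """Return (total, is_spam) for the rules that matched a message."""
    total = round(sum(SCORES[name] for name in matched_rules), 1)
    # Assumption: a message counts as spam at or above the threshold.
    return total, total >= required


if __name__ == "__main__":
    # One spammy phrase in an otherwise clean mail stays under the threshold.
    print(classify(["RICH", "SPAM_PHRASE_00_01"]))
    # Several hits together push the total over it.
    print(classify(["RICH", "RISK_FREE", "HGH", "LINES_OF_YELLING_2"]))
```

The point is that no single rule decides anything; only the sum does.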
Note that the example below says "This mail is probably spam" in the results even though it scored under the threshold. That's because I ran it in test mode: normally, if a message doesn't get 5 points, it won't output that report at all, so that's what "spamassassin -t" is for. 🙂
After that is a "spam" I wrote that violates several rules I'm aware of, just to make an example.
--------------------------------------------
To: Dan Taylor
From: paul timmins
Subject: w00t w00t
Date: 07 Sep 2002 14:50:16 -0400
X-Spam-Status: No, hits=2.3 required=5.0
        tests=RICH,SPAM_PHRASE_00_01
        version=2.41
X-Spam-Level: **
This is a test.
The message is valid in every way, I just wanted to throw a spam phrase in here:
GET RICH QUICK!
This should not be flagged as spam.
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (2.30 hits, 5 required)
SPAM: RICH (1.7 points) BODY: If only it were that easy
SPAM: SPAM_PHRASE_00_01 (0.6 points) BODY: Spam phrases score is 00 to 01 (low)
SPAM:
SPAM: -------------------- End of SpamAssassin results ---------------------
--------------------------------------------
From: jack Meoff
To: slash
Subject: *****SPAM***** w00t w00t 234wef
Date: 07 Sep 1999 14:50:16 -0400
X-MSmail-priority: high
Recieved: from w00t (10.85.23.23) at 07 Sep 1999 14:50:16 -4000
X-Spam-Status: Yes, hits=18.6 required=5.0
        tests=FREE_MONEY,FROM_AND_TO_SAME_6,HGH,LINES_OF_YELLING,
        LINES_OF_YELLING_2,MISSING_MIMEOLE,PRIORITY_NO_NAME,
        RISK_FREE,SPAM_PHRASE_00_01,SUBJ_HAS_SPACES,
        SUBJ_HAS_UNIQ_ID,TO_LOCALPART_EQ_REAL,UPPERCASE_25_50
        version=2.41
X-Spam-Flag: YES
X-Spam-Level: ******************
X-Spam-Checker-Version: SpamAssassin 2.41 (1.115.2.8-2002-09-05-exp)
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (18.60 hits, 5 required)
SPAM: SUBJ_HAS_SPACES (4.2 points) Subject contains lots of white space
SPAM: TO_LOCALPART_EQ_REAL (0.1 points) To: repeats local-part as real name
SPAM: HGH (4.0 points) BODY: Human Growth Hormone
SPAM: RISK_FREE (2.8 points) BODY: Risk free. Suuurreeee….
SPAM: FREE_MONEY (0.3 points) BODY: Free money!
SPAM: LINES_OF_YELLING_2 (-0.7 points) BODY: 2 WHOLE LINES OF YELLING DETECTED
SPAM: SPAM_PHRASE_00_01 (0.6 points) BODY: Spam phrases score is 00 to 01 (low)
SPAM: LINES_OF_YELLING (0.3 points) BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: FROM_AND_TO_SAME_6 (0.7 points) From and To are same (6)
SPAM: SUBJ_HAS_UNIQ_ID (0.2 points) Subject contains a unique ID
SPAM: MISSING_MIMEOLE (1.6 points) Message has X-MSMail-Priority, but no X-MimeOLE
SPAM: UPPERCASE_25_50 (1.8 points) message body is 25-50% uppercase
SPAM: PRIORITY_NO_NAME (2.7 points) Message has priority setting, but no X-Mailer
SPAM:
SPAM: -------------------- End of SpamAssassin results ---------------------
GET RICHER AND FULLER BREASTS WITH HGH!
Ever had a problem getting it up and performing to your fullest! Viagra is the answer!
NO RISK! ITS LIKE GETTING FREE MONEY!
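If you want to double-check a report like the one above, the per-rule points can be pulled out and added up. Here's a small Python sketch, assuming the report lines look exactly like the ones pasted in this post. `parse_report` and both regexes are made up for this example and are not part of SpamAssassin.

```python
import re
from decimal import Decimal

# Rule lines copied from the 18.60-point report above.
REPORT = """\
SPAM: Content analysis details: (18.60 hits, 5 required)
SPAM: SUBJ_HAS_SPACES (4.2 points) Subject contains lots of white space
SPAM: TO_LOCALPART_EQ_REAL (0.1 points) To: repeats local-part as real name
SPAM: HGH (4.0 points) BODY: Human Growth Hormone
SPAM: RISK_FREE (2.8 points) BODY: Risk free. Suuurreeee....
SPAM: FREE_MONEY (0.3 points) BODY: Free money!
SPAM: LINES_OF_YELLING_2 (-0.7 points) BODY: 2 WHOLE LINES OF YELLING DETECTED
SPAM: SPAM_PHRASE_00_01 (0.6 points) BODY: Spam phrases score is 00 to 01 (low)
SPAM: LINES_OF_YELLING (0.3 points) BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: FROM_AND_TO_SAME_6 (0.7 points) From and To are same (6)
SPAM: SUBJ_HAS_UNIQ_ID (0.2 points) Subject contains a unique ID
SPAM: MISSING_MIMEOLE (1.6 points) Message has X-MSMail-Priority, but no X-MimeOLE
SPAM: UPPERCASE_25_50 (1.8 points) message body is 25-50% uppercase
SPAM: PRIORITY_NO_NAME (2.7 points) Message has priority setting, but no X-Mailer
"""

# "SPAM: RULE_NAME (1.2 points) description"
RULE_LINE = re.compile(r"^SPAM:\s+([A-Z0-9_]+)\s+\((-?\d+(?:\.\d+)?) points\)")
# "(18.60 hits, 5 required)"
TOTAL_LINE = re.compile(r"\((-?\d+(?:\.\d+)?) hits, (\d+(?:\.\d+)?) required\)")


def parse_report(text):
    """Return (claimed_total, required, {rule: points}) for a pasted report."""
    claimed = required = None
    rules = {}
    for line in text.splitlines():
        match = RULE_LINE.match(line)
        if match:
            # Decimal avoids float rounding noise when adding tenths.
            rules[match.group(1)] = Decimal(match.group(2))
            continue
        match = TOTAL_LINE.search(line)
        if match:
            claimed, required = Decimal(match.group(1)), Decimal(match.group(2))
    return claimed, required, rules


if __name__ == "__main__":
    claimed, required, rules = parse_report(REPORT)
    print("claimed:", claimed, "summed:", sum(rules.values()), "required:", required)
```

For this report the thirteen rule scores do add up to the 18.60 in the details line, including the one negative score for LINES_OF_YELLING_2.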
This next email was a base64-encoded HTML attachment. Here's what SpamAssassin found. Note that I couldn't verify the contents myself, because my MUA doesn't want to base64-decode it. 🙂
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (46.90 hits, 5 required)
SPAM: FROM_HAS_MIXED_NUMS (-0.9 points) From: contains numbers mixed in with letters
SPAM: NO_REAL_NAME (-0.3 points) From: does not include a real name
SPAM: OUTLOOK_FW_MSG (-0.2 points) Forwarded email (Outlook style)
SPAM: MIME_ODD_CASE (3.5 points) MiME-Version header (oddly capitalized)
SPAM: DATE_YEAR_ZERO_FIRST (3.2 points) Invalid Date: year begins with zero
SPAM: FROM_HAS_MIXED_NUMS2 (1.9 points) From address matches known spammer format
SPAM: FROM_ENDS_IN_NUMS (1.6 points) From: ends in numbers
SPAM: INVALID_DATE (1.6 points) Invalid Date: header (not RFC 2822)
SPAM: INVALID_MSGID (1.2 points) Message-Id is not valid, according to RFC 2822
SPAM: CLICK_BELOW_CAPS (2.4 points) BODY: Asks you to click below (in caps)
SPAM: EXCUSE_3 (1.9 points) BODY: Claims you can be removed from the list
SPAM: LOW_INTEREST (1.8 points) BODY: Lower Interest Rates
SPAM: OPT_IN (1.6 points) BODY: Talks about opting in
SPAM: MORTGAGE_OBFU (0.7 points) BODY: Attempt at obfuscating the word "mortgage"
SPAM: CLICK_BELOW (0.3 points) BODY: Asks you to click below
SPAM: HTML_FONT_COLOR_RED (-1.2 points) BODY: HTML font color is red
SPAM: HTML_FONT_FACE_ODD (-0.7 points) BODY: HTML font face is not a commonly used face
SPAM: BIG_FONT (-0.4 points) BODY: FONT Size +2 and up or 3 and up
SPAM: HTML_50_70 (-0.1 points) BODY: Message is 50-70% HTML tags
SPAM: SPAM_PHRASE_13_21 (3.0 points) BODY: Spam phrases score is 13 to 21 (high)
SPAM: [score: 14]
SPAM: HTML_FONT_COLOR_UNKNOWN (2.1 points) BODY: HTML font color is unknown to us
SPAM: HTML_FONT_COLOR_NAME (0.3 points) BODY: HTML font color has unusual name
SPAM: CLICK_HERE_CAPS_LINK (1.7 points) BODY: Tells you to click on a URL (in caps)
SPAM: CLICK_HERE_LINK (1.6 points) BODY: Tells you to click on a URL
SPAM: MIME_MISSING_BOUNDARY (-0.2 points) RAW: MIME section missing boundary
SPAM: CARRIAGE_RETURNS (1.3 points) RAW: Message contains a lot of ^M characters
SPAM: BASE64_ENC_TEXT (1.2 points) RAW: Message text disguised using base-64 encoding
SPAM: HTTP_ESCAPED_HOST (3.8 points) URI: Uses %-escapes inside a URL's hostname
SPAM: REMOVE_PAGE (3.4 points) URI: URL of page called "remove"
SPAM: NORMAL_HTTP_TO_IP (2.4 points) URI: Uses a dotted-decimal IP address in URL
SPAM: FORGED_AOL_RCVD (3.9 points) Received forged, contains fake AOL relays
SPAM: FORGED_RCVD_TRAIL (2.9 points) trail of Received: headers seems to be forged
SPAM: MISSING_MIMEOLE (1.6 points) Message has X-MSMail-Priority, but no X-MimeOLE
SPAM:
SPAM: -------------------- End of SpamAssassin results ---------------------
I'm really impressed with this product so far. It has caught 100% of the spam since I installed it a few days ago, with 0% false positives.
And best of all, I didn't have to tweak anything.
I agree to a point, but this has been FAR more accurate than SpamBouncer ever was. With SpamBouncer I'd usually find 4 or 5 legit messages a day tossed into the spam box. Not cool.
This one hasn't missed an illegit one, or thrown away a legit one. Not one, in the last half week.
Even if you don't prefer it, you have to admire its accuracy out of the box. No tweaking was done. I compiled, installed, and that's it.
And as far as customizability, check out /usr/share/spamassassin, where the default rules live. These can be overridden in a local config file stored in your home directory.
Plus it's not built on procmail, so it's more versatile and, more importantly on workbench, has a lower memory profile, so it processes faster even though it checks for more things.
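On the customizability point, the "defaults plus per-user overrides" idea can be sketched in a few lines of Python. This is only a toy model, not how SpamAssassin loads its config. It assumes override lines in the `score RULE_NAME value` style of a per-user prefs file such as ~/.spamassassin/user_prefs; the override values, `load_overrides` and `effective_scores` are all hypothetical.

```python
# Toy illustration of default scores overlaid with per-user overrides.
# Not SpamAssassin code; helper names and override values are made up.

# Default scores as they appear in the reports in this post.
DEFAULT_SCORES = {"RICH": 1.7, "HGH": 4.0, "RISK_FREE": 2.8}

# Text in the style of a per-user prefs file (hypothetical contents).
USER_PREFS = """\
# apparently I get legit mail about getting rich
score RICH 0.5
score RISK_FREE 3.5
"""


def load_overrides(text):
    """Collect 'score NAME value' lines, ignoring comments and other lines."""
    overrides = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) == 3 and parts[0] == "score":
            overrides[parts[1]] = float(parts[2])
    return overrides


def effective_scores(defaults, prefs_text):
    """Return a new dict: defaults first, then user overrides on top."""
    merged = dict(defaults)
    merged.update(load_overrides(prefs_text))
    return merged


if __name__ == "__main__":
    print(effective_scores(DEFAULT_SCORES, USER_PREFS))
```

Rules the user doesn't mention keep their default score, which is why you can get away with no tweaking at all.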
Oh, and admit it, this is _way_ cool:
SPAM: DATE_IN_FUTURE_03_06 (1.0 points) Date: is 3 to 6 hours after Received: date
Now I know SpamBouncer is cool and all, but it never analyzed to this depth.
leet. let me know what you think 😉