Date: 2003-11-17 Author: Chris Thielen Subject: Feedback
Leave feedback here, if you wish. -- Chris
Date: 2003-11-18 Author: Scott Subject: Excellent
I have found this to be an invaluable tool, saving me tons of time. I used to build similar REs by hand. Not any more. Thank you Chris!
Date: 2003-11-18 Author: Chris Thielen Subject: Requesting feedback on the Javascript rule testing
I hacked together the web-based Javascript rule eval (on the CGI page) in a couple of hours. I think it works OK on Mozilla. If you have problems with it, please leave some feedback. Include the browser you are using, and its version.
Date: 2003-12-05 Author: Icebird Subject: Masked Words
Excellent Work.
But if you have words with 2 or more >> . << the rules don't match. For example, some spam mails contain words like >> d..iscre.tely << and the rules from your great script don't match them.
Date: 2003-12-12 Author: Chris Thielen Subject: Re: Masked Words
Icebird said: > But if you have Words with 2 or more >> . << the rules dont match
Icebird, Version h of the script addresses your suggestion. Use the -m <n> command line switch to set how many characters wide the gap pattern is.
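To illustrate what a wider gap buys, here is a rough Python sketch. It is not the script's actual output: the regular expressions it generates are not shown in this thread, so the helper name `build_rule`, the parameter `max_gap`, and the filler class `[\W_]` are all assumptions made for the example.

```python
import re

def build_rule(word, max_gap=1):
    """Allow up to max_gap filler characters between the letters of word.

    Hypothetical helper: the filler class [\\W_] is an assumption, not the
    pattern emitted by the real script.
    """
    gap = r"[\W_]{0,%d}" % max_gap
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)

narrow = build_rule("discretely", max_gap=1)  # one filler character at most
wide = build_rule("discretely", max_gap=2)    # comparable in spirit to "-m 2"

sample = "d..iscre.tely"
print("gap width 1:", bool(narrow.search(sample)))
print("gap width 2:", bool(wide.search(sample)))
```

With a gap width of 1 the two dots after the "d" break the match; with a width of 2 the same sample is caught.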
Date: 2003-12-16 Author: Jens Benecke Subject: This is SOOOO great ...
Hi Chris,
this is one of the most valuable tools in my fight against spam. I use quite a number of blacklists and have turned off catch-all on all domains, but SA catches just about all of the rest.
If you ever come to Hamburg feel free to call me and pick up a free beer. :-)
kiste:~# spamstats
Tue Dec 16 10:24:48 CET 2003 - values for the last 24h
Mails blocked from known spammers (badmailfrom): 140
Mails blocked from known spam IPs (rblsmtpd): 8801
Mails blocked to unused addresses (pseudo-catchalls): 14509
Accepted mails recognized as spam (SpamAssassin): 543
Accepted 'clean' mails: 1046
Date: 2004-01-13 Author: AlexB Subject: NewM?dia Publishers AG
Please check the obfuscation result for: "NewM?dia Publishers AG"
The "?" [e with ^] isn't interpreted correctly, it seems.
Alex
Date: 2004-01-14 Author: Jens Benecke Subject: include mixed characters?
Hi,
I keep getting tons of spam containing things like
"Genierc and Sepur Viarga (Caiils) available online! Most trusted online source! Cilais or (Spuer Vagira) takes affect right away & lasts 24-36 huors!"
The obfuscation rules obviously don't find this spam, as it's not obfuscated but "character mixed".
I'd like to be able to specify "mix second to last-but-one character", so it will catch "v[iagr][iagr][iagr][iagr]a" for example, if I specify "viagra".
Optionally, of course.
Thanks :-)
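The shape described above (fixed first and last letter, inner letters in any order) can be sketched in Python as follows. This is only an illustration under stated assumptions: `mixed_rule` and `is_scrambled` are hypothetical names, the word is assumed to consist of plain letters, and nothing here is taken from the script itself.

```python
import re

def mixed_rule(word):
    """Regex for word with its inner letters in any order (hypothetical helper).

    Fixed first and last letter, plus a character class of the inner letters
    repeated once per inner position. Assumes word contains only letters.
    """
    inner = "".join(sorted(set(word[1:-1])))
    pattern = r"\b%s[%s]{%d}%s\b" % (word[0], inner, len(word) - 2, word[-1])
    return re.compile(pattern, re.IGNORECASE)

def is_scrambled(token, word):
    """Stricter check: same first/last letter and the same multiset of inner letters."""
    t, w = token.lower(), word.lower()
    return (len(t) == len(w) and t[0] == w[0] and t[-1] == w[-1]
            and sorted(t[1:-1]) == sorted(w[1:-1]))

rule = mixed_rule("viagra")
for text in ("Sepur Viarga", "Spuer Vagira", "vaaaaa"):
    hit = rule.search(text)
    # Second column: regex hit; third column: hit that also passes the stricter check.
    print(text, bool(hit), bool(hit) and is_scrambled(hit.group(), "viagra"))
```

The character-class regex is loose: it also accepts strings such as "vaaaaa" that merely reuse the inner letters. The stricter comparison of sorted inner letters rejects those, at the cost of no longer being a single regular expression.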
Date: 2004-01-15 Author: Chris Thielen Subject: Re: include mixed characters?
Jens, I'll experiment with mixing the characters around and post the results here, though I'm guessing there will be some collateral damage. I do have a plan in the works for exactly this type of misspelled spam, so stay tuned (it's an eval rule, not a huge regexp like this one).
Chris
Date: 2004-02-27 Author: Jens Subject: Re: include mixed characters?
Hi,
I found that blocking mails that contain URLs which contain "?AFF_ID=" will hit almost all of this obfuscated stuff, with no false positives (yet).
But I have also found that many of your obfuscated rules hit Base64 encoded plain text, i.e. attachments. (e.g. if you generate a rule for "sex") How can I exclude attachment parts from being scanned by your rules? (Please also mail me, I don't have regular web access right now.)
Thanks!
Jens
Date: 2004-02-27 Author: Chris Thielen Subject: Re: include mixed characters?
> Hi,
>
> I found that blocking mails that contain URLs which contain "?AFF_ID=" will hit almost all of this obfuscated stuff, with no false positive (yet).
>
> But I have also found that many of your obfuscated rules hit Base64 encoded plain text, i.e. attachments. (e.g. if you generate a rule for "sex") How can I exclude attachment parts from being scanned by your rules? (Please also mail me, I don't have regular web access right now.)
Jens, Yeah... BASE64 is a problem right now. I understand that SA version 3.0.0 (the next major release) will only scan the appropriate attachment types and will skip right over binary attachments. In the meantime, maybe scoring short phrases lower would mitigate the problems with matching binary data.
Otherwise, one could make a rule that detects binary attachments, change the CMOScript rules to be prefixed with __, and use meta rules to score CMOScript rules only when there isn't a binary attachment.
Just some thoughts, although I wouldn't expend too much energy on it, as SA 3.0.0 should fix the fundamental flaw of scanning all attachments.
Chris
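A back-of-the-envelope estimate shows why short words are the ones that collide with Base64 data. The Python sketch below rests on simplifying assumptions that are not from the thread: the Base64 text is treated as uniformly random over its 64-character alphabet, the rule is a plain case-insensitive match with no gap characters, and the 100 kB attachment size and the name `expected_hits` are made up for the example.

```python
def expected_hits(word_len, n_chars, alphabet=64, variants_per_letter=2):
    """Expected number of chance matches of a word in random Base64 text.

    Each letter matches in upper or lower case (2 of the 64 symbols), so one
    position starts a match with probability (2/64) ** word_len.
    """
    p = (variants_per_letter / alphabet) ** word_len
    positions = max(n_chars - word_len + 1, 0)
    return positions * p

# Hypothetical 100 kB attachment; Base64 expands it by a factor of 4/3.
attachment_chars = 100_000 * 4 // 3

for length in (3, 6):
    print(length, "letters:", expected_hits(length, attachment_chars))
```

Under these assumptions a three-letter word is expected to turn up about four times in such an attachment, while a six-letter word is expected roughly once in ten thousand attachments. Gap-tolerant rules would match somewhat more often than this simple model suggests.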
Date: 2004-01-19 Author: Chris Subject: Problem with rule
I started receiving some spam today with the word spelled correctly, except that there is an @ or % in the word, for instance, presc%ription or prescr@iption. Using your excellent page I can get a match on the % but not on a word with @. I tried writing a rule looking for and tagging the ! through * characters, but I botched it up, as SA displayed an error where there was none before. Any help would be appreciated.
Thanks Chris
Date: 2004-01-20 Author: Chris Thielen Subject: Re: Problem with rule
Chris,
There is an option available that would suit this problem: simple gap (-s). This option will make the gap detection far more lenient and should catch words like presc%ription or prescr@iption.
Chris Thielen
Date: 2004-01-21 Author: Chris Subject: Rule test
Thanks Chris, I generated a couple of test rules on your site and have found that using the rule generated with the -s option will catch "via!gra" and "pres#cription" and words with these characters in them $%^&* but will not catch words such as "via@gra" or "pres@cription". The @ seems to cause a problem, or am I missing something here when generating the rule?
Thanks for a great site!
Chris
Date: 2004-01-23 Author: Chris Thielen Subject: Re: Rule Test
Chris,
I'm not sure what is happening, but using -s *should* catch words with "@" in them. If you are still having problems, feel free to email me privately. Include the generated rule, and a sample message that it is not matching on.
Chris Thielen
Date: 2004-02-22 Author: Brian Subject: Re: Rule test
Tried your rules generator to attempt to catch viaagra and the like; unfortunately it just skips right over them. I tried the -s and the -m of 2 or 3 and still it gets through.
Date: 2004-02-24 Author: Chris Thielen Subject: Re: Rule test
> Tried your rules generator to attemp to catch viaagra and the like, unfortunately it just skips right over them. I tried the -s and the -m of 2 or 3 and still it gets thru.
Hi Brian,
Yes, unfortunately CMOScript doesn't currently handle misspellings (repeated or added letters, or missing letters other than vowels), as you have noticed. The main SpamAssassin team is looking into some ways to detect misspellings of "naughty words", possibly using the Bayes database. Stay tuned, something may be in the next major release (3.0.0) of SpamAssassin.
- Chris
Date: 2006-10-31 Author: Anonymous Subject: Re: Problem with rule
> Chris,
>
> There is an option available that would suit this problem: simple gap (-s). This option will make the gap detection far more lenient and should catch words like presc%ription or prescr@iption.
>
> Chris Thielen
Date: 2004-01-22 Author: dave Subject: viagra and other rules
I have started getting spams with viagra and other words spelled out like this: v:i:a:g:r:a. The LOCAL_OBFU_ONLY_VGR rules do not catch this. What is the best way to catch these?
dave
Date: 2004-01-23 Author: dave Subject: Re: viagra and other rules
I have managed to modify the LOCAL_OBFU_ONLY_VGR rule set to catch words broken up with colons and semi-colons if anyone is interested
Date: 2004-01-23 Author: Chris Thielen Subject: Re: viagra and other rules
Dave,
This may be the same solution you found, but using the "Simple Gap" option (-s) should catch words like v:i:a:g:r:a.
By default, the gap detection routine matches only certain characters that I thought might be common gap filler (such as spaces, dashes, tildes, and a ton of other stuff). This was to try to cut down on false positive matches. To make the gap detection more lenient, -s searches for "any non-word character, or a '_'". Another option would be to edit obfu.pl and use the most lenient possible gap detection: '.' which matches *anything*. I'll consider adding a flag for this in a later version.
In general, longer words (5-6+ chars) won't cause as many false positives as shorter words. Thus, it's safer to make longer words more lenient. For very commonly obfuscated words like viagra, you might want to enable Simple Gap (-s), Multi-gap (-m) with "n" set to 2 or 3, and turn off Obfu Only (don't use -o; I can do this personally, because I never get legit mail that contains "viagra" but your mileage may vary)
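The three levels of leniency described above can be compared with a small Python sketch. It is an approximation only: the "default" class below is a made-up subset (the real filler list in obfu.pl is longer and is not reproduced in this thread), and `gap_rule` and `GAP_CLASSES` are hypothetical names.

```python
import re

GAP_CLASSES = {
    "default": r"[ \-~.]",  # assumed subset: space, dash, tilde, dot
    "simple": r"[\W_]",     # -s as described: any non-word character, or '_'
    "any": r".",            # most lenient: matches anything
}

def gap_rule(word, gap_class, max_gap=1):
    """Join the letters of word with an optional gap of up to max_gap characters."""
    gap = "%s{0,%d}" % (gap_class, max_gap)
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)

samples = ["v-i-a-g-r-a", "v:i:a:g:r:a", "via@gra", "vXiXaXgXrXa"]
for name, gap_class in GAP_CLASSES.items():
    rule = gap_rule("viagra", gap_class)
    print(name, [s for s in samples if rule.search(s)])
```

In this sketch the restricted class only catches the dash-separated sample, the non-word class also catches the colon and @ variants, and '.' additionally matches letters used as filler, which is where the false-positive risk for short words comes from.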
Date: 2008-04-04 Author: Benjamin Subject: error on page
Hi. Your script sounds very interesting, but I can't make this site work: http://sandgnat.com/cmos/cmos.jsp
I get the following error all the time:
Line: 55
Char: 4
Error: Invalid range in character set
Code: 0
URL: http://sandgnat.com/cmos/cmos.jsp? ....