You are using the CGI-based CMOScript gateway. See Chris's Mediocre Obfuscation Script for more detail about CMOScript.

Instructions:

Easy mode:
Enter some words here (e.g. money vicodin click) and click submit. The results then appear in the second text area below, labeled "Generated obfuscated rule(s) file".

Note: Enter non-obfuscated words such as "money".
Do NOT enter obfuscated versions such as "m0n3y".
Do NOT enter URLs or email addresses.

Advanced Mode:
  1. First, input a simple rulesfile. There are two ways to do this:
    1. Select an example (requires JavaScript).
    2. Type or paste a source rules file below.
  2. Choose Generation Options (if you wish)
  3. Submit your query
  4. View the generated results
  5. If you have a browser with JavaScript 1.5 or above, you can enter text (from a spam, etc.) and see if the text matches any of the generated rules.

Generation Options:

Obfu only (-o): ONLY match obfuscated text (Match vi@gra but not viagra).
No gap (-g): Turn off G~A~P~P~Y T~E~X~T detection. The -s option has no effect when this option is enabled.
Simple gap (-s): Use simple gap detection pattern: [\W_] (Should make faster rules than the standard gap pattern, but may trigger more false positives, especially on binary data such as images).
Multi-gap (-m <n>): Set the gap width to n (match up to n characters per gap; may cause false positives).
Duplicate chars (-d <n>): Allow <n> duplicate chars, e.g. -d 2 can match "viaagra". This uses backreferences and may cause performance problems. The backreference numbers are guessed naively, so do not include groupings in your rule unless you use the non-capturing form (?: ). Note: the JavaScript tester doesn't seem to be compatible with these backreferences. They do work in SpamAssassin, though.
No short word gap (-w): Disable special restrictive gap pattern for short (less than 4 chars) words.
No Unicode Entities (-u): Don't match Unicode entities (entities beyond &#256;). Matching Unicode entities is still experimental; please provide feedback.
High ASCII (-h): Output high ASCII characters (> 127) as the character itself, not the hexadecimal representation \xFF (May cause problems copying and pasting from your browser. I recommend using this only if you are executing the script on your own machine)
Vowels optional (-v): Make vowels optional (Currently "False Positive City": Not recommended!)
Don't Rename Rules (-r): Don't prepend OB_ to rule names. Note: enabling this option will cause the web interface to report an erroneous WARNING that no rules were found.
Debug (-D): Dump some internal information to the output (Useful to see how CMOScript works).
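
To make the gap options above more concrete, here is a minimal Python sketch of how a gap pattern can be woven between the letters of a word. It is not CMOScript's actual generator: the helper name gappy_pattern and its parameters are hypothetical, the obfu-only behaviour is assumed to be a negative lookahead on the plain spelling, and only the simple gap pattern [\W_] quoted for -s is used.

```python
import re

SIMPLE_GAP = r"[\W_]"  # the simple gap pattern quoted for the -s option


def gappy_pattern(word, gap_width=1, obfu_only=False):
    """Build a regex allowing up to gap_width gap characters between letters.

    Hypothetical helper for illustration; not CMOScript's real output.
    """
    gap = "%s{0,%d}" % (SIMPLE_GAP, gap_width)
    body = gap.join(re.escape(ch) for ch in word)
    if obfu_only:
        # Assumption: reject the plain spelling so only obfuscated forms match.
        body = r"(?!\b%s\b)" % re.escape(word) + body
    return re.compile(body, re.IGNORECASE)


plain = gappy_pattern("viagra")                   # one gap char per gap
wide = gappy_pattern("viagra", gap_width=3)       # like -m 3
strict = gappy_pattern("viagra", obfu_only=True)  # like -o

for text in ("viagra", "v.i.a.g.r.a", "v--i--agra"):
    print(text, bool(plain.search(text)), bool(wide.search(text)),
          bool(strict.search(text)))
```

A wider gap catches more spellings, which is also why the option descriptions warn about false positives.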

Source (non-obfuscated) Rules File:


(note: this submit button operates on the "Source Rules File" and does not use the word(s) in the "Easy mode" text box)

Generated (obfuscated) Rules file:

JavaScript rule testing:

You can test the rules you have just generated right here in your browser. For instance, if you were testing a rule: "body VIAGRA /viagra/", try typing "v1ag.ra" in the space below, then click "Check text for matches".
This feature requires a browser that supports JavaScript 1.5.
Known compatible browsers: Mozilla Firebird 0.6.1, MSIE 5.5 (5.50.4807.2300CO), MSIE 6.0 (6.0.2800.1106.xpsp2.030422-1633), Konqueror 3.1.3
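
For readers without a compatible browser, the same kind of check can be approximated offline. The sketch below is an assumption-laden stand-in for the in-page tester: it only understands lines of the form "body NAME /regex/flags", it uses Python's re module rather than Perl's regex engine (the two differ in places), and the OB_VIAGRA rule is hand-written for illustration, not real generator output.

```python
import re


def parse_body_rule(line):
    """Split 'body NAME /pattern/flags' into (name, compiled regex)."""
    kind, name, rest = line.split(None, 2)
    if kind != "body" or not rest.startswith("/"):
        raise ValueError("only 'body NAME /regex/flags' lines are handled")
    end = rest.rindex("/")  # last slash closes the pattern
    pattern, flags = rest[1:end], rest[end + 1:]
    return name, re.compile(pattern, re.IGNORECASE if "i" in flags else 0)


def matching_rules(rules_text, sample):
    """Return the names of all body rules whose regex matches sample."""
    hits = []
    for line in rules_text.splitlines():
        line = line.strip()
        if line.startswith("body "):
            name, regex = parse_body_rule(line)
            if regex.search(sample):
                hits.append(name)
    return hits


# First line: the source rule from the text above.
# Second line: a hand-written obfuscated variant (illustrative only).
RULES = r"""
body VIAGRA /viagra/
body OB_VIAGRA /v[\W_]?[i1][\W_]?a[\W_]?g[\W_]?r[\W_]?a/i
"""

print(matching_rules(RULES, "v1ag.ra"))
```

Because the sample is an ordinary Python string, it can span several lines, which the single-line text box cannot.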


Feedback

Comments? Questions? Problems with the script, or this page? Leave feedback here.
Date: 2006-09-21 Author: Olivier
Subject: Reverse character range with Unicode
Hi,

I just noticed that when using Unicode (not using -u) the generated rules contain character ranges that are in reverse order:

body LOCAL_OBFU_CSIM_VIAGRA /(?!\bviagra\b)(?:\b[vu]|\B(?:\\\/|\xCE\xBD))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|(?:\xC4[\xB0-\xAF]|\xC4[\xAF-\xAE]|...

See the [\xB0-\xAF], then the [\xAF-\xAE], etc. (I truncated the rule.)

So SA will complain that

[2677] warn: config: invalid regexp for rule LOCAL_OBFU_CSIM_VIAGRA: /(?!\bviagra\b)(?:\b[vu]|\B(?:\\\/|\xCE\xBD))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|(?:\xC4[\xB0-\xAF]|\xC4[\xAF-\xAE]|...
Invalid [] range "\xB0-\xAF" in regex; marked by <-- HERE in m/(?i)(?!\bviagra\b)(?:\b[vu]|\B(?:\\\/|\xCE\xBD))[\x01-\x2F\x3A-\x40\x5B-\x60\|\x7F-\xA1\xA4-\xA8\xAB-\xAD\xAF-\xB1\xB4\xB7-\xBB\xBF\xF7]?(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|(?:\xC4[\xB0-\xAF <-- HERE ]|\xC4[\xAF-\xAE]|...

When using -u (no Unicode) the rule is shorter and there is no such character range trouble.
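
A quick way to spot this problem before loading a rule is to scan it for \xHH-\xHH ranges whose start is greater than their end. The Python sketch below does only that; it is a rough check under the assumption that ranges are written with two-digit hex escapes, it does not verify that a range sits inside square brackets, and the fragment is assembled from pieces of the rule quoted above rather than being the full rule.

```python
import re

# Matches a range written as \xHH-\xHH, capturing both hex values.
HEX_RANGE = re.compile(r"\\x([0-9A-Fa-f]{2})-\\x([0-9A-Fa-f]{2})")


def reversed_ranges(pattern):
    """Return every \\xHH-\\xHH range whose start is greater than its end."""
    bad = []
    for m in HEX_RANGE.finditer(pattern):
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        if start > end:
            bad.append(m.group(0))
    return bad


# Shortened fragment built from pieces of the reported rule.
fragment = r"(?:\xC4[\xB0-\xAF]|\xC4[\xAF-\xAE]|[\xCC-\xCF\xEC-\xEF\xA6])"

for rng in reversed_ranges(fragment):
    print("reversed range:", rng)
```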
Date: 2007-01-31 Author: andrea
Subject: new kind of spam
How would you fight against this?:
"Good day,

Via_grra $1, 80
Cia_aliss $3, 00
Levi_trra $3, 35

http://www.progenyid.*com ( Important ! Remove "*" )

--
But for all the notice anyone took, he might just as well not have answered at all. Im tired! he bellowed finally, after nearly half an hour. No,"

Consider that VIAGRA can be written in multiple ways, like VIdsfsfA_AxxxGRA, and it's nearly impossible to write a rule for so many combinations.
The link often changes, and the last part looks like a random excerpt from a book.
Any ideas?
thanx

Date: 2007-02-06 Author: Chris Thielen
Subject: Re: new kind of spam
> How would you fight against this?:
> "Good day,
>
> Via_grra $1, 80
> Cia_aliss $3, 00
> Levi_trra $3, 35

Hi Andrea,

Catching obfuscations like this is something of a cat-and-mouse game. There is a balance to be made between false positives and false negatives on obfuscated text. In this particular case, I suggest setting the multi-gap width to 2 or 3 (-m 3), and setting duplicate chars to 2 or 3 (-d 3). You may need to also enable simple gap (-s).

Turning all these options on will increase the leniency of the matches, but has more potential to cause false positives. It also won't be able to capture the obfuscation examples you have mentioned at the bottom, but again there is a balance that must be made.

SpamAssassin is designed to use a variety of heuristics, not just one or two. I highly recommend getting a working bayes system and enabling the network tests (URIBL, DNSBL, etc).

Chris
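
As a rough illustration of what combining a wider gap with duplicate characters buys, the Python sketch below builds a lenient pattern by hand. It is not what CMOScript generates (the -d option is described as using backreferences, while this sketch uses a bounded repetition instead), and the helper name lenient_pattern and its parameters are hypothetical.

```python
import re


def lenient_pattern(word, gap_width=3, max_dupes=3):
    """Hand-rolled approximation of -s, -m <n> and -d <n> combined."""
    gap = r"[\W_]{0,%d}" % gap_width
    # Each letter may repeat, and a gap may sit between or after the repeats.
    units = [r"(?:%s%s){1,%d}" % (re.escape(ch), gap, max_dupes)
             for ch in word]
    return re.compile("".join(units), re.IGNORECASE)


samples = {
    "viagra": "Via_grra $1, 80",
    "cialis": "Cia_aliss $3, 00",
    "levitra": "Levi_trra $3, 35",
}

for word, line in samples.items():
    print(word, bool(lenient_pattern(word).search(line)))

# Letters inserted inside the word are not gap characters, so this form
# is still missed, in line with the caveat in the reply above.
print(bool(lenient_pattern("viagra").search("VIdsfsfA_AxxxGRA")))
```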
Date: 2007-02-14 Author: andrea
Subject: Re: new kind of spam
Thanks for your answer.
Actually, I focused on "*" to catch those mails because it seems to be repeated, and I wrote this rule: body COTUS_ASTERISCO /[\w\s]{0,5}\x22\W\x22[\w\s]{0,5}/i
It does a good job, but it also blocks some non-spam mails.
I usually check my rules with RegexBuddy, and when I test the false-positive mails against COTUS_ASTERISCO it doesn't return anything.
It looks like I'm writing rules in a different language than the one SpamAssassin uses.
In addition, I can't find a reliable way to test rules on Windows (I can't use the --lint option).
Your tester seems good, but it doesn't let me test multiple lines.
thanx for your time
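
One possible contributor to the false positives is that \W matches any non-word character, not only the asterisk, so any single punctuation mark or space between double quotes also triggers the rule. The Python sketch below shows this on a multi-line sample. It uses Python's re module as a stand-in for SpamAssassin's Perl regexes, the ham line is an invented example, and the tightened pattern is only a suggestion.

```python
import re

# The rule as posted: \x22 is a double quote, \W is ANY non-word character.
# The optional [\w\s]{0,5} context on both sides can match nothing at all.
original = re.compile(r"[\w\s]{0,5}\x22\W\x22[\w\s]{0,5}", re.IGNORECASE)

# Suggested tightening: require a literal asterisk between the quotes.
tightened = re.compile(r"\x22\*\x22")

spam = (
    "Good day,\n"
    "\n"
    "Via_grra $1, 80\n"
    "\n"
    'http://www.progenyid.*com ( Important ! Remove "*" )\n'
)
ham = 'Use "," or ";" as the separator.'  # invented non-spam line

for label, text in (("spam", spam), ("ham", ham)):
    print(label, bool(original.search(text)), bool(tightened.search(text)))
```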
Date: 2008-04-04 Author: Benjamin
Subject: error on page
Hi
Your script sounds very interesting, but I can't make this site work: http://sandgnat.com/cmos/cmos.jsp

I get the following error all the time:
Line: 55
Char: 4
Error: Invalid range in character set
Code: 0
URL: http://sandgnat.com/cmos/cmos.jsp? ....

Do you have any idea what's wrong?

- Benjamin