more fun with regex (spamassassin rules)

Daniel Maher daniel.maher at
Tue Feb 20 21:58:25 CET 2007

Please note: in the "replace this with that" rule quoted below, the "describe" and "score" values were swapped. The corrected rule is:


body        UBI_URL_OBFU01          /(remove|replace|substitute) ?(the)? ?(("|').("|')|space) ?(in|from|to make) (the)? ?(link|url|address)? ?(above|below|work)/i

score       UBI_URL_OBFU01          6

describe    UBI_URL_OBFU01          URL obfuscation (01)


Mea culpa. :P



 °v°  Daniel Maher
/(_)\ Administrateur Système Unix
 ^ ^  Unix System Administrator


Four elements!


From: Daniel Maher 
Sent: February 20, 2007 3:30 PM
To: 'MailScanner discussion'
Subject: more fun with regex (spamassassin rules)




First things first: thanks to everybody who responded to my regex request.  In case you're still in need of a spamassassin rule to find the "replace this with that" spams, here you go:


body            UBI_URL_OBFU01          /(remove|replace|substitute) ?(the)? ?(("|').("|')|space) ?(in|from|to make) (the)? ?(link|url|address)? ?(above|below|work)/i

describe        UBI_URL_OBFU01          6

score           UBI_URL_OBFU01          URL obfuscation (01)


I've found that it works quite nicely!  Feel free to name it whatever you like, of course. :)
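
A quick way to sanity-check the UBI_URL_OBFU01 pattern outside of spamassassin is to run it through a small script. The following is only a sketch in Python, whose re module handles this particular pattern the same way a Perl-style engine would; the sample lines and the helper name hits_obfu01 are made up for illustration and are not taken from real spam.

```python
import re

# The UBI_URL_OBFU01 pattern, copied verbatim from the rule.
# re.IGNORECASE plays the role of the trailing /i modifier.
OBFU01 = re.compile(
    r"""(remove|replace|substitute) ?(the)? ?(("|').("|')|space) ?(in|from|to make) (the)? ?(link|url|address)? ?(above|below|work)""",
    re.IGNORECASE,
)

# Hypothetical sample lines, invented for this check.
SAMPLES = [
    'Remove the "*" from the link below',
    "replace space in the url above",
    "Remove the space to make the link work",
    "Please click the link below",
]


def hits_obfu01(text):
    """Return True if the pattern matches anywhere in text."""
    return OBFU01.search(text) is not None


if __name__ == "__main__":
    for line in SAMPLES:
        print(hits_obfu01(line), line)
```

The first three samples should hit and the last one should not, since it contains none of "remove", "replace" or "substitute".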


Next up, I'm having a problem with another regex, one that detects the illegal characters showing up in URLs in this type of spam lately.  If I use it via egrep from the command line, it matches properly; however, spamassassin does not appear to match it:


$ egrep -i "https?:\/\/([a-z0-9._\-]{1,30}(:[a-z0-9._\-]{1,30})?\@)?[a-z0-9.-]{1,30}[^a-z0-9.-\/:'\[][a-z0-9.-\@]{1,30}"


This will, for example, successfully match:

http://www.domain .com




The same regex as a spamassassin rule:

body            UBI_URL_OBFU02          /https?:\/\/([a-z0-9._\-]{1,30}(:[a-z0-9._\-]{1,30})?\@)?[a-z0-9.-]{1,30}[^a-z0-9.-\/:'\[][a-z0-9.-\@]{1,30}/i

score           UBI_URL_OBFU02          1.5

describe        UBI_URL_OBFU02          URL obfuscation (02)


Unfortunately, this rule will not trigger on either of the domains noted above.
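
One way to narrow this down is to try the same pattern in a Perl-style engine outside of spamassassin. The sketch below uses Python's re module as a stand-in (an assumption: it is not Perl, although it reads the backslash escapes and ranges in these bracket expressions the same way); the helper name hits_obfu02 is hypothetical. If the pattern matches here, the regex itself is probably not the culprit, and the difference may lie in the text that spamassassin actually hands to a body rule.

```python
import re

# The UBI_URL_OBFU02 pattern, copied verbatim from the rule.
# re.IGNORECASE plays the role of the trailing /i modifier.
OBFU02 = re.compile(
    r"https?:\/\/([a-z0-9._\-]{1,30}(:[a-z0-9._\-]{1,30})?\@)?"
    r"[a-z0-9.-]{1,30}[^a-z0-9.-\/:'\[][a-z0-9.-\@]{1,30}",
    re.IGNORECASE,
)


def hits_obfu02(text):
    """Return True if the pattern matches anywhere in text."""
    return OBFU02.search(text) is not None


if __name__ == "__main__":
    # The obfuscated sample from the message, plus two clean URLs for contrast.
    for url in (
        "http://www.domain .com",
        "http://www.domain.com",
        "http://www.domain.com/index.html",
    ):
        print(hits_obfu02(url), url)
```

In this sketch the obfuscated sample matches and the two clean URLs do not.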


Any ideas?




