Email is only an image - tag as spam?
Mariano Absatz
mailscanner at LISTS.COM.AR
Thu Dec 4 15:35:44 GMT 2003
Jody,
these standard SA 2.6 rules should match these messages:
# HTML_IMAGE_AREA - lots of image area (absolute)
body HTML_IMAGE_AREA_04 eval:html_range('image_area','400000','500000')
body HTML_IMAGE_AREA_05 eval:html_range('image_area','500000','600000')
body HTML_IMAGE_AREA_06 eval:html_range('image_area','600000','700000')
body HTML_IMAGE_AREA_07 eval:html_range('image_area','700000','800000')
body HTML_IMAGE_AREA_08 eval:html_range('image_area','800000','900000')
body HTML_IMAGE_AREA_09 eval:html_range('image_area','900000')
describe HTML_IMAGE_AREA_04 HTML has 4-5 kilopixels of images
describe HTML_IMAGE_AREA_05 HTML has 5-6 kilopixels of images
describe HTML_IMAGE_AREA_06 HTML has 6-7 kilopixels of images
describe HTML_IMAGE_AREA_07 HTML has 7-8 kilopixels of images
describe HTML_IMAGE_AREA_08 HTML has 8-9 kilopixels of images
describe HTML_IMAGE_AREA_09 HTML has over 9 kilopixels of images
# HTML_IMAGE_ONLY - not much text with images (absolute)
body HTML_IMAGE_ONLY_02 eval:html_image_only('0000','0200')
body HTML_IMAGE_ONLY_04 eval:html_image_only('0200','0400')
body HTML_IMAGE_ONLY_06 eval:html_image_only('0400','0600')
body HTML_IMAGE_ONLY_08 eval:html_image_only('0600','0800')
body HTML_IMAGE_ONLY_10 eval:html_image_only('0800','1000')
body HTML_IMAGE_ONLY_12 eval:html_image_only('1000','1200')
describe HTML_IMAGE_ONLY_02 HTML: images with 0-200 bytes of words
describe HTML_IMAGE_ONLY_04 HTML: images with 200-400 bytes of words
describe HTML_IMAGE_ONLY_06 HTML: images with 400-600 bytes of words
describe HTML_IMAGE_ONLY_08 HTML: images with 600-800 bytes of words
describe HTML_IMAGE_ONLY_10 HTML: images with 800-1000 bytes of words
describe HTML_IMAGE_ONLY_12 HTML: images with 1000-1200 bytes of words
# HTML_IMAGE_RATIO - more image area than text (ratio)
body HTML_IMAGE_RATIO_02 eval:html_image_ratio('0.000','0.002')
body HTML_IMAGE_RATIO_04 eval:html_image_ratio('0.002','0.004')
body HTML_IMAGE_RATIO_06 eval:html_image_ratio('0.004','0.006')
body HTML_IMAGE_RATIO_08 eval:html_image_ratio('0.006','0.008')
body HTML_IMAGE_RATIO_10 eval:html_image_ratio('0.008','0.010')
body HTML_IMAGE_RATIO_12 eval:html_image_ratio('0.010','0.012')
body HTML_IMAGE_RATIO_14 eval:html_image_ratio('0.012','0.014')
describe HTML_IMAGE_RATIO_02 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_04 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_06 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_08 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_10 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_12 HTML has a low ratio of text to image area
describe HTML_IMAGE_RATIO_14 HTML has a low ratio of text to image area
And these are the standard scores for them:
score HTML_IMAGE_AREA_05 0.283 1.342 1.122 2.199
score HTML_IMAGE_AREA_07 1.615 1.681 1.997 1.022
score HTML_IMAGE_ONLY_02 2.751 2.244 1.472 1.230
score HTML_IMAGE_ONLY_04 1.898 1.527 1.136 1.001
score HTML_IMAGE_ONLY_06 1.531 1.709 0.527 1.439
score HTML_IMAGE_ONLY_08 0.525 0.837 0 0
score HTML_IMAGE_ONLY_10 0.615 1.138 0.431 0.019
score HTML_IMAGE_ONLY_12 0.787 1.012 0.483 0
score HTML_IMAGE_RATIO_04 0.821 0.892 0.667 1.050
score HTML_IMAGE_RATIO_06 0.935 0.317 0.649 0
score HTML_IMAGE_RATIO_08 0.605 0.408 0.413 0.359
score HTML_IMAGE_RATIO_10 0.535 0.488 0.619 0.315
score HTML_IMAGE_RATIO_12 0.324 0 0 0
score HTML_IMAGE_RATIO_14 0 0.276 0 0
score HTML_IMAGE_AREA_04 0
score HTML_IMAGE_AREA_09 0
score HTML_IMAGE_AREA_08 0
score HTML_IMAGE_RATIO_02 0
score HTML_IMAGE_AREA_06 0
Strangely enough (I'll never understand the "genetic algorithms" used to
generate these scores), some of the scores "in the middle" of a series
are 0. For example, in the fourth column HTML_IMAGE_ONLY_06 and
HTML_IMAGE_ONLY_10 are non-zero, but HTML_IMAGE_ONLY_08 is 0.
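For reference, here is how I read the four columns of a score line, with the column labels added as comments. As far as I know this is how SpamAssassin picks a score set (Bayes on or off, network tests on or off), but check the documentation for your version:

```
#                          set 0      set 1      set 2      set 3
#                          no Bayes   no Bayes   Bayes      Bayes
#                          no net     net        no net     net
score HTML_IMAGE_ONLY_06   1.531      1.709      0.527      1.439
score HTML_IMAGE_ONLY_08   0.525      0.837      0          0
score HTML_IMAGE_ONLY_10   0.615      1.138      0.431      0.019
```

So HTML_IMAGE_ONLY_08 only contributes anything when Bayes is off. A rule whose score is 0 in the active set is effectively disabled.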
What you can do is raise these scores in spam.assassin.conf so that a
message matching them is more likely to go over your spam threshold.
One thing I've seen is messages that apparently consist only of an
image but also carry hidden text (same color as the background), even
specially crafted "non-spam-looking" text that lowers the score and
avoids some of these rules. I've even seen almost identical messages
score above 5 one day and below 3 the next. Evidently many spammers
are checking their messages with SpamAssassin and adjusting them.
Playing around with some scores (especially raising these "0" scores)
might help you a lot, but be careful with false positives and check
your logs.
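A rough sketch of what such overrides could look like. The numbers are made up for illustration, not tuned against any corpus, so treat them as a starting point only:

```
# Hypothetical values: adjust and watch your logs for false positives.
# A single number applies to all four score sets.
score HTML_IMAGE_AREA_06   0.5
score HTML_IMAGE_AREA_08   0.5
score HTML_IMAGE_RATIO_02  1.0

# HTML_IMAGE_ONLY_08 is 0 only in the two Bayes columns, so keep the
# stock values for the first two sets and fill in the last two.
score HTML_IMAGE_ONLY_08   0.525 0.837 0.5 0.5
```

Start low and raise the values gradually. A rule that was scored 0 by default may hit more legitimate mail than you expect.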
HTH
On 4 Dec 2003 at 9:04, Jody Cleveland wrote:
> Hello,
>
> I've noticed a new trend with spam lately. I've been getting emails that
> are one big image, which aren't caught by mailscanner or spamassassin.
> Is there a rule somewhere, where I can specify that if an email contains
> only an image to tag it as spam?
>
>
> --
> Jody Cleveland
> (cleveland at mail.winnefox.org)
--
Mariano Absatz
El Baby
----------------------------------------------------------
Suicidal twin kills sister by mistake!