From matt at coders.co.uk  Thu Oct  5 07:42:47 2006
From: matt at coders.co.uk (Matt Hampton)
Date: Thu, 05 Oct 2006 08:42:47 +0100
Subject: Extension I'm working on
Message-ID: <4524B777.6040109@coders.co.uk>

I was bored last night so I have started writing an extension to
MailScanner which sits alongside the content rules.

Basically it reads the message, identifies any URLs (actually I'm
hacking the Phishing code to store the URLs for me to save parsing the
message twice) and then it looks up the full URL in a DNS lookup; if that
returns NXDOMAIN then it looks up the host part.

It does this via an md5 of the URL so that it is anonymous.

A successful DNS lookup will return a TXT record of the format:

"Adult/Image_Galleries/Fetishes"

The data is based on the DMOZ classification.

This could then be used as the basis of a block list:

"Adult/Image_Galleries/Midgets" Allow  # That has to be worth a look ;-)
"Adult"                         Block

I currently only have the "Adult" branch of the data in a DNS zone -
this has 46,000 entries and is running at 3.2 MB. The full zone file is
0.5 GB.

Is this worth continuing with? Can anyone see a use for this?

matt

From ka at pacific.net  Thu Oct  5 15:20:12 2006
From: ka at pacific.net (Ken A)
Date: Thu, 05 Oct 2006 08:20:12 -0700
Subject: Extension I'm working on
In-Reply-To: <4524B777.6040109@coders.co.uk>
References: <4524B777.6040109@coders.co.uk>
Message-ID: <452522AC.2050805@pacific.net>

Sounds interesting. How do you generate the DNS zone?
Why not make it an SA plugin, similar to the 'urirhssub' tests, so it
could take advantage of the scoring system and wider testing?

Ken A.
Pacific.Net

Matt Hampton wrote:
> I was bored last night so I have started writing an extension to
> MailScanner which sits alongside the content rules.
>
> Basically it reads the message, identifies any URLs (actually I'm
> hacking the Phishing code to store the URLs for me to save parsing the
> message twice) and then it looks up the full URL in a DNS lookup; if that
> returns NXDOMAIN then it looks up the host part.
>
> It does this via an md5 of the URL so that it is anonymous.
>
> A successful DNS lookup will return a TXT record of the format:
>
> "Adult/Image_Galleries/Fetishes"
>
> The data is based on the DMOZ classification.
>
> This could then be used as the basis of a block list:
>
> "Adult/Image_Galleries/Midgets" Allow  # That has to be worth a look ;-)
> "Adult"                         Block
>
> I currently only have the "Adult" branch of the data in a DNS zone -
> this has 46,000 entries and is running at 3.2 MB. The full zone file is
> 0.5 GB.
>
> Is this worth continuing with? Can anyone see a use for this?
>
> matt

From matt at coders.co.uk  Thu Oct  5 16:26:11 2006
From: matt at coders.co.uk (Matt Hampton)
Date: Thu, 05 Oct 2006 17:26:11 +0100
Subject: Extension I'm working on
In-Reply-To: <452522AC.2050805@pacific.net>
References: <4524B777.6040109@coders.co.uk> <452522AC.2050805@pacific.net>
Message-ID: <45253223.5000400@coders.co.uk>

Ken A wrote:
> Sounds interesting. How do you generate the DNS zone?

I take a dump from DMOZ of their complete directory (2 GB of
uncompressed data) and parse it. I calculate the md5sum of each URL and
then use this as the basis of the lookup, i.e.

abcdef01234567890abcdef01234567890.zone.file

will return the TXT record of the appropriate classification. This
method means that the complete URL can be tested anonymously.

> Why not make it an SA plugin, similar to the 'urirhssub' tests, so it
> could take advantage of scoring system and wider testing?

Because it may not be spam as such. It might be someone mailing a
friend - "did you see this film last night: http://link". In this case
it won't be spam; it is more a content thing and so more relevant to
MailScanner than SA.
Well that was my thinking! It would also make more sense to "block",
which gives the option of sending a "this email was inappropriate"
notice to the recipient rather than just dropping the message because
it was classified as spam.

Thanks for the comments.

Matt
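
For anyone following the thread, the lookup and rule matching described above could be sketched roughly as follows. This is only an illustration in Python, not the MailScanner extension itself. The zone name `zone.example`, the function names, the first-match-wins rule order and the default action of "Allow" are all assumptions; the thread does not say whether URLs are normalised before hashing, so the URL is hashed exactly as given.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical zone name; the thread does not name the real zone.
ZONE = "zone.example"


def lookup_names(url, zone=ZONE):
    """Return the DNS names to query, in order: md5 of the full URL,
    then md5 of the host part (the fallback used on NXDOMAIN)."""
    host = urlsplit(url).hostname or ""
    candidates = [url, host]
    return [
        hashlib.md5(c.encode("utf-8")).hexdigest() + "." + zone
        for c in candidates
        if c
    ]


def decide(category, rules, default="Allow"):
    """Return the action of the first rule matching the category.

    A rule matches the category itself or any sub-branch of it.
    First-match-wins and the default action are assumptions.
    """
    for prefix, action in rules:
        if category == prefix or category.startswith(prefix + "/"):
            return action
    return default


# The two example rules from the first message, most specific first.
RULES = [
    ("Adult/Image_Galleries/Midgets", "Allow"),
    ("Adult", "Block"),
]

if __name__ == "__main__":
    for name in lookup_names("http://example.com/some/page.html"):
        print(name)  # names a resolver would query for a TXT record
    print(decide("Adult/Image_Galleries/Fetishes", RULES))
```

The TXT query itself is left out so the sketch needs no network access or third-party resolver library; the string returned in the TXT record would be passed to `decide` as the category.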