Ways to filter other document types

Glenn Steen glenn.steen at gmail.com
Sat Sep 9 09:41:34 IST 2006


On 08/09/06, Alex Neuman van der Hans <alex at nkpanama.com> wrote:
> Any ideas on how to block word or pdf spam, as mentioned in:
>
> http://searchsecurity.techtarget.com/columnItem/0,294698,sid14_gci1214687,00.html
>
> It could be an interesting option to add to the phishing filter or to
> SA, as spammers are now using these formats. Something like the TNEF
> unpack-then-repack approach we're getting now, would probably be the way
> to go.
>
> Anybody already doing something similar?

Not doing so, no. But it should be possible to take the same type of
approach as used within htdig and other web indexers... Just use a
program that strips all the gunk, and look at the actual text :-). Isn't
there some SA plugin for this already (akin to FuzzyOcr...)?
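
As a rough illustration of the "strip the gunk" idea, here is a minimal
Python sketch that hands an attachment to an external converter and
returns whatever plain text comes back. The choice of pdftotext and
antiword is my assumption (any converter that writes text to stdout
would do), and extract_text and CONVERTERS are hypothetical names, not
part of MailScanner or SA.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed mapping from file extension to a converter command line.
# "{path}" is replaced by the attachment's path; both tools are expected
# to write plain text to stdout when invoked this way.
CONVERTERS = {
    ".pdf": ["pdftotext", "{path}", "-"],
    ".doc": ["antiword", "{path}"],
}


def extract_text(path, converters=CONVERTERS, timeout=30):
    """Return the plain text of an attachment, or None if it cannot be converted."""
    template = converters.get(Path(path).suffix.lower())
    # No converter configured for this type, or the program is not installed.
    if template is None or shutil.which(template[0]) is None:
        return None
    cmd = [arg.replace("{path}", str(path)) for arg in template]
    try:
        result = subprocess.run(
            cmd, capture_output=True, timeout=timeout, check=True
        )
    except (subprocess.SubprocessError, OSError):
        # Converter failed, timed out, or could not be started.
        return None
    return result.stdout.decode("utf-8", errors="replace")
```

The text that comes back could then be fed to whatever content checks
already run on the message body. Returning None on any failure means a
broken or oversized document simply goes unscanned instead of holding
up the mail.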

-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

