Custom function for 'Required SpamAssassin Score' runs multiple times for each message

Jules Field MailScanner at ecs.soton.ac.uk
Mon Aug 3 18:29:10 IST 2009


Some of the MailScanner.conf settings are looked up more than once. This 
is necessary.

All you need to do in your Custom Function is add a fast-expiring cache 
for your results data.
This is dead easy to do in Perl: keep one hash mapping message-id to 
result, and another mapping message-id to an expiry time a few seconds 
into the future (e.g. $expiry{$id} = time+10;). If you get a lookup for 
message $id then you look to see if time>$expiry{$id}. If it's not then 
the cache is valid and you return $mycache{$id}, and if it has expired 
then you delete $mycache{$id} and work it out afresh.

Very easy to code and gives you a very fast lookup.

That's why I have never worried about it. Perl is a perfect language in 
which to implement a cache in about 5 lines of code, so I leave you to 
do it if you need to.
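
For illustration, here is a minimal sketch of the fast-expiring cache 
described above. It is not code from MailScanner itself: compute_score() 
is a hypothetical stand-in for whatever expensive work your Custom 
Function does (such as the SQL query), the 10-second lifetime is just 
the example value from above, and $real_lookups exists only to show how 
often the real work happens.

```perl
use strict;
use warnings;

my %mycache;           # message-id -> cached result
my %expiry;            # message-id -> time after which the entry is stale
my $ttl = 10;          # seconds a result stays valid (assumed value)
my $real_lookups = 0;  # illustration only: counts non-cached lookups

# Hypothetical stand-in for the expensive lookup (e.g. the SQL query).
sub compute_score {
    my ($id) = @_;
    $real_lookups++;
    return 5;
}

sub cached_score {
    my ($id) = @_;
    my $now = time;

    # Cache is valid if we have an entry and it has not expired yet.
    if (exists $expiry{$id} && $now <= $expiry{$id}) {
        return $mycache{$id};
    }

    # Missing or expired: throw away any old value and work it out afresh.
    delete $mycache{$id};
    $mycache{$id} = compute_score($id);
    $expiry{$id}  = $now + $ttl;
    return $mycache{$id};
}
```

Note that this sketch only replaces an entry when the same id is looked 
up again, so entries for old messages stay in the hashes. In a 
long-running process you would probably want to clear them out now and 
then.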

Sometimes you find it will actually always look up your value for the 
*same* message 5 times in a row, and not look it up for any other 
message in between, even if there is a large message batch. In that case 
all you need to store is the id of the last message and the result you 
calculated. If the id is the same as last time, return the 
previously-calculated result, else work it out and store it for next 
time. That's even simpler to code.
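
A sketch of that simpler "last message only" version might look like 
this. Again, work_out_score() is a hypothetical placeholder for your 
real lookup (here it just returns the length of the id so the example 
runs), and $fresh_calcs is only there to show how often the real work 
is done.

```perl
use strict;
use warnings;

my $last_id;          # id of the most recently looked-up message
my $last_result;      # result calculated for that message
my $fresh_calcs = 0;  # illustration only: counts real calculations

# Hypothetical stand-in for the real work; returns a dummy value.
sub work_out_score {
    my ($id) = @_;
    $fresh_calcs++;
    return length $id;
}

sub last_message_score {
    my ($id) = @_;

    # Same message as last time: return the previously-calculated result.
    return $last_result if defined $last_id && $id eq $last_id;

    # Different message: work it out and store it for next time.
    $last_result = work_out_score($id);
    $last_id     = $id;
    return $last_result;
}
```

This only helps if the repeated lookups for one message really do arrive 
back to back. If lookups for different messages are interleaved, every 
switch causes a fresh calculation.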

But do ensure you check what happens when the message batch size > 1, so 
you know whether a cache is needed, or just a simple "did we just work 
that value out?" question.

Jules.

On 03/08/2009 17:34, Blatter, Nicholas wrote:
> I have been working on a new install of MailScanner and have run into a
> potential problem when using a custom function for the 'Required
> SpamAssassin Score' setting.
>
> The custom function is a modified version of the SQLSpamScore function
> written by Julian Field.  It allows our users to customize their SA spam
> score via the web and appears to be working correctly except that
> MailScanner seems to be running the function several times for each
> message that the server receives.
>
> I've modified the function to be more verbose and have it write the
> current message ID and SQL result to the log every time the main
> function is called.  I'm seeing the same MailScanner instance (same pid
> each time) call the function 4 times, each time with the exact same
> message ID and SQL result.  The number of times doesn't seem to be
> dependent upon the number of MailScanner children.
>
> I hope this isn't too much of a stupid question, but I searched around
> and couldn't see a reason that the custom function would be called more
> than once.  I am also using SQL-based functions for many other config
> options (Spam Checks, Is Definitely (Not) Spam, etc) and have no such
> problems with those.
>
> Thanks for your time,
>
> Nick
>    


-- 
Julian Field MEng CITP CEng
www.MailScanner.info
Buy the MailScanner book at www.MailScanner.info/store


