MailScanner, huge bayes_toks, out-of-mem problem

Robert Waldner waldner at WALDNER.PRIV.AT
Tue Nov 25 09:10:27 GMT 2003


Running MailScanner (MS) 4.25-6 and SpamAssassin (SA) 2.60-2 on Debian GNU/Linux.

While tracing out-of-memory problems, I noticed that once in a while a 
 MailScanner process (via SA, I guess?) will read from bayes_toks until 
 memory is exhausted, i.e. it does nothing but repeat

pread(6, "\0\0\0\0\1\0\0\0x\37\0\0\0\0\0\0\nK\0\0r\1\1\3\0\2\363"..., 4096, 32997376) = 4096
mremap(0x424cb000, 10108928, 10108928, MREMAP_MAYMOVE) = 0x424cb000
pread(6, "\0\0\0\0\1\0\0\0\nK\0\0x\37\0\0\0\0\0\0\230\0<\n\0\2\347"..., 4096, 78684160) = 4096
pread(6, "\0\0\0\0\1\0\0\0y\37\0\0\0\0\0\0|P\0\0t\1\5\3\0\2\374\17"..., 4096, 33001472) = 4096
mremap(0x424cb000, 10108928, 10113024, MREMAP_MAYMOVE) = 0x424cb000

fd 6 -> /var/spool/.spamassassin/bayes_toks
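
To put a number on the growth, the mremap() lines of such a trace can be 
summed up. This is only a minimal sketch: it assumes the strace output 
has been saved as text lines, and the regex and the function name 
mremap_growth are made up for illustration, not part of any tool.

```python
import re

# Matches strace lines such as:
#   mremap(0x424cb000, 10108928, 10113024, MREMAP_MAYMOVE) = 0x424cb000
# Group 1 is the old mapping size, group 2 the new one (both in bytes).
MREMAP_RE = re.compile(r"mremap\(0x[0-9a-f]+, (\d+), (\d+), [A-Z_|]+\)")


def mremap_growth(trace_lines):
    """Return the net number of bytes by which mremap() calls grew mappings."""
    total = 0
    for line in trace_lines:
        match = MREMAP_RE.search(line)
        if match:
            old_size = int(match.group(1))
            new_size = int(match.group(2))
            total += new_size - old_size
    return total


# The two mremap() calls from the excerpt above.
sample = [
    "mremap(0x424cb000, 10108928, 10108928, MREMAP_MAYMOVE) = 0x424cb000",
    "mremap(0x424cb000, 10108928, 10113024, MREMAP_MAYMOVE) = 0x424cb000",
]
print(mremap_growth(sample))  # 4096
```

In the excerpt the mapping grows by 4096 bytes, the same as the size of 
each pread() on fd 6.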

Ok, bayes_toks is _huge_, ~ 82M, but what I'm trying to understand is:
 - how does MailScanner (SA?) determine which process will read in the whole 
   file?
 - how does the size of bayes_toks correspond to the size of the 
   MailScanner-process? bayes_toks is ~ 82M, I've seen 
   MailScanner-processes with size 203M (RSS 58M), but usually they're 
   size < 50k, RSS < 30M (all values according to `top`)
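
For comparing these numbers without `top`, the same two values can be 
read from /proc. A minimal sketch, assuming a Linux /proc filesystem 
whose per-process status file carries VmSize and VmRSS lines in kB; the 
function names are hypothetical, and the sample text is just the 
203M / 58M figures from above converted to kB.

```python
def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value looks like "  207872 kB"; keep only the number.
            fields[key] = int(rest.split()[0])
    return fields


def read_vm_fields(pid):
    """Read VmSize/VmRSS for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_fields(status_file.read())


# Illustrative input: 203M virtual size and 58M RSS, expressed in kB.
sample_status = "Name:\tperl\nVmSize:\t  207872 kB\nVmRSS:\t   59392 kB\n"
print(parse_vm_fields(sample_status))
```

Calling read_vm_fields() for each MailScanner pid at intervals would show 
whether one child keeps growing while the others stay flat.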

Yes, I could do with a smaller ham/spam corpus, but what I'd like to 
 know before I nuke the bayes-db is whether that's the problem, and why. 
 Maybe I could just throw in more memory; the box currently has 144 MB 
 RAM and 256 MB swap.

-- If a maintenance programmer can't quote entire Monty Python
-- movies from memory, he or she has no business being a programmer. 
