It seems that viruses CAN slip through MailScanner under high load!

Wed Sep 3 14:52:33 IST 2003

Please can you double and triple check that you have the correct version of
SweepViruses.pm.
Please run
         pwd
         ls -l SweepViruses.pm
         sum SweepViruses.pm
and mail me the output.

At 14:41 03/09/2003, you wrote:
>Hi
>
>Bad news I'm afraid. We've just upgraded to MailScanner 4.23-11 and
>viruses are still slipping through.  Admittedly our server is still
>under load.
>
>Thanks for any help.
>
>Joan
>
>
>
>On Fri, 29 Aug 2003 03:16:47 +0100 Brian Hoy <brian.hoy at OPUS.CO.NZ>
>wrote:
>
> > Hi all,
> >
> > Thanks to everyone for their comments and advice.  It is very much
> > appreciated.  And especially to Julian for finding and fixing the
> > problem so quickly!
> >
> > Our sendmail config does have the load settings configured that many of you
> > mentioned, but still the mail was flowing in!  The input queue was growing
> > faster than MailScanner could scan it, and the problem just kept
> > compounding.
> >
> > The reason is that the "load average" stats are not always a good
> > measure of the real stress that the machine is under.  If a machine is
> > heavily using swap space, then the disks and motherboard I/O bandwidth
> > are being consumed (and CPU also if the disks are ATA, rather than
> > SCSI), yet no useful work is being done.
> >
> > If a process is waiting on a page fault, I do not think that it is
> > placed in the OS's run queue until the page is loaded (and another page
> > swapped out - still more disk I/O!).  If this is true then the load
> > average does not increase, yet the machine is clearly starting to
> > struggle with the load.  This is what happened to us the other day.
> >
> > If you want to experiment with this idea, compile this C program:
> >
> > // Compile with gcc -o vm_tester vm_tester.c
> > //
> > #include <stdio.h>
> > #include <stdlib.h>   /* malloc(), exit(), EXIT_FAILURE */
> >
> > #define NUM_PASSES 10
> > #define MB_TO_ALLOC 128
> > #define BYTES_TO_ALLOC (MB_TO_ALLOC * 1024*1024)
> >
> > int main(void)
> > {
> >   char *mem;
> >   int pass, r, c;
> >
> >   if ((mem = (char *) malloc(BYTES_TO_ALLOC)) == NULL)
> >   {
> >     fprintf(stderr, "malloc() failed\n");
> >     exit(EXIT_FAILURE);
> >   }
> >
> >   for (pass=0; pass<NUM_PASSES; pass++)
> >   {
> >     /* Walk the buffer column by column: consecutive accesses are
> >        4096 bytes apart, so each one lands on a different page. */
> >     for (c=0; c<4096; c++)
> >     {
> >       for (r=0; r<BYTES_TO_ALLOC/4096; r++)
> >       {
> >         mem[r*4096 + c]++;
> >       }
> >     }
> >   }
> >
> >   return 0;
> > }
> >
> > // -----------------------------------------------
> >
> > It allocates 128M of RAM, and increments bytes in a way that generates as
> > many page faults as possible.  As an initial suggestion, run as many of
> > these programs as needed to consume all your RAM and watch your other
> > processes struggle to get a slice of the CPU.  BTW, don't do this on a
> > production server, or try to consume more memory than your total VM - you
> > have been warned!
> >
> > Use top and vmstat to watch things.  If you start running more of these
> > programs, then you find that the load average does not increase that much,
> > but your disks are flat out, and machine responsiveness goes right out the
> > window (esp on ATA disks).
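
To put a number on the page-fault effect rather than judging it by eye in top and vmstat, something like the following might help. It is only a sketch: the function names are made up for illustration, the 4096-byte stride assumes the same page size as the program above, and it relies on getrusage() filling in ru_minflt and ru_majflt, which Linux does but not every Unix guarantees.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

#define PAGE_STRIDE 4096  /* assumed page size, as in the program above */

/* Page faults (minor + major) charged to this process so far,
   or -1 if getrusage() fails. */
long total_page_faults(void)
{
  struct rusage ru;

  if (getrusage(RUSAGE_SELF, &ru) != 0)
    return -1;
  return ru.ru_minflt + ru.ru_majflt;
}

/* Allocate `bytes`, write one byte in every page-sized stride, and
   return how many page faults that caused (-1 on any failure). */
long faults_from_touching(size_t bytes)
{
  volatile char *mem;   /* volatile so the writes are not optimised away */
  long before, after;
  size_t off;

  assert(bytes >= PAGE_STRIDE);
  mem = malloc(bytes);
  if (mem == NULL)
    return -1;

  before = total_page_faults();
  for (off = 0; off < bytes; off += PAGE_STRIDE)
    mem[off] = 1;
  after = total_page_faults();

  free((void *) mem);
  if (before < 0 || after < 0)
    return -1;
  return after - before;
}
```

Calling faults_from_touching() from a small main() and printing the result gives the fault count for one pass. On a machine with free RAM these will nearly all be minor faults; the major faults that hit the disks only appear once the box is actually swapping.
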
> >
> > I still think my suggestion (in my first post) for an "unfair" way of
> > selecting messages for scanning under "high load" has merit.  When our mail
> > gateway was stressed out the other day, I was using strace to monitor the
> > system calls in the MailScanner processes, and they were spending 5-30 mins
> > just doing the stat() calls before locking messages for scanning.
> >
> > When your machine is really overloaded, let's do anything to concentrate
> > the meagre available resources on clearing the queue in the most
> > expedient fashion.
> >
> > Perhaps "high load" can be determined by the length of the input queue
> > (rather than the misleading system load average), and be user configurable.
> >
> > For example, if the input queue has in excess of 1000 messages waiting,
> > peel off any 30 for scanning.  Ensure that no other MailScanner process
> > evaluates the length of the queue until a user configurable time has
> > passed (15 mins?).  I know this is easier said than done, but I think it
> > really would help when the machine is steaming up shit creek.
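
The queue-length test described above could be sketched roughly as follows. This is not MailScanner code: the struct, the function names and the normal_batch field are all invented for illustration, and counting one "qf" file per message is an assumption about a sendmail-style queue directory. The count uses readdir() only, so it never calls stat().

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tunables; 1000 and 30 are the figures from the example. */
struct queue_policy {
  size_t high_load_threshold;  /* "high load" if more than this are waiting */
  size_t emergency_batch;      /* how many to peel off under high load */
  size_t normal_batch;         /* batch size otherwise (assumed setting) */
};

/* Count directory entries whose names start with `prefix` (e.g. "qf").
   Uses readdir() only, no stat().  Returns -1 if the directory cannot
   be opened. */
long count_prefixed_entries(const char *dir_path, const char *prefix)
{
  DIR *dir;
  struct dirent *entry;
  size_t prefix_len;
  long count = 0;

  assert(dir_path != NULL && prefix != NULL);
  prefix_len = strlen(prefix);
  dir = opendir(dir_path);
  if (dir == NULL)
    return -1;
  while ((entry = readdir(dir)) != NULL)
    if (strncmp(entry->d_name, prefix, prefix_len) == 0)
      count++;
  closedir(dir);
  return count;
}

/* Non-zero when the queue is longer than the configured threshold. */
int is_high_load(long queue_len, const struct queue_policy *p)
{
  return queue_len > 0 && (size_t) queue_len > p->high_load_threshold;
}

/* Batch size to use for the next scan, given the current queue length. */
size_t batch_size_for(long queue_len, const struct queue_policy *p)
{
  return is_high_load(queue_len, p) ? p->emergency_batch : p->normal_batch;
}
```

The "do not re-evaluate for 15 minutes" part is left out here; it would need some shared state between the MailScanner processes, which is the hard bit.
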
> >
> > Another thought... Sendmail names all its df and qf files such that an
> > alphabetical listing is sorted by ascending time order too!  If the other
> > MTAs are the same, then perhaps this fact could be used to remove all the
> > stat()s and still meet the fairness algorithm?
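
If the name ordering really does follow arrival time (that is the claim above, not something checked here), picking the oldest messages without any stat() could look roughly like this. It is a sketch only: oldest_by_name() and is_qf_file() are made-up names, the "qf" prefix is assumed, and alphasort() compares with strcoll(), so the order depends on the locale (in the default "C" locale it is plain byte order).

```c
#define _POSIX_C_SOURCE 200809L  /* for scandir() and alphasort() */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* scandir() filter: keep only control files, assumed to start with "qf". */
static int is_qf_file(const struct dirent *entry)
{
  return strncmp(entry->d_name, "qf", 2) == 0;
}

/* Copy the first max_n "qf" names, in ascending name order, into out.
   No stat() is made on any file.  Returns the number of names copied,
   or -1 if the directory cannot be scanned. */
int oldest_by_name(const char *dir_path, char out[][256], int max_n)
{
  struct dirent **names;
  int total, i, taken = 0;

  assert(dir_path != NULL && max_n >= 0);
  total = scandir(dir_path, &names, is_qf_file, alphasort);
  if (total < 0)
    return -1;

  for (i = 0; i < total; i++)
  {
    if (taken < max_n)
    {
      snprintf(out[taken], 256, "%s", names[i]->d_name);
      taken++;
    }
    free(names[i]);   /* scandir() allocates each entry separately */
  }
  free(names);
  return taken;
}
```

Note that scandir() still reads and sorts the whole directory, so this saves the per-file stat() calls but not the cost of listing a very large queue.
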
> >
> > Comments anyone?
> >
> > Regards,
> > Brian
>
>----------------------
>Joan Bryan
>Unix Systems Administrator
>Information Systems
>Telephone: +44 (0) 20 7848 2671
>mailto:joan.bryan at kcl.ac.uk

--
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support