MS/perl segfaults

Thu Jan 22 23:11:50 GMT 2009

David Lee wrote:
>
> So suppose we continue to model this using "timestamp in a database" 
> thinking, but actually store, read and process those timestamps in the 
> inbound file itself.  I realise that this implementation detail will 
> be MTA-specific, but I think that might slot cleanly into MS's 
> existing MTA-specific code.  (Julian?)
>
Personally I think that a database (sqlite) would be more appropriate, as 
we DO NOT KNOW what is causing the fault - reading the file from disk 
could be causing it.

We got hit by this yesterday (unfortunately the queue didn't get saved, 
as a colleague resolved it while I was in hospital with my little boy 
:-( ).  My thinking would be:

Record in a table a <Child PID>, <timestamp>, <MSGID> entry when a 
message batch is started.  When a message is placed in the delivery 
queue, ALL its records are deleted.
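
A rough sketch of that bookkeeping, in Python with the standard sqlite3 
module purely for illustration (MS itself is Perl).  The table name 
"attempts", the column names and the function names are all made up for 
this example, and it assumes one row per message in the batch, with only 
that message's rows being removed on delivery:

```python
import sqlite3
import time


def open_tracker(path=":memory:"):
    """Open (or create) the hypothetical attempts table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attempts ("
        " child_pid INTEGER NOT NULL,"
        " started_at REAL NOT NULL,"
        " msgid TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS attempts_msgid ON attempts (msgid)"
    )
    return conn


def record_batch_start(conn, child_pid, msgids, now=None):
    """Insert one row per message when a child starts a batch."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO attempts (child_pid, started_at, msgid)"
            " VALUES (?, ?, ?)",
            [(child_pid, now, msgid) for msgid in msgids],
        )


def record_delivered(conn, msgid):
    """Delete every row for a message once it reaches the delivery queue."""
    with conn:
        conn.execute("DELETE FROM attempts WHERE msgid = ?", (msgid,))


def attempt_count(conn, msgid):
    """Number of batch starts recorded for a message and not yet cleared."""
    row = conn.execute(
        "SELECT COUNT(*) FROM attempts WHERE msgid = ?", (msgid,)
    ).fetchone()
    return row[0]
```

A message that keeps appearing in batches that die before delivery 
simply accumulates rows, so the count is all that needs checking.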

If a message gets three or more entries in the table, delay its 
processing by a random amount of time (say 3-9 minutes) multiplied by 
the number of failures, to try to ensure that it moves around batches.  
The message causing the failure should then repeatedly end up in a 
different batch.
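
The delay calculation might look something like this (Python again, 
only as an illustration).  The constant and function names are invented 
for the example, the 3-9 minute range is just the "say" figure above, 
and it assumes the failure count is the number of entries the message 
has in the table:

```python
import random

FAILURE_THRESHOLD = 3   # entries in the table before we start backing off
MIN_DELAY_MINUTES = 3   # lower bound of the random delay
MAX_DELAY_MINUTES = 9   # upper bound of the random delay


def backoff_seconds(failures, rng=random):
    """Return how long to hold a message back, in seconds.

    Below the threshold the message is processed normally (no delay).
    Otherwise a random 3-9 minute delay is multiplied by the failure
    count, so messages from the same failed batch spread out in time.
    """
    if failures < FAILURE_THRESHOLD:
        return 0.0
    minutes = rng.uniform(MIN_DELAY_MINUTES, MAX_DELAY_MINUTES)
    return minutes * failures * 60.0
```

With three failures that gives a hold of somewhere between 9 and 27 
minutes, and because each message draws its own random number, two 
messages from the same bad batch are unlikely to come due together.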

Benefits:
  - Does not rely on a file timestamp - simply on the failure count 
    and then a random delay.
  - Should cope with multiple files causing failure.
  - Files likely to be causing failures rapidly get backed off from 
    processing, which should ensure mail continues to flow.

Negatives:
  - Causes valid email in a failure batch to be backed off - however, 
    multiplying by a random number should cause them to be processed 
    in a separate batch.
  - Requires a database - however, the cache already uses sqlite.

matt