Log analyzer

Sat Jun 26 08:06:29 IST 2004

On Jun 25, 2004, at 3:17 AM, Steve Freegard wrote:

> John,
>
>> Oh... hm.  Except for the php and mysql parts, yeah :-}
>>
>> I'll have to think more about it though.  Maybe it wouldn't be such a
>> bad thing to run it that way.  It just wasn't the way I was thinking of
>> running it (I was thinking of a basic Perl script that runs against
>> syslog output only (no database) and spits out a textual report).
>>
>
> Fine - conceivably with a bit of work you could get rid of the PHP/Apache
> parts and run the MySQL database on a separate box and have MailScanner log
> to that, then take all the SQL from the MailWatch reports and use Perl to
> query the database and produce text-based reports instead.
>

Right, but I want to remove the SQL part too.

>
>> When you say "does almost everything", which part(s) does it not do?
>
> I'm rather cautious of saying 'it does everything' - but the only things
> from your list it doesn't do exactly are:
>
>>> 3) did it have a virus, and if the log knows, which one? and
>>> if it did, was it deleted as a silent virus?
>
>> (and how did you go about determining when viruses were being deleted?
>> or do you still deliver silent viruses?  I'm thinking I might start
>> doing that.)
>
> It will show the virus name if infected but I don't check if it's a silent
> virus or not - however, if you are using Sendmail - then you can tell if a
> message was delivered and when or where to as the MailWatch add-on
> 'sendmail_relay' records all the relay lines scraped in real-time from the
> maillog (I'm actually about to CVS commit a new version that records relay
> information, RBL rejections, Unknown Users and Unresolveable Domains which
> will be in the next release).

But sendmail only reports it if sendmail sees it.  If mailscanner
deletes a silent virus, then sendmail never sees the message again, so
sendmail can't tell you "oh, that got deleted".

Plus, the place where I'm actually concerned about this isn't sendmail.
It's in my glue scripts for using MailScanner with CommuniGate Pro.
I'll get back to this later.

> Why the fascination with silent viruses - personally I can't think of a good
> reason to want to report on these??

So that you know what happened to the message?  Right now, you don't
really know what happened to the message.  What you know:

1) sendmail accepted the message
2) when mailscanner finds messages to scan, you find out how many
messages, but NOT _which_ messages, so you never know with certainty
when a particular message was picked up (nor _IF_ it was picked up).
3) if a virus or dangerous content was found in a message, you get a
report of that, and what it was
4) if the message is marked as spam, you get the spamcheck output.
5) if the message was spam, you get the spam actions

What you don't know:

a) exactly when a message was picked up by mailscanner
b) what virus or dangerous content actions were applied to it
c) when mailscanner finished with a message, and if no
(virus,content,spam) actions were applied, what mailscanner did with
it.

You can _assume_ that mailscanner did certain things at certain times,
but you don't _know_ with certainty because whole sets of actions
aren't being logged _by_mailscanner_.
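
To make that gap concrete, here is a minimal sketch in Python of the kind of per-message timeline a syslog-only analyzer would need to build. The line format and the event names (accepted, scan_start, virus_found, deleted) are hypothetical, invented for illustration only; they are not what MailScanner or sendmail actually write to syslog. The whole approach depends on every log line carrying a message id, which is exactly what is missing today.

```python
import re
from collections import defaultdict

# Hypothetical log format, for illustration only:
#   "<timestamp> <program>: <message-id> <event> [detail]"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<prog>\w+): (?P<msgid>\S+) (?P<event>\w+)(?: (?P<detail>.*))?$"
)


def build_timelines(lines):
    """Group log events by message id, preserving the order they were logged."""
    timelines = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed format
        timelines[m["msgid"]].append(
            (m["ts"], m["prog"], m["event"], m["detail"] or "")
        )
    return dict(timelines)


def format_report(timelines):
    """Render a plain-text report: one block of events per message id."""
    out = []
    for msgid in sorted(timelines):
        out.append(msgid)
        for ts, prog, event, detail in timelines[msgid]:
            out.append(f"  {ts} {prog:<12} {event} {detail}".rstrip())
    return "\n".join(out)
```

Feeding it the relevant syslog lines and printing format_report(build_timelines(lines)) would give one block per message, with nothing left to assumption as long as each step was actually logged.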

>>> 6) what spam actions were applied to it?
>
> It does this - providing you don't use a ruleset as currently I haven't been
> brave enough to try and write a MailScanner ruleset parser.

I'm not sure why you would need to parse a ruleset for this.  The
MailScanner syslog output should tell you what actions were applied to
the message.

> <SNIP>
>>> So, then I can run a report which will tell me, with absolute certainty,
>>> exactly what happened to each and every message.
> </SNIP>
>
> Again using the sendmail_relay add-on - this is easy as each message then
> carries a log of when it was sent, where it was sent (hostname of the
> destination MX), which host sent the message (if you have multiple scanners
> logging to a single database) and what the response was from the remote
> server (e.g. 'Message queued for delivery (id=i23489dfsd)').

Right, but that's a sendmail report.  That means it only gets generated
if sendmail sees it.  If mailscanner loses it (properly or improperly)
then sendmail can't generate that log entry.  So, you don't actually
_know_ what happened to the message.

>>> And, from that, I can
>>> perhaps do a grep (or something) that will look for messages that had
>>> certain characteristics, or determine my average spam score (which I
>>> can't do now, because MS only reports messages that were marked as
>>> spam), or see that "the reason this message never arrived is because it
>>> contained a virus" or something.  Or, tell me "W messages in, X messages
>>> delivered/relayed, Y messages still processing or in the mqueue, Z
>>> messages missing." and then tell me _which_ messages are missing (so I
>>> can inform the sender and maybe the original recipients).
>>>
>
> Erm - I've *never* seen MailScanner 'lose' a message - from the dual MTA
> design it isn't possible.

Here's what I've seen:

CommuniGate Pro issues a rule action that says "invoke the
CommuniGatePro-to-MailScanner converter on this message".

No report of any message with that Message-ID ever coming back.

I don't believe mailscanner was the thing that dropped the message.
But the point is, I don't know.  I can't know.  I can't know because
the mailscanner logging is inadequate.  Thus, when my boss says "what
happened to that message?"  I can't tell him.  I can't prove to him
where the fault was.
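
The check itself is trivial once both ends log a Message-ID. A minimal sketch in Python, assuming two lists of ids have already been pulled out of the logs (how they get extracted is left out, and the function name is made up for illustration):

```python
def unaccounted(handed_off, came_back):
    """Return ids that were handed off but never seen coming back.

    Both arguments are iterables of Message-ID strings; the result keeps
    the hand-off order so it lines up with the original log.
    """
    seen = set(came_back)
    return [msgid for msgid in handed_off if msgid not in seen]
```

This only says that a message went missing between the two logging points. It says nothing about which component dropped it, which is why the logging in between matters.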

When he says "I want you to develop some means of picking any random
message and proving its exact path through the system", I can't.
Because MailScanner doesn't tell us (even on our legacy sendmail
systems, MailScanner doesn't tell us) what it does with the message.
If it really was a silent virus, I don't know what happened to it (and,
keep in mind, I'm trying to _prove_ what happened to the
message, not just make random conjecture about it, so I can't assume
"that's missing because it must have been a silent virus" -- either I
know exactly what happened to it, or I don't).
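
As a rough sketch of that "know or don't know" rule in Python: assume each message has a list of logged event names, and only an explicit terminal event counts as proof. The event names (delivered, relayed, deleted, quarantined) are hypothetical placeholders, not real MailScanner or sendmail log output.

```python
from collections import Counter

# Hypothetical terminal event names, for illustration only.
DELIVERED = {"delivered", "relayed"}
DISPOSED = {"deleted", "quarantined"}


def classify(events):
    """Classify one message from its logged event names.

    Anything without an explicit terminal event is "unknown": nothing is
    inferred from what probably happened.
    """
    names = set(events)
    if names & DELIVERED:
        return "delivered"
    if names & DISPOSED:
        return "disposed"
    return "unknown"


def summarize(timelines):
    """Count messages per outcome; timelines maps message id -> event names."""
    return Counter(classify(events) for events in timelines.values())
```

The "unknown" bucket is the interesting one: listing the ids that land there gives the set of messages nobody can currently account for.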

I personally believe the fault is in my 2nd glue script (when MailScanner
calls sendmail2, that's actually a script that translates back to CGP
format and submits it back as a new message), but I can't prove it, and
I can't move forward with re-enabling my scripts until I can prove it.
Which means having to keep my legacy systems around for scanning for a
while longer than I wanted to.
