SQL Redux

Magnusson, Andrew Andrew.Magnusson at COCC.COM
Mon Aug 4 15:14:17 IST 2003


I've got it as the default, inserting when the child dies (which of course
happens fairly regularly per the config file, so there's usually no more
than a 3- or 4-hour lag before any given email is entered into the
database).

Andrew Magnusson
Internet Product Analyst
COCC
1-877-678-0444 extension 640



-----Original Message-----
From: Chris-Bellsouth [mailto:ctrudeau at BELLSOUTH.NET]
Sent: Monday, August 04, 2003 10:07 AM
To: MAILSCANNER at JISCMAIL.AC.UK
Subject: Re: SQL Redux


I for one am interested in seeing this....as I have sort of a similar
requirement...

Are you comfortable with the process that performs the INSERTS only
executing when a child dies?  Or have you modified this so that the data in
the temp logfiles is INSERTed in more of a real-time fashion.

CT



----- Original Message -----
From: "Magnusson, Andrew" <Andrew.Magnusson at COCC.COM>
To: <MAILSCANNER at JISCMAIL.AC.UK>
Sent: Monday, August 04, 2003 9:29 AM
Subject: Re: SQL Redux


> Brief note: I'm the one who modified the code to split the data into three
> tables, in order to make it easier and more consistent for our internal
> reporting and analysis scripts. (Once they're complete, I'll see what I
can
> do about getting them released to you folks if anyone's interested. The
> focus is on email usage reports for a large number of email domains that
> pass through our servers.)
>
> If there's really no performance hit to logging after every email instead
of
> in batches, I'll see about changing my code to do it that way too. Nothing
> worse than having to wait for all of the children to finish logging before
I
> can do a full restart of MailScanner.
>
> Andrew Magnusson
> Internet Product Analyst
> COCC
> 1-877-678-0444 extension 640
>
>
>
> -----Original Message-----
> From: Steve Freegard [mailto:steve.freegard at LBSLTD.CO.UK]
> Sent: Monday, August 04, 2003 5:04 AM
> To: MAILSCANNER at JISCMAIL.AC.UK
> Subject: Re: SQL Redux
>
>
> Hi Trevor,
>
> >Could someone do a quick round-up?
>
> Certainly - there are two current methods for SQL Logging:
>
> 1) The code included in recent MailScanner CustomConfig.pm code
>
> Originally written by Julian with the SQL bits from me, laterly changed by
> someone else (not sure who though!).  This version uses temporary files to
> log the data and loads the data into MySQL every time MailScanner is
> restarted, or when it auto-restarts.
>
> It splits the data into three SQL tables - maillog_mail, maillog_recipient
> and maillog_report.
>
> Currently - I know of no interface that uses this schema.
>
> 2) The code included (as a patch to CustomConfig.pm) with MailWatch
>
> As Julian's original code, except that the temporary files have been
removed
> so that the data is inserted per message batch processed by MailScanner.
>
> This puts all data into a table called maillog which is then used by
> MailWatch for display and reporting.
>
> You can get MailWatch from http://www.smf.f2s.com/mailscanner/ which is
> currently at version 0.2 - there haven't been later patches for bugs - but
> there were a couple of files missed from the tarball when I created it
> (create.sql and CustomConfig.pm).
>
> Several people have had a problem when trying to use MailWatch with
FreeBSD
> - but the problem seems to be with Perl DBI and MySQL, and this is still
> under investigation.
>
> I'm currently awaiting for approval for a Sourceforge project for
MailWatch
> - which I should get in the next couple of days.  I'll release 0.3 shortly
> after which will contain the new MailWatch.pm file containing the SQL
> Logging routines making it easier to install (and easier for me and Julian
> to work out which code people are using!) and a couple of fixes.
>
> Kind regards,
> Steve.
>
> -----Original Message-----
> From: Julian Field [mailto:mailscanner at ECS.SOTON.AC.UK]
> Sent: 03 August 2003 21:38
> To: MAILSCANNER at JISCMAIL.AC.UK
> Subject: Re: SQL Redux
>
>
> You want MailWatch.
> Hopefully someone else can comment on its exact current state.
>
> At 20:32 01/08/2003, you wrote:
> >Ok, so um, for those of us who aren't keeping several patchset branches
in
> >our heads but who now want to start sql logging, what are the options?
> >Could someone do a quick round-up?
> >
> >I need to provide a web interface that provides plenty of detail about
each
> >message by middle of next week and I'm wondering whether I should do my
own
> >thing with a flat file log (since I have little time) or configure in
> >support for one of the existing sql logging mechanisms.
> >
> >"Mailwatch for Mailscanner" seems to be at version 0.2 but I recall
> mentions
> >of patches for bugs post 0.2.  Is there a later version available?
> >
> >There's the sql logging code already in CustomConfig.pm, but is there a
web
> >interface for built yet for the tables it creates?  And I'm assuming I'll
> >need the latest version of mailscanner to get the fixes listed in this
> >thread.
> >
> >"David While's Mailstats" looks nice (though I'm not doing virus scanning
> >and don't particularly care about geo-locating stuff), but I need a
> >per-message interface, and mailstats seems to be more for performance
> >reporting than for log analysis.
> >
> >The mailscanner-mrtg package again is for performance reporting, not log
> >analysis.
> >
> >...So...
> >
> >What's my best bet for a web interface to logged data that includes such
> >things as subject, recipients, spam tests, etc in the short term?
> >
> >-t.
> >
> >
> > >-----Original Message-----
> > >From: Kearney, Rob [mailto:RKearney at AZERTY.COM]
> > >Sent: Thursday, July 31, 2003 10:47 AM
> > >To: MAILSCANNER at JISCMAIL.AC.UK
> > >Subject: Re: SQL Redux
> > >
> > >
> > >oh.. and yes..
> > >
> > >thanks for the code tippets.. I'll have to change this. as I'm
> > >not good with
> > >perl  either.
> > >
> > >-rob
> > >
> > >-----Original Message-----
> > >From: Steve Freegard [mailto:steve.freegard at lbsltd.co.uk]
> > >Sent: Thursday, July 31, 2003 11:19 AM
> > >To: 'Kearney, Rob '; 'MAILSCANNER at JISCMAIL.AC.UK '
> > >Subject: RE: SQL Redux
> > >
> > >
> > >Hi Rob,
> > >
> > >I'm not really much good with Perl (maybe Julian can back me
> > >up on this) -
> > >but my understanding is that in calling your SQLRTLogging
> > >procedure without
> > >the Init & End procedures will mean that the
> > >connection/disconnection/prepare and execution of the SQL will
> > >happen for
> > >every message batch processed by MailScanner which would slow things up

> > >quite considerably depending on the volume of messages you processes.
> > >
> > >The most expensive processes are connecting and preparing the
> > >statement, so
> > >it's better only to do this once (per child), then running the prepared
> > >statements once per message batch.
> > >
> > >A better way is to have:
> > >
> > >InitSQLRTLogging:  (this is done once per MailScanner child)
> > > - Connect to the database
> > > - Prepare each SQL statement required
> > >
> > >SQLRTLogging: (done once for each message batch)
> > > - Tidy-up the data to make it suitable for SQL
> > > - Execute the prepared statements
> > >
> > >EndSQLRTLogging: (done once as each child dies)
> > > - Disconnect from the database
> > >
> > >Cheers,
> > >Steve.
> > >
> > >-----Original Message-----
> > >From: Kearney, Rob
> > >To: MAILSCANNER at JISCMAIL.AC.UK
> > >Sent: 31/07/03 15:56
> > >Subject: Re: SQL Redux
> > >
> > >here is what we did for SQL logging, to bypass temp-file stuff.
> > >
> > >Just took the SQLLogging and made SQLRTLogging, to write
> > >directly to DB,
> > >We
> > >have not noticed any degradation in performance
> > >Basically, we took the functions of SQLLogging and
> > >EndSQLLogging and put
> > >them together.
> > >(dont forget Init and End scripts also
> > >
> > >---
> > >sub SQLRTLogging {
> > >  my($message) = @_;
> > >  my($dbh);
> > >  $dbh =
> > >DBI->connect("DBI:mysql:mailscanner:localhost:mysql_socket=/var
> > >/database
> > >/mys
> > >ql/mysql.sock",
> > >                      "mailscanner", "mailscanner",
> > >                      {'PrintError' => 0})
> > >   or MailScanner::Log::DieLog("Cannot connect to the database: %s",
> > >                                $DBI::errstr);
> > >
> > >  my $id = $message->{id};
> > >  my $size = $message->{size};
> > >  my $from = $message->{from};
> > >  my ($from_user, $from_domain);
> > >
> > >  # split the from address into user and domain bits.
> > >  # This may be unnecessary for you; we use it to more easily determine
> > >  # inbound vs outbound email in a multi-domain environment.
> > >  # HINT: refine queries using SQL 'join' with a table containing local
> > >  # domains.
> > >
> > >  ($from_user, $from_domain) = split /\@/, $from;
> > >
> > >  my @to   = @{$message->{to}};
> > >  my $subject = $message->{subject};
> > >  my $clientip = $message->{clientip};
> > >  my $archives = join(',', @{$message->{archiveplaces}});
> > >  my $isspam = $message->{isspam};
> > >  my $ishighspam = $message->{ishigh};
> > >  my $sascore = $message->{sascore};
> > >  my $spamreport = $message->{spamreport};
> > >
> > >  # Get rid of control chars and tidy-up SpamAssassin report
> > >  $spamreport =~ s/\n/ /g;
> > >  $spamreport =~ s/\t//g;
> > >
> > >  # Get timestamp, and format it so it is suitable to use with MySQL
> > >  my($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
> > >localtime();
> > >  my($timestamp) = sprintf("%d-%02d-%02d %02d:%02d:%02d",
> > >                           $year+1900,$mon+1,$mday,$hour,$min,$sec);
> > >
> > ># maillog_mail insert
> > >    my @fields=($timestamp, $id, $size, $from_user, $from_domain,
> > >                $subject, $clientip, $archives, $isspam, $ishighspam,
> > >                $sascore, $spamreport);
> > >    map { s/\'/\\'/g } @fields;
> > >
> > >    # Insert @fields into a database table
> > >    my($sth) = $dbh->prepare("INSERT INTO maillog_mail (time, msg_id,
> > >size,
> > >from_user, from_domain, subject, clientip, archives, isspam,
> > >ishighspam,
> > >sascore, spamreport) VALUES (?,?,?,?,?,?,?,?,?,?,?,?)");
> > >
> > >$sth->execute($fields[0],$fields[1],$fields[2],$fields[3],$fiel
> > >ds[4],$fi
> > >elds
> > >[5],$fields[6],$fields[7],$fields[8],$fields[9],$fields[10],$fi
> > >elds[11])
> > >or
> > >MailScanner::Log::DieLog("Cannot insert row: %s", $DBI::errstr);
> > >
> > >
> > >  my($file, $text);
> > >  while(($file, $text) = each %{$message->{allreports}}) {
> > >    $file = "the entire message" if $file eq "";
> > >    # Use the sanitised filename to avoid problems caused by people
> > >forcing
> > >    # logging of attachment filenames which contain nasty SQL
> > >instructions.
> > >    $file = $message->{file2safefile}{$file} or $file;
> > >    $text =~ s/\n/ /;  # Make sure text report only contains 1 line
> > >    $text =~ s/\t/ /; # and no tab characters
> > >
> > >    my @fields = ($id, $file, $text);
> > >    map { s/\'/\\'/g } @fields;
> > >
> > >    my($sth) = $dbh->prepare("INSERT INTO maillog_report (msg_id,
> > >filename,
> > >filereport) VALUES (?,?,?)");
> > >    $sth->execute($fields[0],$fields[1],$fields[2]) or
> > >MailScanner::Log::DieLog("Cannot insert row: %s", $DBI::errstr);
> > >
> > >  }
> > >
> > >  for (@to) {
> > >    # again, split the recipient's email into user and domain halves
> > >first.
> > >    # see comment above about splitting the email like this.
> > >
> > >    my ($to_user, $to_domain);
> > >    ($to_user, $to_domain) = split /\@/, $_;
> > >    my @fields = ($id, $to_user, $to_domain);
> > >    map { s/\'/\\'/g } @fields;
> > >    my($sth) = $dbh->prepare("INSERT INTO maillog_recipient (msg_id,
> > >to_user, to_domain) VALUES (?,?,?)");
> > >    $sth->execute($fields[0],$fields[1],$fields[2]) or
> > >MailScanner::Log::DieLog("Cannot insert row: %s", $DBI::errstr);
> > >  }
> > >
> > >  # Close database connection
> > >  $dbh->disconnect();
> > >
> > >}
> > >
> > >  MailScanner::Log::InfoLog("Ending SQL Real-Time Logging");
> > >}
> > >
> > >1;
> > >
> > >
> > >
> > >-rob
> > >
> > >--
> > >This email and any files transmitted with it are confidential and
> > >intended solely for the use of the individual or entity to whom they
> > >are addressed. If you have received this email in error please notify
> > >the sender and delete the message from your mailbox.
> > >
> > >This footnote also confirms that this email message has been swept by
> > >MailScanner (www.mailscanner.info) for the presence of
> > >computer viruses.
> > >
>
> --
> Julian Field
> www.MailScanner.info
> Professional Support Services at www.MailScanner.biz
> MailScanner thanks transtec Computers for their support
>
> --
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you have received this email in error please notify
> the sender and delete the message from your mailbox.
>
> This footnote also confirms that this email message has been swept by
> MailScanner (www.mailscanner.info) for the presence of computer viruses.
>
> --
> *** This message originates from COCC, Inc.
>
> If the reader of this message, regardless of the address or routing, is
not an intended recipient, you are hereby notified that you have received
this transmittal in error and any review; use, distribution, dissemination
or copying is strictly prohibited.  If you have received this message in
error, please delete this e-mail and all files transmitted with it from your
system and immediately notify COCC, Inc. by sending reply e-mail to the
sender of this message.
>
> Thank you. ***

--
*** This message originates from COCC, Inc.

If the reader of this message, regardless of the address or routing, is not an intended recipient, you are hereby notified that you have received this transmittal in error and any review; use, distribution, dissemination or copying is strictly prohibited.  If you have received this message in error, please delete this e-mail and all files transmitted with it from your system and immediately notify COCC, Inc. by sending reply e-mail to the sender of this message.

Thank you. ***



More information about the MailScanner mailing list