message from mailscanner: ignoring text in character set
Warren Burstein
warren at SOFTOV.CO.IL
Fri Nov 11 00:46:28 GMT 2005
I'm making some progress with the error I mentioned earlier this week.
I've noticed that a handler for all character sets gets installed at
some point (this happens in three different places in Message.pm), but
it isn't in place when the first batch of emails is processed, and I'm
trying to figure out why.
I'm also puzzled by the subroutine FixMaliciousSubjects in
SweepContent.pm. What sort of harm can the Subject line do? And in
particular, what harm can be caused by trailing whitespace, removed on
line 252?
$newsubject =~ s/\s*$//g;
I think this can cause a problem if an encoded subject line has a
trailing space. Removing the trailing space is harmless in itself, but
the subject line doesn't get re-encoded afterwards, so you may wind up
with raw 8-bit characters in the Subject line (instead of
quoted-printable or base64), and if the character set isn't your default
one, the MUA could display it in the wrong charset. The way this happens
is that FixMaliciousSubjects removes the trailing whitespace, and since
$newsubject is no longer equal to $subject, it sets
$message->{subjectwasunsafe}. That makes one of the Deliver... functions
in Message.pm replace the Subject: header with what FixMaliciousSubjects
changed it to.
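To make the sequence concrete, here is a rough Python sketch of the flow as I understand it. It is not MailScanner's Perl code: the function name fix_subject, the dict standing in for $message, and the "safesubject" key are all hypothetical, and I am assuming the comparison is done on the already-decoded subject text.

```python
import re

def fix_subject(message):
    """Hypothetical stand-in for the flow described above (not MailScanner code)."""
    subject = message["subject"]  # assumption: already-decoded subject text
    # Same pattern as the Perl substitution: strip trailing whitespace.
    new_subject = re.sub(r"\s*$", "", subject)
    if new_subject != subject:
        # Any difference, even a single trailing space, marks the subject
        # as changed, so the delivery code later swaps in the new text.
        message["subjectwasunsafe"] = True
        message["safesubject"] = new_subject  # hypothetical key
    return message

# A decoded Hebrew subject that ends in a space trips the flag.
flagged = fix_subject({"subject": "\u05d5\u05e8\u05d9\u05d5\u05ea "})
# One without trailing whitespace is left alone.
untouched = fix_subject({"subject": "\u05d5\u05e8\u05d9\u05d5\u05ea"})
```

The point of the sketch is only that the trailing space alone is enough to flag the subject as changed, which is what leads to the header being rewritten.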
I noticed this by chance. I was shortening an encoded subject just to
save space and happened to cut it off at a space (hard to see when it's
encoded), and when it got to my mailbox it was no longer encoded and had
lost its character set label. What was sent said
Subject: =?windows-1255?B?5fjp5fog?=
but what got delivered to the mailbox was
Subject: \345\370\351\345\372
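The two headers above can be checked by hand. The following Python sketch (illustration only, not part of MailScanner; the helper name octal is made up) decodes the base64 payload, shows the hidden trailing space, and shows what the header would look like if the stripped text were re-encoded with its original charset.

```python
import base64
import re

sent = "=?windows-1255?B?5fjp5fog?="

# Split the encoded-word into charset, encoding and payload.
charset, encoding, payload = re.fullmatch(
    r"=\?([^?]+)\?([BQ])\?([^?]*)\?=", sent).groups()

raw = base64.b64decode(payload)  # the encoding is "B" (base64) here
stripped = raw.rstrip()          # drop the trailing whitespace

def octal(data):
    """Render non-printable bytes as octal escapes (hypothetical helper)."""
    return "".join(chr(b) if 0x20 <= b < 0x7f else "\\%o" % b for b in data)

delivered = octal(stripped)      # the raw 8-bit bytes that were delivered
reencoded = "=?%s?B?%s?=" % (
    charset, base64.b64encode(stripped).decode("ascii"))

print(repr(raw))   # b'\xe5\xf8\xe9\xe5\xfa ' (note the final space)
print(delivered)   # \345\370\351\345\372
print(reencoded)   # =?windows-1255?B?5fjp5fo=?=
```

So the payload really does end in a space (0x20), and re-encoding after the strip would have kept both the encoding and the windows-1255 label.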
Warren Burstein wrote:
> I'm running MailScanner-4.47.4-2 on CentOS release 3.4 (which I
> understand is a derivative of Redhat Enterprise Edition).
>
> When I run MailScanner in Debug mode, if a message is in the queue
> with a subject containing text in windows-1255, I see the following
> message:
>
> ignoring text in character set `WINDOWS-1255'
> at /usr/lib/MailScanner/MailScanner/Sendmail.pm line 359
>
> I searched the archives and found in
> http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind02&L=MAILSCANNER&P=R309317&I=-3
> that there was a similar message in 2002 regarding windows-1252, and
> it was fixed. I also read that this was not something to worry about,
> so I'm not worrying, but I like to get rid of error messages so that
> if there is a real problem it will stand out.
>
> So, if anyone remembers what was done to make this work for
> windows-1252, could you tell me, and I'll see if I can do likewise for
> 1255?
>
> thanks