message from mailscanner: ignoring text in character set

Warren Burstein warren at SOFTOV.CO.IL
Fri Nov 11 00:46:28 GMT 2005



I'm making some progress with the error I mentioned earlier this week.  
I've noticed that a handler for all character sets gets installed at 
some point (this happens in three different places in Message.pm), but 
it isn't in place when the first batch of emails is processed, and I'm 
trying to figure out why.

I'm also puzzled by the subroutine FixMaliciousSubjects in 
SweepContent.pm.  What sort of harm can the Subject line do?  And in 
particular, what harm can be caused by trailing whitespace, removed on 
line 252?
              $newsubject =~ s/\s*$//g;

I think this can cause a problem when an encoded subject line ends in a 
space.  Removing the trailing space is harmless in itself, but the 
modified subject does not get re-encoded, so you may wind up with raw 
8-bit characters in the Subject line (instead of quoted-printable or 
base64), and if the character set isn't your default one, the MUA can 
display it in the wrong charset.  The way this happens is that 
FixMaliciousSubjects removes the trailing whitespace, and since 
$newsubject is no longer equal to $subject, it sets 
$message->{subjectwasunsafe}.  That makes one of the Deliver... 
functions in Message.pm replace the Subject: header with what 
FixMaliciousSubjects produced.
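
The flow described above can be sketched roughly as follows. This is a Python illustration, not MailScanner's Perl: the function name fix_subject_sketch is hypothetical, and it assumes the comparison is made on the decoded subject, which is what the delivered result suggests.

```python
import re
from email.header import decode_header


def fix_subject_sketch(raw_subject):
    """Hypothetical stand-in for the behaviour described in the text.

    Returns (new_subject_bytes, was_unsafe). Not MailScanner's actual code.
    """
    # Decode any RFC 2047 encoded words into raw bytes (assumption: the
    # real code works on the decoded form of the subject).
    parts = decode_header(raw_subject)
    decoded = b"".join(
        text if isinstance(text, bytes) else text.encode("utf-8")
        for text, _charset in parts
    )

    # Python equivalent of the Perl line  $newsubject =~ s/\s*$//g;
    stripped = re.sub(rb"\s*$", b"", decoded)

    # Any difference, even a single trailing space, marks the subject as
    # "unsafe", so the stripped 8-bit bytes would replace the original
    # header without being re-encoded.
    was_unsafe = stripped != decoded
    return stripped, was_unsafe


new_subject, flagged = fix_subject_sketch("=?windows-1255?B?5fjp5fog?=")
print(flagged, new_subject)
```

With a subject whose decoded text has no trailing whitespace, the flag stays unset and the original encoded header would be left alone.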

I noticed this by chance - I was shortening a word-encoded subject just 
to save space, and happened to cut it off at a space - hard to see when 
it's encoded - and when it got to my mailbox it was no longer encoded, 
and missing the character set.  What was sent said
   Subject: =?windows-1255?B?5fjp5fog?=
but what got delivered to the mailbox was
   Subject: \345\370\351\345\372
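
To check that the encoded word really does end in a space, the base64 payload can be decoded by hand. The following is a small Python sketch using only the standard library; the variable names are just for illustration.

```python
import base64

# The base64 payload from the encoded word quoted above.
payload = "5fjp5fog"
raw = base64.b64decode(payload)

# Show each byte the way it appeared in the mailbox: octal escapes for
# the 8-bit bytes, and the space left as a literal space.
octal = "".join(" " if b == 0x20 else "\\%03o" % b for b in raw)

# Interpreted in windows-1255 (cp1255 in Python), the payload is five
# Hebrew letters followed by one space.
text = raw.decode("cp1255")

print(repr(octal))
print(len(text), text.endswith(" "))
```

The five 8-bit bytes match what was delivered, and the sixth byte is the trailing space that got stripped.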

Warren Burstein wrote:

> I'm running MailScanner-4.47.4-2 on  CentOS release 3.4 (which I 
> understand is a derivative of Redhat Enterprise Edition).
>
> When I run MailScanner in Debug mode, if a message is in the queue 
> with a subject containing text in windows-1255, I see the following 
> message:
>
> ignoring text in character set `WINDOWS-1255'
> at /usr/lib/MailScanner/MailScanner/Sendmail.pm line 359
>
> I searched the archives and found in 
> http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind02&L=MAILSCANNER&P=R309317&I=-3 
> that there was a similar message in 2002 regarding windows-1252, and 
> it was fixed.  I also read that this was not something to worry about, 
> so I'm not worrying, but I like to get rid of error messages so that 
> if there is a real problem it will stand out.
>
> So, if anyone remembers what was done to make this work for 
> windows-1252, could you tell me, and I'll see if I can do likewise for 
> 1255?
>
> thanks
