Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/06/08 14:45:18 UTC

[Bug 5505] New: parsing of mbx format tidbits

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5505

           Summary: parsing of mbx format tidbits
           Product: Spamassassin
           Version: 3.2.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P3
         Component: Libraries
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: Mark.Martinec@ijs.si


Just a couple of details I noticed while reviewing Message.pm.
I don't actually know what the 'mbx' format is, or whether it is
still in use. I am filing these so that they are not forgotten.

1.)
Constants.pm:
  use constant MBX_SEPARATOR => qr/^([\s|\d]\d-[a-zA-Z]{3}-\d{4}\s\d{2}:...
Message.pm:
  if (/([\s|\d]\d)-([a-zA-Z]{3})-(\d{4})\s(\d{2}):(\d{2}):(\d{2})/) {

The [\s|\d] looks wrong in both places. Inside a character class a '|'
is a literal character, not an alternation, so can there really be a '|'
at the beginning? What was probably meant is [\s\d] or (?:\s|\d).
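
To illustrate point 1, here is a minimal sketch that compares the day
prefix as written with the suggested [\s\d]. Only the start of the
pattern is used, because the Constants.pm line above is truncated; the
input strings are made up for the example.

```perl
use strict;
use warnings;

# Start of the separator pattern as quoted in the report, and the suggested fix.
# Only the date prefix is reproduced; the rest of the original is truncated.
my $as_written = qr/^[\s|\d]\d-[a-zA-Z]{3}-\d{4}/;
my $suggested  = qr/^[\s\d]\d-[a-zA-Z]{3}-\d{4}/;

# Hypothetical inputs: two plausible date prefixes and one starting with '|'.
for my $s (' 8-Jun-2007', '18-Jun-2007', '|8-Jun-2007') {
    printf "%-14s as_written=%d suggested=%d\n", "'$s'",
        ($s =~ $as_written ? 1 : 0), ($s =~ $suggested ? 1 : 0);
}
```

Both patterns accept the first two strings; only the pattern as written
also accepts the one that starts with a literal '|'.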

2.)
  # Munge the mbx message separator into mbox format as a sort of
  ...
  if (/From:\s[^<]+<([^>]+)>/) {
    ...
  } elsif (/From:\s([^<^>]+)/) {
Requiring whitespace after the colon is bogus:
anything after the colon is the header field body.

Actually the parsing should be looking for envelope sender information
(like a Return-Path), and not for author's address.
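
A rough sketch of point 2, not a proposed patch: the two quoted patterns
are tried against a From: line with and without whitespace after the
colon, and a lookup of Return-Path is shown as one possible source of
the envelope sender. The helper names quoted_patterns_match and
envelope_sender, and the sample addresses, are hypothetical.

```perl
use strict;
use warnings;

# The two patterns quoted in the report.
my @quoted = (qr/From:\s[^<]+<([^>]+)>/, qr/From:\s([^<^>]+)/);

# Hypothetical helper: number of quoted patterns that match a header line.
sub quoted_patterns_match {
    my ($line) = @_;
    return scalar grep { $line =~ $_ } @quoted;
}

# Hypothetical helper: take the envelope sender from Return-Path, allowing
# zero or more blanks after the colon. Returns undef if there is no such field.
sub envelope_sender {
    my ($headers) = @_;
    return $1 if $headers =~ /^Return-Path:[ \t]*<([^>\n]*)>/mi;
    return;
}

my $headers = "Return-Path:<joe\@example.com>\nFrom:Joe <joe\@example.com>\n";
print "with space:    ", quoted_patterns_match('From: Joe <joe@example.com>'), "\n";
print "without space: ", quoted_patterns_match('From:Joe <joe@example.com>'), "\n";
print "envelope:      ", envelope_sender($headers) // '(none)', "\n";
```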


3.)
sub get_pristine_header {
...
  my(@ret) = $self->{pristine_headers} =~ /^\Q$hdr\E:[ \t]+(.*? ...
should be:
  my(@ret) = $self->{pristine_headers} =~ /^\Q$hdr\E:[ \t]*(.*? ...
RFC 2822 does not require whitespace
to follow the colon in header fields.
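
The difference between the two quantifiers can be seen with a small
sketch. The patterns above are truncated, so the tail "(.*)$" used here
is an assumption made only for this example, as is the sample header
text.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the pristine header block; note that "Subject:"
# has no whitespace after the colon.
my $headers = "Subject:no space here\nTo: someone\@example.com\n";
my $hdr     = 'Subject';

# Simplified tails: "(.*)$" is an assumption, not the original pattern.
my @strict  = $headers =~ /^\Q$hdr\E:[ \t]+(.*)$/mgi;   # requires a blank
my @relaxed = $headers =~ /^\Q$hdr\E:[ \t]*(.*)$/mgi;   # blank is optional

printf "strict: %d match(es), relaxed: %d match(es)\n",
    scalar @strict, scalar @relaxed;
```

With [ \t]+ the Subject field is not found at all; with [ \t]* it is
returned with its body intact.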


