Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2013/04/20 23:47:33 UTC

[Bug 6928] New: Add sa-learn option to learn from RFC822 attachment to message rather than full message

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6928

            Bug ID: 6928
           Summary: Add sa-learn option to learn from RFC822 attachment to
                    message rather than full message
           Product: Spamassassin
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Learner
          Assignee: dev@spamassassin.apache.org
          Reporter: jhardin@impsec.org
    Classification: Unclassified

It's a fairly common practice to have users forward misclassified emails to a
training mailbox address as RFC-822 attachments.

If the site admin doesn't know that these attachments need to be extracted (or
how to extract them) and instead learns from the raw training mailbox, the
training will be wrong: the learned tokens will include the local forwarding
headers, and a spam addressed to multiple recipients will be learned once for
each forwarded copy.

It would be much easier in this situation to have a command-line option
(perhaps --attachment) to tell sa-learn to extract and learn from an RFC-822
attachment to the message being provided (if present) rather than from the
whole message.

sa-learn already unwraps attachments that are present due to SA markup with
report_safe = 1. At first glance it looks like it would be pretty easy to
implement by adding another clause here in remove_spamassassin_markup looking
for Content-Type = message/rfc822 with no other qualifiers, but only if the
command line option were provided:


    # Ok, we found the encapsulated piece ...
    if ($ct =~ m@^(?:message/rfc822|text/plain);\s+x-spam-type=original@ ||
        ($ct eq "message/rfc822" &&
         $cd eq $self->{conf}->{'encapsulated_content_description'}))
    {


...maybe something like:

   || ($ct eq "message/rfc822" &&
       defined($self->{conf}->{'opt_extract_attachment'}))
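
To make the intended behaviour concrete, here is a minimal standalone sketch of
the combined test, written as a predicate so it can be exercised outside of
SpamAssassin. The function name is_encapsulated_original and the
extract_attachment key are hypothetical; they stand in for the proposed
command-line option and are not existing SpamAssassin names.

```perl
use strict;
use warnings;

# Hypothetical helper, not SpamAssassin code: decides whether a MIME part
# should be treated as the encapsulated original message.
#   $ct   - Content-Type of the part
#   $cd   - Content-Description of the part (may be undef)
#   $conf - hashref holding encapsulated_content_description
#   $opts - hashref; extract_attachment is an ASSUMED name for the new option
sub is_encapsulated_original {
    my ($ct, $cd, $conf, $opts) = @_;

    # Existing case: part explicitly tagged by the report_safe markup.
    return 1
        if $ct =~ m@^(?:message/rfc822|text/plain);\s+x-spam-type=original@;

    # Existing case: bare message/rfc822 carrying the configured description.
    return 1
        if $ct eq 'message/rfc822'
        && defined $cd
        && defined $conf->{encapsulated_content_description}
        && $cd eq $conf->{encapsulated_content_description};

    # Proposed case: any bare message/rfc822 part, but only on request.
    return 1 if $ct eq 'message/rfc822' && $opts->{extract_attachment};

    return 0;
}
```

With the option unset, a plain forwarded message/rfc822 part is left alone, so
existing behaviour would be unchanged unless the admin asks for extraction.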


This solution wouldn't work for a forwarded message that itself carries SA
markup from report_safe = 1, though. That would require two unwraps: one for
the forwarding wrapper and one for the SA markup.
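
One possible way to handle the nested case is to repeat the unwrap step up to a
small depth limit. The sketch below illustrates that idea on raw message text
using only core Perl. unwrap_once and unwrap_all are hypothetical names, the
MIME handling is deliberately naive (no transfer decoding, first message/rfc822
part wins), and none of this is how sa-learn is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical single unwrap pass: returns the content of the first
# message/rfc822 part of a raw multipart message, or undef if there is none.
sub unwrap_once {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    return unless defined $body;

    $head =~ s/\r?\n[ \t]+/ /g;    # unfold continued header lines
    my ($boundary) = $head =~
        /^Content-Type:\s*multipart\/[^;]+;.*?\bboundary="?([^";\s]+)"?/mi;
    return unless defined $boundary;

    # Split the body on boundary delimiter lines (including the closing one).
    my $delim = qr/^--\Q$boundary\E(?:--)?[ \t]*(?:\r?\n|\z)/m;
    for my $part (split $delim, $body) {
        my ($phead, $pbody) = split /\r?\n\r?\n/, $part, 2;
        next unless defined $pbody;
        $phead =~ s/\r?\n[ \t]+/ /g;
        return $pbody if $phead =~ /^Content-Type:\s*message\/rfc822\b/mi;
    }
    return;
}

# Repeat the unwrap until nothing is left to unwrap or the limit is reached.
# The default of 2 covers "forward wrapper + report_safe markup".
sub unwrap_all {
    my ($raw, $max_depth) = @_;
    $max_depth //= 2;
    for (1 .. $max_depth) {
        my $inner = unwrap_once($raw);
        last unless defined $inner;
        $raw = $inner;
    }
    return $raw;
}
```

The depth limit keeps the loop from descending into attachments that belong to
the spam itself, at the cost of having to pick a limit up front.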
