Posted to users@spamassassin.apache.org by Herb Martin <He...@learnquick.com> on 2005/07/27 04:21:25 UTC

Removing message/rfc822 attachments to separate files

When forwarding a batch of missed spam (or ham) from
Outlook back to SpamAssassin the best way seems to be
for our users to select more than a single message,
and use the menu:  Action->Forward which puts them
all in as attachments.

(Selecting a single item just does a normal forward.)

While I will eventually be able to write a parser to
go through these and find the 'outer' Mime part marker,
skip over the Content headers of each message and then
save the following lines up to the next marker to a
separate file:

	------=_NextPart_000_0700_01C59212.7AE17630
	Content-Type: message/rfc822
	Content-Transfer-Encoding: 7bit
	Content-Disposition: attachment

	From: <mu...@yahoo.com>

...it would be nice to use a pre-written program or
module.  Perl is best for me.
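
For reference, the manual approach described above (find the outer
marker, skip the part's Content headers, keep everything up to the next
marker) can be sketched briefly. This sketch is in Python rather than
Perl and is not from the thread. The function name split_rfc822_parts is
hypothetical, and it assumes the boundary is declared in the top-level
Content-Type header and appears only at the start of a line. It only
illustrates the marker-scanning idea and is not a robust MIME parser.

```python
import re


def split_rfc822_parts(raw: str) -> list[str]:
    """Return the text of each message/rfc822 part of a multipart message."""
    raw = raw.replace("\r\n", "\n")  # normalise line endings first
    # Assumption: the first boundary= parameter is the outer one.
    match = re.search(r'boundary="?([^";\r\n]+)"?', raw, re.IGNORECASE)
    if match is None:
        return []
    marker = "--" + match.group(1)

    messages = []
    # Chunk 0 is the outer header and preamble, so skip it.
    for chunk in raw.split("\n" + marker)[1:]:
        if chunk.startswith("--"):  # closing marker: "--boundary--"
            break
        # Each part is: Content headers, blank line, then the payload.
        part_headers, _, payload = chunk.lstrip("\n").partition("\n\n")
        if "message/rfc822" in part_headers.lower():
            messages.append(payload.rstrip("\n") + "\n")
    return messages
```

Each returned string starts at the attached message's own first header
line (the "From:" line in the example above), so writing it to a file
gives one message per file with its headers intact.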

My initial searches at CPAN and Google have not
found this, but my guess is that someone has written
such a module, or even that I am looking right at it
in the search results and overlooking it.

Other methods (for my users) include opening each
email separately, choosing menu:  Action->Resend, filling
in a "to address" (for each message) and answering a number
of prompts (Ok) to send.

--
Herb Martin


Re: Removing message/rfc822 attachments to separate files

Posted by Chris Lear <ch...@laculine.com>.
* Herb Martin wrote (28/07/2005 06:21):
[...]
> 
> After writing the following and trying 
> Mail::SpamAssassin::Message (off and on all afternoon)
> I stumbled upon the tool intended for the job:
> 
> MIME::Parser from MIME-tools (which was already on
> my system) -- the pod doc examples had almost exactly
> what I need (added one line to first example):
> 
> <http://www.globedomain.com/cgi-bin/perldiver/perldiver.cgi?action=2010&module=MIME%3A%3AParser>
> 
> This does it -- the whole thing -- if I don't mind 
> submitting one file per run (with a command script
> loop for all of them of course):
> 
> #!/usr/bin/perl
> use strict;
> use warnings;
> 
> use MIME::Parser;
> 
> my $parser = MIME::Parser->new;         # Create parser
> $parser->output_dir("./tmp");           # Output dir (assumed to exist)
> $parser->extract_nested_messages(0);    # 0 = save each attached message whole
> my $entity = $parser->parse(\*STDIN);   # Parse an input filehandle
> print "Entity: $entity\n\n" if $entity;
> 
> __END__
> 

I use a similar thing. Sorry for not posting about it earlier - I
thought you had a better solution with Mail::SpamAssassin::Message.
One thing to watch out for if any Thunderbird users want to use it:
Thunderbird's attachments will be extracted with spaces in the filenames
(because the filenames are the message subjects) and sa-learn doesn't
handle them well. I use this to fix it:
http://www.tenacious.us/projects/code/nospace.pl
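
The contents of nospace.pl are not shown in the thread, so the following
is not a port of it. It is only a minimal Python sketch of the same
idea: rename extracted files so that no name contains whitespace. The
function name remove_spaces and the underscore replacement are
assumptions made for illustration.

```python
import os
import re


def remove_spaces(directory: str) -> list[tuple[str, str]]:
    """Rename files in `directory` so that no file name contains whitespace.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        new_name = re.sub(r"\s+", "_", name)  # runs of whitespace -> "_"
        src = os.path.join(directory, name)
        if new_name == name or not os.path.isfile(src):
            continue
        dst = os.path.join(directory, new_name)
        # Do not clobber an existing file: add a counter until the name is free.
        root, ext = os.path.splitext(dst)
        counter = 1
        while os.path.exists(dst):
            dst = f"{root}_{counter}{ext}"
            counter += 1
        os.rename(src, dst)
        renamed.append((name, os.path.basename(dst)))
    return renamed
```

Running it over the parser's output directory before calling sa-learn
should sidestep the filename problem described above.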

--
Chris

Re: Removing message/rfc822 attachments to separate files

Posted by Kai Schaetzl <ma...@conactive.com>.
Herb Martin wrote on Thu, 28 Jul 2005 00:21:54 -0500:

> I understand the latter, but No, the method sends the full 
> headers/messages encapsulated as message/rfc822 top level parts.

Ah, good, then my guess was wrong :-)

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org




RE: Removing message/rfc822 attachments to separate files

Posted by Herb Martin <He...@learnquick.com>.
> -----Original Message-----
> From: Kai Schaetzl [mailto:maillists@conactive.com] 
> 
> Herb Martin wrote on Tue, 26 Jul 2005 21:21:25 -0500:
> > When forwarding a batch of missed spam (or ham) from 
> > Outlook back to 
> > SpamAssassin the best way seems to be for our users to select more 
> > than a single message, and use the menu:  Action->Forward 
> > which puts them all in as attachments.
> 
> I guess this adds only the message bodies? Just want to 
> remind you that Bayes uses header tokens as well. If you 
> can, you should train with headers included.

I understand the latter, but No, the method sends the full
headers/messages encapsulated as message/rfc822 top level parts.

The only change I see between the Mime Markers are these 4
lines (including the blank):

------=_NextPart_000_067D_01C591D1.7F02A7C0
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment

From:  etc.............
<snip header and body>

------=_NextPart_000_067D_01C591D1.7F02A7C0

FYI:  Mail::SpamAssassin::Message (and Node) do seem to
have what I need, but after a quick examination and a
brief initial code attempt, how to use them is not yet
clear to me.

After writing the following and trying 
Mail::SpamAssassin::Message (off and on all afternoon)
I stumbled upon the tool intended for the job:

MIME::Parser from MIME-tools (which was already on
my system) -- the pod doc examples had almost exactly
what I need (added one line to first example):

<http://www.globedomain.com/cgi-bin/perldiver/perldiver.cgi?action=2010&module=MIME%3A%3AParser>

This does it -- the whole thing -- if I don't mind 
submitting one file per run (with a command script
loop for all of them of course):

#!/usr/bin/perl
use strict;
use warnings;

use MIME::Parser;

my $parser = MIME::Parser->new;         # Create parser
$parser->output_dir("./tmp");           # Output dir (assumed to exist)
$parser->extract_nested_messages(0);    # 0 = save each attached message whole
my $entity = $parser->parse(\*STDIN);   # Parse an input filehandle
print "Entity: $entity\n\n" if $entity;

__END__

This method is so much cleaner than the others I have
tried -- users can just email a whole batch of Spam
(or Ham) messages to our Spam (or Ham) "Multi" account
for automatic processing.  No change to individual 
message headers -- easy to do once or twice a day for
those who get a lot of spam.
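
For comparison, roughly the same job can be sketched with Python's
standard email package instead of MIME::Parser. This is not from the
thread. The function name extract_attached_messages and the
msg-NNNN.eml file naming are hypothetical, and it assumes the forwarded
messages are top-level message/rfc822 parts as described above. The
attached message is re-serialised on output, so the result is not
guaranteed to be byte-identical to the original.

```python
import email
import os


def extract_attached_messages(raw: bytes, out_dir: str) -> list[str]:
    """Write each top-level message/rfc822 part of `raw` to its own file."""
    outer = email.message_from_bytes(raw)
    os.makedirs(out_dir, exist_ok=True)

    written = []
    # Look only at the top-level parts; do not descend into attached messages.
    parts = outer.get_payload() if outer.is_multipart() else []
    for part in parts:
        if part.get_content_type() != "message/rfc822":
            continue
        # The parser stores the attached message as the part's single sub-part.
        attached = part.get_payload(0)
        path = os.path.join(out_dir, f"msg-{len(written) + 1:04d}.eml")
        with open(path, "wb") as fh:
            fh.write(attached.as_bytes())  # headers and body together
        written.append(path)
    return written
```

Because the attached message is written out with its own headers, the
Bayes header tokens mentioned earlier in the thread are still available
for training.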

Thank you so much for your help -- sometimes it is 
encouraging just to have someone throwing back ideas
and suggestions.

--
Herb


Re: Removing message/rfc822 attachments to separate files

Posted by Kai Schaetzl <ma...@conactive.com>.
Herb Martin wrote on Tue, 26 Jul 2005 21:21:25 -0500:

> When forwarding a batch of missed spam (or ham) from 
> Outlook back to SpamAssassin the best way seems to be 
> for our users to select more than a single message, 
> and use the menu:  Action->Forward which puts them 
> all in as attachments.

I guess this adds only the message bodies? Just want to remind you that 
Bayes uses header tokens as well. If you can, you should train with headers 
included.

Kai





[exim] RE: Removing message/rfc822 attachments to separate files

Posted by Herb Martin <He...@learnquick.com>.
> -----Original Message-----
> From: Loren Wilton [mailto:lwilton@earthlink.net] 
> Sent: Tuesday, July 26, 2005 10:55 PM
> To: users@spamassassin.apache.org
> Subject: Re: Removing message/rfc822 attachments to separate files
> 
> If you could set up public ham and spam folders for your 
> users and have them drag the messages into those folders, and 
> then harvest the folders using IMAP things would work with 
> tools likely to be mostly found already lying around.

Thanks, but this is not an Exchange setup nor any other public
folders.  Even if there were public folders here, it would be
up to me to get the files back to SpamAssassin -- easy but just
another step for me.

Much easier for me to just have them forward the messages as
attachments to a Ham or Spam account on the server.

Of course 'easier' means I must find or write a simple file
splitter, and I am sort of jammed up for the next couple of
days, so it might be the weekend before the thing gets written
unless there is a module that does all of the real work.

BTW, if it doesn't exist (seems unlikely), writing it would
benefit many others.

Looks like it was right under my nose in Mail::SpamAssassin::Message.

--
Herb Martin




Re: Removing message/rfc822 attachments to separate files

Posted by Loren Wilton <lw...@earthlink.net>.
If you could set up public ham and spam folders for your users and have them
drag the messages into those folders, and then harvest the folders using
IMAP things would work with tools likely to be mostly found already lying
around.
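
As an illustration of the harvesting step only, here is a minimal Python
sketch, not from the thread. It takes an already logged-in
imaplib.IMAP4 (or IMAP4_SSL) connection and saves every message in one
folder to its own file. The function name harvest_folder, the folder
name and the msg-N.eml file naming are all hypothetical.

```python
import os


def harvest_folder(conn, folder: str, out_dir: str) -> list[str]:
    """Save every message in an IMAP folder to its own file, headers included.

    `conn` is assumed to be a logged-in imaplib.IMAP4 or IMAP4_SSL object.
    """
    os.makedirs(out_dir, exist_ok=True)
    status, _ = conn.select(folder, readonly=True)  # read-only: leave flags alone
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError(f"cannot search folder {folder!r}")

    written = []
    for num in data[0].split():
        status, fetched = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue
        for item in fetched:
            # The message arrives as an (envelope, raw bytes) tuple.
            if isinstance(item, tuple):
                path = os.path.join(out_dir, f"msg-{num.decode()}.eml")
                with open(path, "wb") as fh:
                    fh.write(item[1])
                written.append(path)
    return written


# Usage sketch (host, credentials and folder are placeholders):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("mail.example.com")
#   conn.login("user", "password")
#   harvest_folder(conn, "Spam", "./spam-to-learn")
```

The resulting directory could then be handed to sa-learn with --spam or
--ham as appropriate.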

        Loren


RE: Removing message/rfc822 attachments to separate files

Posted by Herb Martin <He...@learnquick.com>.
> Mail::SpamAssassin::Message ?  ;)
> 
> You can also read through PerMsgStatus which has code to rip 
> out an encapsulated message.

Thanks.

Right under my nose -- but better to feel silly than
to have to re-invent the code.


--
Herb Martin


Re: Removing message/rfc822 attachments to separate files

Posted by Theo Van Dinter <fe...@apache.org>.
On Tue, Jul 26, 2005 at 09:21:25PM -0500, Herb Martin wrote:
> While I will eventually be able to write a parser to
> go through these and find the 'outer' Mime part marker,
> skip over the Content headers of each message and then
> save the following lines up to the next marker to a
> separate file:
[...]
> ...it would be nice to use a pre-written program or
> module.  Perl is best for me.

Mail::SpamAssassin::Message ?  ;)

You can also read through PerMsgStatus which has code to rip out an
encapsulated message.

-- 
Randomly Generated Tagline:
Bart:	Dad, you killed the Zombie Flanders!
 
 Homer:	He was a zombie?
 
 		   Treehouse of Horror III