Posted to users@spamassassin.apache.org by Tyler Nally <tn...@technally.com> on 2006/10/11 23:24:46 UTC
Parsing Email
Hello,
I have a project that I need to solve. A client's fax machines
have been replaced with the phone company's fax server, which e-mails
the incoming fax (.tif) images to a specific e-mail address at the client's
place of business.
As it happens, that e-mail passes through a mail server that
inspects it for e-viri and runs it through SpamAssassin before
forwarding it on to their machine. I administer the mail server that
pre-processes the client's e-mail.
What I'd like to do is capture the contents of these particular
fax e-mails as they pass through the machine I administer and either:
1- copy the fax images (detach the images from the e-mail messages)
   and store them on that server (whether as a file
   or in a database as a blob)
2- create a database record that catalogs the incoming fax
   and associates it with the fax image file (or DB blob ID)
   A- and also search a database for existing originating fax #'s
      so that the fax can be associated with the company
      that sent it. In this case, the DB is a MySQL
      database that lives on this same machine.
Now.. what I need help in understanding... is ... assuming that
I can handle each e-mail separately as it comes through, how do I
parse the e-mail (like the way Spamassassin does) to have the
ability to pull the component parts from the e-mail (from:,
subject:, and MIME-encapsulated fax image) in order to be able
to use these pieces (somehow) for the customer care module.
I'm well versed in PHP... I used to do a lot of perl (many moons
ago) and I'd like to make this work without too awful much pain.
I think ultimately, I'll probably let the normal copy of the e-mail
go on to the customer's destination. I'd add an extra Cc: that
goes to a specific e-mail account on the server, where anything
delivered to that account is fed to this e-mail
parsing program, which will split the e-mail into its pieces
and distribute/use the chunks in a way that lets me manipulate
them later in the process.
Any help to point me in the right direction?
Thanks a lot....
Tyler Nally
Re: Parsing Email
Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Oct 11, 2006 at 02:48:28PM -0700, Vincent Li wrote:
> my $fh;
> open $fh, "<", shift;
> my @message = <$fh>;
>
> use Mail::SpamAssassin::Message;
> my $msg = Mail::SpamAssassin::Message->new(
> {
> 'message' => \@message,
> }
FYI, new() accepts a file handle, an array, a scalar, or undef (which
causes it to use \*STDIN). So you don't need to slurp the message data
in first. :)
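As a minimal sketch of that point (assuming the message sits in a file on disk), the open file handle can be handed to new() directly. The helper name parse_message_file() is made up for illustration and is not part of SpamAssassin:

```perl
use strict;
use warnings;
use Mail::SpamAssassin::Message;

# parse_message_file() is a hypothetical helper name, not part of SpamAssassin.
sub parse_message_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't read $path: $!";
    # Hand the open file handle straight to new(); no slurping into an array.
    my $msg = Mail::SpamAssassin::Message->new({ message => $fh })
        or die "Message error?";
    close $fh;
    return $msg;
}

# Usage: perl script.pl saved-message.eml
if (@ARGV) {
    my $msg   = parse_message_file($ARGV[0]);
    my @parts = $msg->find_parts(qr/./, 1);    # every leaf part, any MIME type
    print "Parsed ", scalar(@parts), " leaf part(s)\n";
    $msg->finish();                            # release the message's resources
}
```

Passing nothing at all to new() would read the message from STDIN instead, which suits a script that is fed mail through a pipe.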
--
Randomly Selected Tagline:
All in a days work for "Confuse-a-Cat".
Re: Parsing Email
Posted by Vincent Li <vl...@vcn.bc.ca>.
On Wed, 11 Oct 2006, Theo Van Dinter wrote:
> On Wed, Oct 11, 2006 at 05:24:46PM -0400, Tyler Nally wrote:
>> Now.. what I need help in understanding... is ... assuming that
>> I can handle each e-mail separately as it comes through, how do I
>> parse the e-mail (like the way Spamassassin does) to have the
>> ability to pull the component parts from the e-mail (from:,
>> subject:, and MIME-encapsulated fax image) in order to be able
>> to use these pieces (somehow) for the customer care module.
>
> :) I answered this kind of question for someone on IRC a week or two ago,
> here's a quick example of how to use Mail::SpamAssassin::Message:
Yeah, I learned to use Message.pm from felicity :)
>
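> use Mail::SpamAssassin::Message;
>
> # With no arguments, new() reads the message from STDIN.
> my $msg = Mail::SpamAssassin::Message->new() || die "Message error?";
> my $count = 0;
> # Find every leaf part whose MIME type starts with "image".
> foreach my $p ($msg->find_parts(qr/^image\b/i, 1)) {
>     my $file = "message." . $count++;
>     open(my $out, '>', $file) || die "can't write file $file: $!";
>     binmode $out;
>     print $out $p->decode();    # decode() undoes base64/quoted-printable
>     close($out);
> }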
>
>
> So that parses a message from STDIN, goes through and finds all image parts,
> and writes them out to files called message.#.
I used the code below to retrieve spam forwarded as an attachment from
SquirrelMail and feed it to sa-learn
-------
#!/usr/bin/perl
use strict;
use warnings;
use Mail::SpamAssassin::Message;

# Read the message file named on the command line.
my $file = shift or die "Usage: $0 <message-file>\n";
open my $fh, "<", $file or die "Can't read $file: $!";
my @message = <$fh>;
close $fh;

my $msg = Mail::SpamAssassin::Message->new(
    {
        'message' => \@message,
    }
) || die "Message error?";

#foreach my $p ($msg->find_parts(qr/^(text|image|application)\b/i, 1)) {
# Match message/* parts (the forwarded spam); 0 = do not restrict to leaf parts.
foreach my $p ($msg->find_parts(qr/^message\b/i, 0)) {
    eval {
        no warnings;
        my $type       = $p->{'type'};
        my $attachname = $p->{'name'};
        print "Content type is: $type\n";
        print "write file name: $attachname\n";
        # "or" binds more loosely than "||", so die fires when open fails.
        open my $out, ">", $attachname or die "Can't write file $attachname: $!";
        binmode $out;
        print $out $p->decode();
        close $out;
    };
    # warn $@ if $@;
}
__END__
>
> Use "perldoc Mail::SpamAssassin::Message" and "perldoc
> Mail::SpamAssassin::Message::Node" for more information about functions and
> such. :)
>
> --
> Randomly Selected Tagline:
> "Zero equals Zero" - Prof. Farr
>
Vincent Li http://pingpongit.homelinux.com
Opensource .Implementation. .Consulting.
Platform .Fedora. .Debian. .Mac OS X.
Blog http://bl0g.blogdns.com
Re: Parsing Email
Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Oct 11, 2006 at 05:24:46PM -0400, Tyler Nally wrote:
> Now.. what I need help in understanding... is ... assuming that
> I can handle each e-mail separately as it comes through, how do I
> parse the e-mail (like the way Spamassassin does) to have the
> ability to pull the component parts from the e-mail (from:,
> subject:, and MIME-encapsulated fax image) in order to be able
> to use these pieces (somehow) for the customer care module.
:) I answered this kind of question for someone on IRC a week or two ago,
here's a quick example of how to use Mail::SpamAssassin::Message:
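use strict;
use warnings;
use Mail::SpamAssassin::Message;

# With no arguments, new() reads the message from STDIN.
my $msg = Mail::SpamAssassin::Message->new() || die "Message error?";
my $count = 0;
# Find every leaf part whose MIME type starts with "image".
foreach my $p ($msg->find_parts(qr/^image\b/i, 1)) {
    my $file = "message." . $count++;
    open(my $out, '>', $file) || die "can't write file $file: $!";
    binmode $out;
    print $out $p->decode();    # decode() undoes base64/quoted-printable
    close($out);
}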
So that parses a message from STDIN, goes through and finds all image parts,
and writes them out to files called message.#.
Use "perldoc Mail::SpamAssassin::Message" and "perldoc
Mail::SpamAssassin::Message::Node" for more information about functions and
such. :)
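For the From: and Subject: part of the question, here is a rough sketch along the same lines. It assumes get_header() from Mail::SpamAssassin::Message::Node behaves as its perldoc describes (unfolded value with a trailing newline). The helper name fax_summary() and the keys of the hash it returns are invented for illustration:

```perl
use strict;
use warnings;
use Mail::SpamAssassin::Message;

# fax_summary() is a hypothetical helper, not part of SpamAssassin.
# It collects the fields a catalog record for one fax might need.
sub fax_summary {
    my ($msg) = @_;
    my %info;
    for my $name (qw(From Subject)) {
        # In scalar context get_header() returns the last header of that name.
        my $value = $msg->get_header($name);
        $value = '' unless defined $value;
        chomp $value;                       # drop the trailing newline
        $info{ lc $name } = $value;
    }
    # Decoded bytes of every leaf part whose MIME type starts with "image".
    my @images = $msg->find_parts(qr/^image\b/i, 1);
    $info{images} = [ map { $_->decode() } @images ];
    return \%info;
}

# Usage: perl script.pl saved-fax.eml
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Can't read $ARGV[0]: $!";
    my $msg = Mail::SpamAssassin::Message->new({ message => $fh })
        or die "Message error?";
    close $fh;
    my $info = fax_summary($msg);
    print "From:    $info->{from}\n";
    print "Subject: $info->{subject}\n";
    print "Images:  ", scalar(@{ $info->{images} }), "\n";
    $msg->finish();
}
```

The decoded image data in $info->{images} could then be written to a file or stored as a blob, whichever the catalog ends up using.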
--
Randomly Selected Tagline:
"Zero equals Zero" - Prof. Farr
Re: Parsing Email
Posted by Kelson <ke...@speed.net>.
Tyler Nally wrote:
> 1- copy the fax images (detach the images from e-mail messages)
> and store these images on that server (whether as a file
> or put into a database as a blob)
If you're running Sendmail, you can use MIMEdefang <www.mimedefang.org>
for this. It has a built-in function, action_replace_with_url, which
does exactly what you want.
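A rough sketch of what that could look like in a MIMEdefang filter file, going by the mimedefang-filter man page. The directory, URL, and replacement text are placeholders, and the TIFF check is an assumption about how the faxes arrive. Note that this replaces the attachment in the delivered message with a link, so the recipient no longer gets the image itself:

```perl
# Fragment for a MIMEdefang filter file; not a standalone script.
sub filter {
    my ($entity, $fname, $ext, $type) = @_;

    # Assumption: faxes arrive as TIFF attachments.
    if (lc($type) eq 'image/tiff' || $fname =~ /\.tiff?$/i) {
        return action_replace_with_url(
            $entity,
            '/var/www/faxes',                   # hypothetical document root
            'http://mail.example.com/faxes',    # hypothetical base URL
            "The fax image was stored at:\n\n\t_URL_\n"    # _URL_ is filled in
        );
    }
    return action_accept();
}
```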
--
Kelson Vibber
SpeedGate Communications <www.speed.net>