You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by "Dan Mahoney, System Admin" <da...@prime.gushi.org> on 2009/12/26 15:12:36 UTC
Adding remote-ip/ESMTPID/X-Envelope to logging output?
Hey there,
Background: Sendmail with spamd running on a different box, spamc called
from global procmail file.
I'm doing some nightly log-combing to look for interesting patterns,
including against other network traffic (like erroneous DNS lookups, I
think I might be on to something).
However, one of the annoying things about spamassassin's logging is that
it fails to log the remote connecting ip, even though it places it in
special places in the logs:
take for example:
Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org
[72.9.101.130] at port 62430
Dec 26 08:41:51 quark spamd[87490]: spamd: processing message
<ca...@mail.gmail.com> for danm:58
Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has
less than -5 points (-6.601).
Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for
danm:58 in 1.9 seconds, 3788 bytes.
Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 -
AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS
scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no
>From those logs, there's nothing at all that tells me what the relaying ip
is, even though it's "special" to spamd, used to determine the ASN, etc.
The sendmail logs (grepped for that messageid) are more useful:
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
from=<as...@lists.digium.com>, size=2735, class=0,
nrcpts=1,
msgid=<ca...@mail.gmail.com>,
proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
But again, those give me half the picture (and are on two different
machines), and I'd need the long msgid line to correlate them.
Is logging output configurable that I could add the value of the "relay="
line into the output? Or perhaps the value of "X-Envelope-To?"
Also, does spamc have any concept of the "short" (ESMTP) messageid, as
defined by sendmail's queues? (nBQDcLck027423). In terms of parsing logs,
this is a much more useful correlation point, since it's that identifier
that every other milter uses, and every other thing that writes to maillog
uses. (But I understand if it's not possible since the API is different).
For example, grepping for that self-same messageid, other than spamc,
gives me the whole story. Sender, recipient, every milter it's been
through.
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
from=<as...@lists.digium.com>, size=2735, class=0,
nrcpts=1,
msgid=<ca...@mail.gmail.com>,
proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
insert (1): header: X-DomainKeys: Sendmail DomainKeys Filter v1.0.2
prime.gushi.org nBQDcLck027423
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
insert (1): header: Authentication-Results: prime.gushi.org; dkim=none
(no signature)\n\theader.i=unknown; x-dkim-adsp=none
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
insert (1): header: X-DKIM: Sendmail DKIM Filter v2.8.3 prime.gushi.org
nBQDcLck027423
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
insert (1): header: Authentication-Results: prime.gushi.org;
sender-id=pass header.sender=asterisk-users-bounces@lists.digium.com;
spf=pass smtp.mfrom=asterisk-users-bounces@lists.digium.com
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0
prime.gushi.org nBQDcLck027423
Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
add: header: X-Greylist: Default is to whitelist mail, not delayed by
milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009
08:41:49 -0500 (EST)
Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423:
to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local,
pri=33624, dsn=2.0.0, stat=Sent
Thoughts?
-Dan Mahoney
--
"When I'm lost, and confused, and trying to make a U-turn, nothing annoys
me more than someone telling me to watch out for the tombstone!"
"How often does that happen, Fab?"
-David Feld & Tom Fabry, sometime in High School.
--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144 AIM: LarpGM
Site: http://www.gushi.org
---------------------------
Re: [sa-list] Re: Adding remote-ip/ESMTPID/X-Envelope to logging
output?
Posted by John Hardin <jh...@impsec.org>.
On Mon, 28 Dec 2009, Dan Mahoney, System Admin wrote:
> Somewhere within the spamd guts there's a print/printf line that prints
> that last line, and is supplied a list of variables. I mean only to add
> a couple more. I'm quite surprised it's not a tunable.
>
> I'm also surprised that, unline sendmail, spamd doesn't put a single
> token in EVERY logline, even if that's a unique messageid known only to
> SA (as is the case with sendmail logs).
May I suggest opening an enhancement bug for that so that the devs
become/remain aware of your request/suggestion?
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
The fetters imposed on liberty at home have ever been forged out
of the weapons provided for defense against real, pretended, or
imaginary dangers from abroad. -- James Madison, 1799
-----------------------------------------------------------------------
80 days since President Obama won the Nobel "Not George W. Bush" prize
Re: [sa-list] Re: Adding remote-ip/ESMTPID/X-Envelope to logging
output?
Posted by "Dan Mahoney, System Admin" <da...@prime.gushi.org>.
On Sun, 27 Dec 2009, Shane Williams wrote:
> One way to find what you want is to grab the msg id (or mid) from the
> spamd line, and grep for that out of the sendmail log for the remote
> IP. As I recall when I wrote something that searched like this, I had
> to do some special character quoting on some of the mid's, but did
> finally get it working.
Yeah, it's possible to have my parser do that kind of cross-correlation,
and I imagine it's what I'll have to do (along with using syslog to send
files from one server to the other), as well as keeping them local.
I just figured if there was a single local.cf tweak I could add that would
change my logline, that might be the easier and more correct way, as
getting the score, plus all the rules matched, plus the long-messageid,
plus the short messageID is rather complex. (Even within just spamd it
requires looking at multiple lines)
For example, this line contains the score:
Dec 28 02:37:35 quark spamd[9203]: spamd: identified spam (20.1/5.0) for
danm:58 in 0.4 seconds, 3920 bytes.
But this one, which contains almost everything else of use, does NOT
contain the decimal score, and there's nothing there at ALL to
cross-correlate them (and this is running in debugmode). I can guess
based on the size, score, scantime and uid, but those are hardly unique,
especially
during a deluge.
Dec 28 02:37:35 quark spamd[9203]: spamd: result: Y 20 -
ANY_BOUNCE_MESSAGE,BOUNCE_MESSAGE
scantime=0.4,size=3920,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=53762,mid=<20...@prime.gushi.org>,bayes=0.001253,autolearn=disabled,shortcircuit=spam
Somewhere within the spamd guts there's a print/printf line that prints
that last line, and is supplied a list of variables. I mean only to add a
couple more. I'm quite surprised it's not a tunable.
I'm also surprised that, unline sendmail, spamd doesn't put a single token
in EVERY logline, even if that's a unique messageid known only to SA (as
is the case with sendmail logs).
-Dan
>
> On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:
>
>> Hey there,
>>
>> Background: Sendmail with spamd running on a different box, spamc called
>> from global procmail file.
>>
>> I'm doing some nightly log-combing to look for interesting patterns,
>> including against other network traffic (like erroneous DNS lookups, I
>> think I might be on to something).
>>
>> However, one of the annoying things about spamassassin's logging is that it
>> fails to log the remote connecting ip, even though it places it in special
>> places in the logs:
>>
>> take for example:
>>
>> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org
>> [72.9.101.130] at port 62430
>> Dec 26 08:41:51 quark spamd[87490]: spamd: processing message
>> <ca...@mail.gmail.com> for danm:58
>> Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has
>> less than -5 points (-6.601).
>> Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for
>> danm:58 in 1.9 seconds, 3788 bytes.
>> Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 -
>> AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS
>> scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no
>>
>>> From those logs, there's nothing at all that tells me what the relaying ip
>> is, even though it's "special" to spamd, used to determine the ASN, etc.
>>
>> The sendmail logs (grepped for that messageid) are more useful:
>>
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
>> from=<as...@lists.digium.com>, size=2735, class=0,
>> nrcpts=1,
>> msgid=<ca...@mail.gmail.com>,
>> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>>
>> But again, those give me half the picture (and are on two different
>> machines), and I'd need the long msgid line to correlate them.
>>
>> Is logging output configurable that I could add the value of the "relay="
>> line into the output? Or perhaps the value of "X-Envelope-To?"
>>
>> Also, does spamc have any concept of the "short" (ESMTP) messageid, as
>> defined by sendmail's queues? (nBQDcLck027423). In terms of parsing logs,
>> this is a much more useful correlation point, since it's that identifier
>> that every other milter uses, and every other thing that writes to maillog
>> uses. (But I understand if it's not possible since the API is different).
>>
>> For example, grepping for that self-same messageid, other than spamc, gives
>> me the whole story. Sender, recipient, every milter it's been through.
>>
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
>> from=<as...@lists.digium.com>, size=2735, class=0,
>> nrcpts=1,
>> msgid=<ca...@mail.gmail.com>,
>> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> insert (1): header: X-DomainKeys: Sendmail DomainKeys Filter v1.0.2
>> prime.gushi.org nBQDcLck027423
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> insert (1): header: Authentication-Results: prime.gushi.org; dkim=none (no
>> signature)\n\theader.i=unknown; x-dkim-adsp=none
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> insert (1): header: X-DKIM: Sendmail DKIM Filter v2.8.3 prime.gushi.org
>> nBQDcLck027423
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> insert (1): header: Authentication-Results: prime.gushi.org; sender-id=pass
>> header.sender=asterisk-users-bounces@lists.digium.com; spf=pass
>> smtp.mfrom=asterisk-users-bounces@lists.digium.com
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0
>> prime.gushi.org nBQDcLck027423
>> Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
>> add: header: X-Greylist: Default is to whitelist mail, not delayed by
>> milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009
>> 08:41:49 -0500 (EST)
>> Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423:
>> to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local,
>> pri=33624, dsn=2.0.0, stat=Sent
>>
>> Thoughts?
>>
>> -Dan Mahoney
>>
>>
>
>
--
"And, a special guest, from the future, miss Ria Pischell. Miss Pischell,
as you all know, is the inventor of the Statiophonic Oxygenetic
Amplifiagraphaphonadelaverberator, and it's pretty hard to imagine life
without one of those.
-Rufus, Bill & Ted's Bogus Journey
--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144 AIM: LarpGM
Site: http://www.gushi.org
---------------------------
Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?
Posted by Shane Williams <sh...@shanew.net>.
One way to find what you want is to grab the msg id (or mid) from the
spamd line, and grep for that out of the sendmail log for the remote
IP. As I recall when I wrote something that searched like this, I had
to do some special character quoting on some of the mid's, but did
finally get it working.
On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:
> Hey there,
>
> Background: Sendmail with spamd running on a different box, spamc called from
> global procmail file.
>
> I'm doing some nightly log-combing to look for interesting patterns,
> including against other network traffic (like erroneous DNS lookups, I think
> I might be on to something).
>
> However, one of the annoying things about spamassassin's logging is that it
> fails to log the remote connecting ip, even though it places it in special
> places in the logs:
>
> take for example:
>
> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org
> [72.9.101.130] at port 62430
> Dec 26 08:41:51 quark spamd[87490]: spamd: processing message
> <ca...@mail.gmail.com> for danm:58
> Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has less
> than -5 points (-6.601).
> Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for
> danm:58 in 1.9 seconds, 3788 bytes.
> Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 -
> AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS
> scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no
>
>> From those logs, there's nothing at all that tells me what the relaying ip
> is, even though it's "special" to spamd, used to determine the ASN, etc.
>
> The sendmail logs (grepped for that messageid) are more useful:
>
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
> from=<as...@lists.digium.com>, size=2735, class=0, nrcpts=1,
> msgid=<ca...@mail.gmail.com>,
> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>
> But again, those give me half the picture (and are on two different
> machines), and I'd need the long msgid line to correlate them.
>
> Is logging output configurable that I could add the value of the "relay="
> line into the output? Or perhaps the value of "X-Envelope-To?"
>
> Also, does spamc have any concept of the "short" (ESMTP) messageid, as
> defined by sendmail's queues? (nBQDcLck027423). In terms of parsing logs,
> this is a much more useful correlation point, since it's that identifier that
> every other milter uses, and every other thing that writes to maillog uses.
> (But I understand if it's not possible since the API is different).
>
> For example, grepping for that self-same messageid, other than spamc, gives
> me the whole story. Sender, recipient, every milter it's been through.
>
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423:
> from=<as...@lists.digium.com>, size=2735, class=0, nrcpts=1,
> msgid=<ca...@mail.gmail.com>,
> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
> insert (1): header: X-DomainKeys: Sendmail DomainKeys Filter v1.0.2
> prime.gushi.org nBQDcLck027423
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
> insert (1): header: Authentication-Results: prime.gushi.org; dkim=none (no
> signature)\n\theader.i=unknown; x-dkim-adsp=none
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
> insert (1): header: X-DKIM: Sendmail DKIM Filter v2.8.3 prime.gushi.org
> nBQDcLck027423
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
> insert (1): header: Authentication-Results: prime.gushi.org; sender-id=pass
> header.sender=asterisk-users-bounces@lists.digium.com; spf=pass
> smtp.mfrom=asterisk-users-bounces@lists.digium.com
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter
> insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0
> prime.gushi.org nBQDcLck027423
> Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter add:
> header: X-Greylist: Default is to whitelist mail, not delayed by
> milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009
> 08:41:49 -0500 (EST)
> Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423:
> to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local,
> pri=33624, dsn=2.0.0, stat=Sent
>
> Thoughts?
>
> -Dan Mahoney
>
>
--
Public key #7BBC68D9 at | Shane Williams
http://pgp.mit.edu/ | System Admin - UT iSchool
=----------------------------------+-------------------------------
All syllogisms contain three lines | shanew@shanew.net
Therefore this is not a syllogism | www.ischool.utexas.edu/~shanew
Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?
Posted by Dave Funk <db...@engineering.uiowa.edu>.
On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:
> Hey there,
>
> Background: Sendmail with spamd running on a different box, spamc called from
> global procmail file.
>
> I'm doing some nightly log-combing to look for interesting patterns,
> including against other network traffic (like erroneous DNS lookups, I think
> I might be on to something).
>
> However, one of the annoying things about spamassassin's logging is that it
> fails to log the remote connecting ip, even though it places it in special
> places in the logs:
>
> take for example:
>
> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org
[snip..]
> Is logging output configurable that I could add the value of the "relay="
> line into the output? Or perhaps the value of "X-Envelope-To?"
>
> Also, does spamc have any concept of the "short" (ESMTP) messageid, as
> defined by sendmail's queues? (nBQDcLck027423). In terms of parsing logs,
> this is a much more useful correlation point, since it's that identifier that
> every other milter uses, and every other thing that writes to maillog uses.
> (But I understand if it's not possible since the API is different).
>
> For example, grepping for that self-same messageid, other than spamc, gives
> me the whole story. Sender, recipient, every milter it's been through.
>
[snip..]
> Thoughts?
>
> -Dan Mahoney
What about using a milter to tie spamassassin into your sendmail instead
of running spamc from procmail? The milter has access to all the desired
info (both from sendmail & spamd) and you can log what ever you desire.
By the time you get to procmail the message is already in the 'delivery'
phase and some of your desired info is no longer available.
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{