You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by "Dan Mahoney, System Admin" <da...@prime.gushi.org> on 2009/12/26 15:12:36 UTC

Adding remote-ip/ESMTPID/X-Envelope to logging output?

Hey there,

Background: Sendmail with spamd running on a different box, spamc called 
from global procmail file.

I'm doing some nightly log-combing to look for interesting patterns, 
including against other network traffic (like erroneous DNS lookups, I 
think I might be on to something).

However, one of the annoying things about spamassassin's logging is that 
it fails to log the remote connecting ip, even though it places it in 
special places in the logs:

take for example:

Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org 
[72.9.101.130] at port 62430
Dec 26 08:41:51 quark spamd[87490]: spamd: processing message 
<ca...@mail.gmail.com> for danm:58
Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has 
less than -5 points (-6.601).
Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for 
danm:58 in 1.9 seconds, 3788 bytes.
Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 - 
AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS 
scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no

>From those logs, there's nothing at all that tells me what the relaying ip 
is, even though it's "special" to spamd, used to determine the ASN, etc.

The sendmail logs (grepped for that messageid) are more useful:

Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
from=<as...@lists.digium.com>, size=2735, class=0, 
nrcpts=1, 
msgid=<ca...@mail.gmail.com>, 
proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]

But again, those give me half the picture (and are on two different 
machines), and I'd need the long msgid line to correlate them.

Is logging output configurable that I could add the value of the "relay=" 
line into the output?  Or perhaps the value of "X-Envelope-To?"

Also, does spamc have any concept of the "short" (ESMTP) messageid, as 
defined by sendmail's queues? (nBQDcLck027423).  In terms of parsing logs, 
this is a much more useful correlation point, since it's that identifier 
that every other milter uses, and every other thing that writes to maillog 
uses. (But I understand if it's not possible since the API is different).

For example, grepping for that self-same messageid, other than spamc, 
gives me the whole story.  Sender, recipient, every milter it's been 
through.

Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
from=<as...@lists.digium.com>, size=2735, class=0, 
nrcpts=1, 
msgid=<ca...@mail.gmail.com>, 
proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
insert (1): header: X-DomainKeys:  Sendmail DomainKeys Filter v1.0.2 
prime.gushi.org nBQDcLck027423
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
insert (1): header: Authentication-Results:  prime.gushi.org; dkim=none 
(no signature)\n\theader.i=unknown; x-dkim-adsp=none
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
insert (1): header: X-DKIM:  Sendmail DKIM Filter v2.8.3 prime.gushi.org 
nBQDcLck027423
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
insert (1): header: Authentication-Results: prime.gushi.org; 
sender-id=pass header.sender=asterisk-users-bounces@lists.digium.com; 
spf=pass smtp.mfrom=asterisk-users-bounces@lists.digium.com
Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0 
prime.gushi.org nBQDcLck027423
Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
add: header: X-Greylist: Default is to whitelist mail, not delayed by 
milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009 
08:41:49 -0500 (EST)
Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423: 
to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local, 
pri=33624, dsn=2.0.0, stat=Sent

Thoughts?

-Dan Mahoney

-- 

"When I'm lost, and confused, and trying to make a U-turn, nothing annoys
me more than someone telling me to watch out for the tombstone!"

"How often does that happen, Fab?"

-David Feld & Tom Fabry, sometime in High School.

--------Dan Mahoney--------
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144   AIM: LarpGM
Site:  http://www.gushi.org
---------------------------


Re: [sa-list] Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?

Posted by John Hardin <jh...@impsec.org>.
On Mon, 28 Dec 2009, Dan Mahoney, System Admin wrote:

> Somewhere within the spamd guts there's a print/printf line that prints 
> that last line, and is supplied a list of variables.  I mean only to add 
> a couple more.  I'm quite surprised it's not a tunable.
>
> I'm also surprised that, unline sendmail, spamd doesn't put a single 
> token in EVERY logline, even if that's a unique messageid known only to 
> SA (as is the case with sendmail logs).

May I suggest opening an enhancement bug for that so that the devs 
become/remain aware of your request/suggestion?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The fetters imposed on liberty at home have ever been forged out
   of the weapons provided for defense against real, pretended, or
   imaginary dangers from abroad.               -- James Madison, 1799
-----------------------------------------------------------------------
  80 days since President Obama won the Nobel "Not George W. Bush" prize

Re: [sa-list] Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?

Posted by "Dan Mahoney, System Admin" <da...@prime.gushi.org>.
On Sun, 27 Dec 2009, Shane Williams wrote:

> One way to find what you want is to grab the msg id (or mid) from the
> spamd line, and grep for that out of the sendmail log for the remote
> IP.  As I recall when I wrote something that searched like this, I had
> to do some special character quoting on some of the mid's, but did
> finally get it working.

Yeah, it's possible to have my parser do that kind of cross-correlation, 
and I imagine it's what I'll have to do (along with using syslog to send 
files from one server to the other), as well as keeping them local.

I just figured if there was a single local.cf tweak I could add that would 
change my logline, that might be the easier and more correct way, as 
getting the score, plus all the rules matched, plus the long-messageid, 
plus the short messageID is rather complex.  (Even within just spamd it 
requires looking at multiple lines)

For example, this line contains the score:

Dec 28 02:37:35 quark spamd[9203]: spamd: identified spam (20.1/5.0) for 
danm:58 in 0.4 seconds, 3920 bytes.

But this one, which contains almost everything else of use, does NOT 
contain the decimal score, and there's nothing there at ALL to 
cross-correlate them (and this is running in debugmode).  I can guess 
based on the size, score, scantime and uid, but those are hardly unique, 
especially 
during a deluge.

Dec 28 02:37:35 quark spamd[9203]: spamd: result: Y 20 - 
ANY_BOUNCE_MESSAGE,BOUNCE_MESSAGE 
scantime=0.4,size=3920,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=53762,mid=<20...@prime.gushi.org>,bayes=0.001253,autolearn=disabled,shortcircuit=spam

Somewhere within the spamd guts there's a print/printf line that prints 
that last line, and is supplied a list of variables.  I mean only to add a 
couple more.  I'm quite surprised it's not a tunable.

I'm also surprised that, unline sendmail, spamd doesn't put a single token 
in EVERY logline, even if that's a unique messageid known only to SA (as 
is the case with sendmail logs).

-Dan

>
> On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:
>
>> Hey there,
>> 
>> Background: Sendmail with spamd running on a different box, spamc called 
>> from global procmail file.
>> 
>> I'm doing some nightly log-combing to look for interesting patterns, 
>> including against other network traffic (like erroneous DNS lookups, I 
>> think I might be on to something).
>> 
>> However, one of the annoying things about spamassassin's logging is that it 
>> fails to log the remote connecting ip, even though it places it in special 
>> places in the logs:
>> 
>> take for example:
>> 
>> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org 
>> [72.9.101.130] at port 62430
>> Dec 26 08:41:51 quark spamd[87490]: spamd: processing message 
>> <ca...@mail.gmail.com> for danm:58
>> Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has 
>> less than -5 points (-6.601).
>> Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for 
>> danm:58 in 1.9 seconds, 3788 bytes.
>> Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 - 
>> AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS 
>> scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no
>> 
>>> From those logs, there's nothing at all that tells me what the relaying ip 
>> is, even though it's "special" to spamd, used to determine the ASN, etc.
>> 
>> The sendmail logs (grepped for that messageid) are more useful:
>> 
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
>> from=<as...@lists.digium.com>, size=2735, class=0, 
>> nrcpts=1, 
>> msgid=<ca...@mail.gmail.com>, 
>> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>> 
>> But again, those give me half the picture (and are on two different 
>> machines), and I'd need the long msgid line to correlate them.
>> 
>> Is logging output configurable that I could add the value of the "relay=" 
>> line into the output?  Or perhaps the value of "X-Envelope-To?"
>> 
>> Also, does spamc have any concept of the "short" (ESMTP) messageid, as 
>> defined by sendmail's queues? (nBQDcLck027423).  In terms of parsing logs, 
>> this is a much more useful correlation point, since it's that identifier 
>> that every other milter uses, and every other thing that writes to maillog 
>> uses. (But I understand if it's not possible since the API is different).
>> 
>> For example, grepping for that self-same messageid, other than spamc, gives 
>> me the whole story.  Sender, recipient, every milter it's been through.
>> 
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
>> from=<as...@lists.digium.com>, size=2735, class=0, 
>> nrcpts=1, 
>> msgid=<ca...@mail.gmail.com>, 
>> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> insert (1): header: X-DomainKeys:  Sendmail DomainKeys Filter v1.0.2 
>> prime.gushi.org nBQDcLck027423
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> insert (1): header: Authentication-Results:  prime.gushi.org; dkim=none (no 
>> signature)\n\theader.i=unknown; x-dkim-adsp=none
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> insert (1): header: X-DKIM:  Sendmail DKIM Filter v2.8.3 prime.gushi.org 
>> nBQDcLck027423
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> insert (1): header: Authentication-Results: prime.gushi.org; sender-id=pass 
>> header.sender=asterisk-users-bounces@lists.digium.com; spf=pass 
>> smtp.mfrom=asterisk-users-bounces@lists.digium.com
>> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0 
>> prime.gushi.org nBQDcLck027423
>> Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
>> add: header: X-Greylist: Default is to whitelist mail, not delayed by 
>> milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009 
>> 08:41:49 -0500 (EST)
>> Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423: 
>> to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local, 
>> pri=33624, dsn=2.0.0, stat=Sent
>> 
>> Thoughts?
>> 
>> -Dan Mahoney
>> 
>> 
>
>

-- 

"And, a special guest, from the future, miss Ria Pischell.  Miss Pischell,
as you all know, is the inventor of the Statiophonic Oxygenetic
Amplifiagraphaphonadelaverberator, and it's pretty hard to imagine life
without one of those.

-Rufus, Bill & Ted's Bogus Journey


--------Dan Mahoney--------
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144   AIM: LarpGM
Site:  http://www.gushi.org
---------------------------


Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?

Posted by Shane Williams <sh...@shanew.net>.
One way to find what you want is to grab the msg id (or mid) from the
spamd line, and grep for that out of the sendmail log for the remote
IP.  As I recall when I wrote something that searched like this, I had
to do some special character quoting on some of the mid's, but did
finally get it working.

On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:

> Hey there,
>
> Background: Sendmail with spamd running on a different box, spamc called from 
> global procmail file.
>
> I'm doing some nightly log-combing to look for interesting patterns, 
> including against other network traffic (like erroneous DNS lookups, I think 
> I might be on to something).
>
> However, one of the annoying things about spamassassin's logging is that it 
> fails to log the remote connecting ip, even though it places it in special 
> places in the logs:
>
> take for example:
>
> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org 
> [72.9.101.130] at port 62430
> Dec 26 08:41:51 quark spamd[87490]: spamd: processing message 
> <ca...@mail.gmail.com> for danm:58
> Dec 26 08:41:53 quark spamd[87490]: FuzzyOcr: Scan canceled, message has less 
> than -5 points (-6.601).
> Dec 26 08:41:53 quark spamd[87490]: spamd: clean message (-6.6/5.0) for 
> danm:58 in 1.9 seconds, 3788 bytes.
> Dec 26 08:41:53 quark spamd[87490]: spamd: result: . -6 - 
> AWL,BAYES_00,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS 
> scantime=1.9,size=3788,user=danm,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=62430,mid=<ca...@mail.gmail.com>,bayes=0.000000,autolearn=ham,shortcircuit=no
>
>> From those logs, there's nothing at all that tells me what the relaying ip 
> is, even though it's "special" to spamd, used to determine the ASN, etc.
>
> The sendmail logs (grepped for that messageid) are more useful:
>
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
> from=<as...@lists.digium.com>, size=2735, class=0, nrcpts=1, 
> msgid=<ca...@mail.gmail.com>, 
> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
>
> But again, those give me half the picture (and are on two different 
> machines), and I'd need the long msgid line to correlate them.
>
> Is logging output configurable that I could add the value of the "relay=" 
> line into the output?  Or perhaps the value of "X-Envelope-To?"
>
> Also, does spamc have any concept of the "short" (ESMTP) messageid, as 
> defined by sendmail's queues? (nBQDcLck027423).  In terms of parsing logs, 
> this is a much more useful correlation point, since it's that identifier that 
> every other milter uses, and every other thing that writes to maillog uses. 
> (But I understand if it's not possible since the API is different).
>
> For example, grepping for that self-same messageid, other than spamc, gives 
> me the whole story.  Sender, recipient, every milter it's been through.
>
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: 
> from=<as...@lists.digium.com>, size=2735, class=0, nrcpts=1, 
> msgid=<ca...@mail.gmail.com>, 
> proto=ESMTP, daemon=MTA, relay=lists.digium.com [216.207.245.17]
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
> insert (1): header: X-DomainKeys:  Sendmail DomainKeys Filter v1.0.2 
> prime.gushi.org nBQDcLck027423
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
> insert (1): header: Authentication-Results:  prime.gushi.org; dkim=none (no 
> signature)\n\theader.i=unknown; x-dkim-adsp=none
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
> insert (1): header: X-DKIM:  Sendmail DKIM Filter v2.8.3 prime.gushi.org 
> nBQDcLck027423
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
> insert (1): header: Authentication-Results: prime.gushi.org; sender-id=pass 
> header.sender=asterisk-users-bounces@lists.digium.com; spf=pass 
> smtp.mfrom=asterisk-users-bounces@lists.digium.com
> Dec 26 08:38:23 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter 
> insert (1): header: X-SenderID: Sendmail Sender-ID Filter v1.0.0 
> prime.gushi.org nBQDcLck027423
> Dec 26 08:38:24 <mail.info> prime sm-mta[27423]: nBQDcLck027423: Milter add: 
> header: X-Greylist: Default is to whitelist mail, not delayed by 
> milter-greylist-4.0.1 (prime.gushi.org [72.9.101.130]); Sat, 26 Dec 2009 
> 08:41:49 -0500 (EST)
> Dec 26 08:38:28 <mail.info> prime sm-mta[27436]: nBQDcLck027423: 
> to=<da...@prime.gushi.org>, delay=00:00:05, xdelay=00:00:03, mailer=local, 
> pri=33624, dsn=2.0.0, stat=Sent
>
> Thoughts?
>
> -Dan Mahoney
>
>

-- 
Public key #7BBC68D9 at            |                 Shane Williams
http://pgp.mit.edu/                |      System Admin - UT iSchool
=----------------------------------+-------------------------------
All syllogisms contain three lines |              shanew@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew

Re: Adding remote-ip/ESMTPID/X-Envelope to logging output?

Posted by Dave Funk <db...@engineering.uiowa.edu>.
On Sat, 26 Dec 2009, Dan Mahoney, System Admin wrote:

> Hey there,
>
> Background: Sendmail with spamd running on a different box, spamc called from 
> global procmail file.
>
> I'm doing some nightly log-combing to look for interesting patterns, 
> including against other network traffic (like erroneous DNS lookups, I think 
> I might be on to something).
>
> However, one of the annoying things about spamassassin's logging is that it 
> fails to log the remote connecting ip, even though it places it in special 
> places in the logs:
>
> take for example:
>
> Dec 26 08:41:51 quark spamd[87490]: spamd: connection from prime.gushi.org 
[snip..]
> Is logging output configurable that I could add the value of the "relay=" 
> line into the output?  Or perhaps the value of "X-Envelope-To?"
>
> Also, does spamc have any concept of the "short" (ESMTP) messageid, as 
> defined by sendmail's queues? (nBQDcLck027423).  In terms of parsing logs, 
> this is a much more useful correlation point, since it's that identifier that 
> every other milter uses, and every other thing that writes to maillog uses. 
> (But I understand if it's not possible since the API is different).
>
> For example, grepping for that self-same messageid, other than spamc, gives 
> me the whole story.  Sender, recipient, every milter it's been through.
>
[snip..]
> Thoughts?
>
> -Dan Mahoney

What about using a milter to tie spamassassin into your sendmail instead 
of running spamc from procmail? The milter has access to all the desired 
info (both from sendmail & spamd) and you can log what ever you desire.

By the time you get to procmail the message is already in the 'delivery'
phase and some of your desired info is no longer available.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{