Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2006/01/25 23:53:17 UTC

[Bug 4770] New: use ASN data as Bayes token

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770

           Summary: use ASN data as Bayes token
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: minor
          Priority: P5
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jm@jmason.org


Karsten M. Self _aaaages_ ago noted a strong correlation between the ASN a
message was relayed from, and spamminess.  urls:

http://kmself.home.netcom.com/
http://twiki.iwethey.org/Main/SpamByASN
http://linuxmafia.com/~karsten/Images/spam-by-asn.png
http://linuxmafia.com/~karsten/Images/cum-spam-by-asn.png
http://linuxmafia.com/~karsten/monthly-asn-report-current.txt
http://linuxmafia.com/~karsten/Download/procmail-asn-header

I thought we had a bug tracking this, but it appears we didn't.  Anyway, here's
a bugzilla entry for this.

aspath.routeviews.org seems to provide the most useful data:

dig 101.96.218.195.aspath.routeviews.org. IN TXT

;; QUESTION SECTION:
;101.96.218.195.aspath.routeviews.org. IN TXT

;; ANSWER SECTION:
101.96.218.195.aspath.routeviews.org. 86400 IN TXT "12682 6461 3356 8760" "195.218.96.0" "19"

;; AUTHORITY SECTION:
aspath.routeviews.org.  86400   IN      NS      routeviews.org.


that's the ASN numbers it passed through (the AS path), the IP range, and the
CIDR mask length.

the ASN numbers in particular would be good bayes tokens, if the correlation
still stands.
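
For illustration, here is a rough Python sketch of the lookup described above
(not code from any patch on this bug). It builds the query name by reversing
the IPv4 octets and splits the three TXT strings into AS path, network and mask
length. The function names are made up, and treating the last ASN in the path
as the origin AS is an assumption based on the usual AS-path ordering.

```python
import ipaddress


def aspath_query_name(ip, zone="aspath.routeviews.org"):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}."


def parse_aspath_txt(strings):
    """Split the three TXT strings into AS path, network and origin ASN."""
    as_path, prefix, mask = strings
    asns = [int(a) for a in as_path.split()]
    network = ipaddress.ip_network(f"{prefix}/{mask}")
    # Assumption: the last ASN in the path is the one announcing the network.
    return {"as_path": asns, "origin_asn": asns[-1], "network": network}
```

With the dig answer above, parse_aspath_txt() would be given the strings
"12682 6461 3356 8760", "195.218.96.0" and "19".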



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From parkerm@pobox.com  2006-12-17 10:14 -------
Couple of suggestions/comments:

1) I wouldn't cut out just 127.0.0.1 IPs, I'd cut out all private IPs. You can
determine if it's a private IP with the following type of check:
if ($scanner->{relays_external}->[0]->{ip_private}) { .....

You might also want to just limit yourself to relays_untrusted.

2) This part really worries me:

"Please make sure
that your use of the plugin does not overload their infrastructure -
this generally means that B<you should not use this plugin in a
high-volume environment> or that you should use a local mirror of the
zone (see ftp://ftp.routeviews.org/dnszones/)."

With that sort of caveat I'll be -1 for inclusion in the base pkg, you could put
it up on the wiki of course.  I think that if it's included, even in a turned-off
state, enough people will turn it on to possibly cause a problem.
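
To illustrate point 1) above outside of SpamAssassin, here is a small Python
sketch. It is not plugin code: first_public_relay is a made-up name, and the
standard ipaddress module stands in for SA's ip_private flag, which is an
assumption since the two need not classify every address identically.

```python
import ipaddress


def first_public_relay(relay_ips):
    """Return the first relay IP that is neither private nor loopback, else None."""
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if addr.is_private or addr.is_loopback:
            continue  # never send private or loopback addresses to the DNS zone
        return ip
    return None
```

The idea is simply that nothing private ever reaches the routeviews zone, which
also covers the 127.0.0.1 case.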




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2007-01-21 11:37 -------
(In reply to comment #32)

> I'll replace with the default ASF license block, as seen in the other .pm files,
> if that's ok.

That's perfect, thanks.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-16 01:09 -------
Chris Pollock noticed that the plugin gives less accurate results than the
procmail recipe by Karsten M. Self (see links in comment #1), and I've also seen
a number of "non-responses" in my own corpus. 

The procmail recipe uses host(1) with "-R 10" (ten retries upon failure) which
is pretty aggressive but gives more accurate results. 

One possible solution is to set up a local mirror of the asn.routeviews.org zone
using the data from ftp://ftp.routeviews.org/dnszones/. However this data is not
available through rsync (which increases bandwidth) and only in BIND format
(which results in enormous memory consumption). [I just asked them if they would
offer rsync access and another format.]

The second possible solution is to not use SA's check_rbl_txt() but do direct
queries from within the plugin using Net::DNS with appropriate retries etc. 
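
As a sketch of the retry idea only (Python, not Net::DNS or SA code): the
lookup argument is a hypothetical caller-supplied function that returns None on
a non-response, and the default of 3 retries is an arbitrary assumption.

```python
import time


def lookup_with_retries(lookup, name, retries=3, delay=0.0):
    """Call lookup(name) until it returns an answer or the retries run out."""
    for attempt in range(1 + retries):
        answer = lookup(name)
        if answer is not None:
            return answer
        if attempt < retries and delay:
            time.sleep(delay)  # brief pause before the next attempt
    return None  # every attempt was a non-response
```

Each retry adds latency to the scan, so the retry count is a trade-off between
accuracy and delay.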

Would that be acceptable? Is it necessary to do async lookups or are the plugins
themselves called asynchronously? 





Re: [Bug 4770] use ASN data as Bayes token

Posted by Matthias Leisi <ma...@leisi.net>.
> oops.  Daryl just pointed out -- we need to sort out a CLA first...
>
> Matthias, could you fax through a
> CLA?	http://www.apache.org/licenses/#clas

I'm currently in the mountains, but will be back on the weekend (and near
a fax machine). I'll send it asap.

-- Matthias



[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3790 is|0                           |1
           obsolete|                            |




------- Additional Comments From jm@jmason.org  2007-01-15 14:13 -------
Created an attachment (id=3826)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3826&action=view)
patch as applied

oops.  Daryl just pointed out -- we need to sort out a CLA first...

Matthias, could you fax through a CLA?  http://www.apache.org/licenses/#clas




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-19 16:54 -------
Update on the usage of asn.routeviews.org:

1) John Heasly, the developer of the tools around the asn and
aspath.routeviews.org zones, writes by mail:

| As Joel mentioned, we have a daemon specially written to handle these zones.
| I would not expect there to be any problem with the additional load.

2) John is working on an rbldnsd format of the data. There is one missing(?)
feature in rbldnsd (return a default A/TXT record instead of NXDOMAIN) which I'm
taking up with the developer of rbldnsd.

3) For those wanting to set up a local mirror, rsync has been made available at
rsync://archive.routeviews.org/routeviews/dnszones:

| rsync rsync://archive.routeviews.org/routeviews/dnszones/aspath.zone .

IMHO this should solve the concerns brought up in comments #13 and #15.





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-04 10:39 -------
just adding a comment --

nowadays this would be best implemented either as Karsten has (upfront as a
message-annotating filter) or in SpamAssassin as a plugin which annotates the
message using add_header().  both ways expose the data for bayes.  in terms of
what makes sense for SA, the latter is more logical IMO -- and less overhead,
since it reduces forks and message parsing required.
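
A minimal sketch of the "upfront message-annotating filter" variant, in Python
for illustration. The header names X-ASN and X-ASN-CIDR are made up here and
are not what Karsten's recipe or any SA plugin emits; the point is only that
the data ends up in the headers where a Bayes tokenizer can see it.

```python
from email import message_from_string


def annotate_with_asn(raw_message, asn, cidr):
    """Return the message text with ASN headers appended (hypothetical names)."""
    msg = message_from_string(raw_message)
    msg["X-ASN"] = f"AS{asn}"    # hypothetical header name
    msg["X-ASN-CIDR"] = cidr     # hypothetical header name
    return msg.as_string()
```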




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


matthias@leisi.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3786 is|0                           |1
           obsolete|                            |




------- Additional Comments From matthias@leisi.net  2006-12-15 11:43 -------
Created an attachment (id=3787)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3787&action=view)
Plugin to add _ASN_ and _ASNCIDR_ tags (revised)

* Moved to Mail::SpamAssassin::Plugin package
* Cleaned up POD doc, added warning on routeviews.org load
* Cleaned up debug output
* Added warning on zero-length items





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-17 23:55 -------
(In reply to comment #15)

> I would hope that we'd do that for any new DNS test that we include (that's not
> a part of an already used combined zone).  At the very least it'd be nice to
> give the operator a heads up as to why his traffic is suddenly increasing.

I already tried to contact them through the help /at/ routeviews.org address
provided on the site, not specifically for inclusion in SA, but generally for
rsync'ing / mirroring of their zone. 

I haven't received an answer yet -- it may be helpful if somebody has a better
way to contact them (it's hosted by uoregon.edu). 

As to the choice of IP addresses (comment #13 and #14): An update should be
ready later today.





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.2.0




------- Additional Comments From jm@jmason.org  2006-12-15 09:59 -------
thanks Matthias!  yep, I definitely think this should be kept inactive by
default; I'm pretty sure routeviews would not be happy if we shipped it enabled.

aiming (optimistically) at 3.2.0...




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-17 13:38 -------
'With that sort of caveat I'll be -1 for inclusion in the base pkg, you could put
it up on the wiki of course.  I think that if its included, even in turned off
state, that enough people will turn it on to possibly cause a problem.'

hmm -- I'd tend to disagree ;)  I doubt many people would enable it, if it
incurs a latency hit for no immediate increase in accuracy (ie it just generates
additional tokens for bayes).  As such I think it'd be OK to have in the base
distro, commented out.

(Another alternative might be to ask the guy who runs the zone if he'd be ok
with its inclusion in this form, too.)

apart from that -- Michael's comments about IP choice are correct, though...




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2007-01-15 10:23 -------
cool -- thanks!  will apply shortly




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jm@jmason.org  2007-01-22 04:32 -------
ok, re-applied with the ASF license header:

svn commit -m "bug 4770: re-apply Mail::SpamAssassin::Plugin::ASN patch, now
that licensing is sorted.  exposes ASN data as a Bayes token and the _ASNCIDR_
and _ASN_ header-rewriting tags.  thanks to Matthias Leisi <matthias /at/
leisi.net>"
Sending        CREDITS
Sending        MANIFEST
Adding         lib/Mail/SpamAssassin/Plugin/ASN.pm
Sending        rules/v320.pre
Transmitting file data ....
Committed revision 498595.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


matthias@leisi.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3787 is|0                           |1
           obsolete|                            |




------- Additional Comments From matthias@leisi.net  2006-12-17 04:23 -------
Created an attachment (id=3788)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3788&action=view)
Plugin to add _ASN_ and _ASNCIDR_ tags (revised 2) 

* Changed from check_rbl_txt to SA's internal async DNS
* Does not require a 0.001 score any more
* Prepend "AS" to the ASN, eg "AS2828" to make it more distinct





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


matthias@leisi.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3789 is|0                           |1
           obsolete|                            |




------- Additional Comments From matthias@leisi.net  2006-12-18 09:11 -------
Created an attachment (id=3790)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3790&action=view)
Plugin to add _ASN_ and _ASNCIDR_ tags (revised 4)

* Use of $scanner->{relays_untrusted} to determine the IP address to look up,
skipping {ip_private} addresses (comment #13)
* Add multiple responses (more/less specific networks) to _ASNCIDR_,
space-separated (comment #18)
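
To illustrate the second bullet, here is a Python sketch (not the plugin's Perl)
that merges several answers into space-separated tag values, with the "AS"
prefix on the ASN. build_tags is a made-up name, and the exact tag layout and
the de-duplication are assumptions for the sake of the example.

```python
def build_tags(answers):
    """Combine several (asn, prefix, mask) answers into tag strings.

    Duplicates are dropped while keeping first-seen order.
    """
    asns, cidrs = [], []
    for asn, prefix, mask in answers:
        asn_token = f"AS{asn}"           # "AS" prefix makes the token more distinct
        cidr_token = f"{prefix}/{mask}"
        if asn_token not in asns:
            asns.append(asn_token)
        if cidr_token not in cidrs:
            cidrs.append(cidr_token)
    # Assumed layout: one space-separated string per tag
    return {"ASN": " ".join(asns), "ASNCIDR": " ".join(cidrs)}
```

Keeping every answer rather than letting the last one win gives Bayes both the
more specific and the less specific network as tokens.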

Regarding load on routeviews.org infrastructure (comment #15): I received an
answer from the project and we are discussing the potential load implications.
There are in fact three nameservers, but they are inconsistently advertised
through DNS. I'll update here as soon as I have more information. 

What is your guesstimate: How much load (eg queries/day) would a
white-/blacklist receive if it were added to SA and enabled by default? If it
were included, but disabled?





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |need resolution before mass-
                   |                            |checks




------- Additional Comments From jm@jmason.org  2007-01-14 06:25 -------
Michael -- is your veto still in place?  please comment.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2007-01-07 10:18 -------
update: we followed it up just to get a definite answer --

>> Are you fine with the plugin pointing to asn.routeviews.org being
>> distributed (and the additional load this may create)?
>
>That is fine.

so we're good to go, IMO.  Michael?




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
  Status Whiteboard|pre-mass-check              |




------- Additional Comments From jm@jmason.org  2007-01-15 13:33 -------
ok, thanks -- applied:

: jm 1375...; svn commit -m "bug 4770: add ASN.pm plugin, contributed by
Matthias Leisi <matthias at leisi.net>"  lib/Mail/SpamAssassin/Plugin/ASN.pm
MANIFEST rules/v320.pre
Sending        MANIFEST
Adding         lib/Mail/SpamAssassin/Plugin/ASN.pm
Sending        rules/v320.pre
Transmitting file data ...
Committed revision 496501.





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-18 03:05 -------
'Open issue (see "TODO" section in the POD): For some IP addresses, an AS
announces more than one network (more/less specific, eg a.b.c.d/20 and a.b.c.e/23). 

what's the preferred option to handle these more/less specific announcements?
Just add them to the _ASNCIDR_ tag (eg space separated)? Currently the last
answer wins.'

I'd vote for adding all of the answers, space-separated...




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-17 06:22 -------
Created an attachment (id=3789)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3789&action=view)
Plugin to add _ASN_ and _ASNCIDR_ tags (revised 3)

* Tested with 3.1.7 (previously only with 3.1.0 and 3.1.3)
* Fixed typo
* Make sure we do not look up 127.0.0.1 -- relays_external->[0] is
(sometimes?) 127.0.0.1 when called in spamd(8)/Postfix's content_filter context
as opposed to spamassassin(1).
* Removed the _handle_hit call and the "return 1"s, as they caused a default
score of 1.0 on the rule that called the asn_lookup() eval function.
* Added a parameter to the asn_lookup() eval function to specify the number of
simultaneous DNS queries





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From parkerm@pobox.com  2007-01-15 10:14 -------
I remove my veto.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-16 03:41 -------
it's not necessary to do async lookups, no.  If the addition of a few seconds of
latency is acceptable (which it probably will be, IMO), then that may be the
best option. +1

I would suggest using our own frontend for Net::DNS, though,
Mail::SpamAssassin::DnsResolver -- it works around a Net::DNS bug.

10 retries might be overkill, though ;)

(Down the line, there's plenty of time to modify it to use the async lookup
infrastructure.  that would be a good idea so that the
lookup/retry/lookup/retry/... chain can happen in parallel with other rules. not
urgent though.)

Also, it'd be great if asn.routeviews.org was rsyncable -- I'm sure there'd be a
lot of people willing to set up mirrors, too... that would be a good way to get
the zone usable for internet-scale lookups without hammering that guy's personal
infrastructure.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2007-01-21 08:27 -------
JimJag just noted the CLA as received, so I'll apply this rsn...




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-31 06:10 -------
I did some statistics on the ASN data gathered over the past couple of days (see
http://matthias.leisi.net/archives/176-Where-does-your-spam-come-from.html). 

It seems that the ASN data alone is not too helpful, but a combined view on ASN
and prefixes announced by these ASNs (the _ASNCIDR_ tag) helps to identify
"hotspots". In that light, the two tags may well be helpful as Bayes tokens.






[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |




------- Additional Comments From jm@jmason.org  2007-01-17 15:44 -------
btw, it's now possible to file CLAs via email:

'CLAs can be filed
electronically now.  You can send PGP/GPG-signed emails with the
scanned PDFs of the signed CLA form to secretary@apache.org and
legal-archive@apache.org.  This removes the need to fax or send
physical mails of the CLA.'

handy!




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


matthias@leisi.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3788 is|0                           |1
           obsolete|                            |







[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2007-01-02 14:21 -------
Yes, see comment #20. John Heasly wrote by mail:

| I would not expect there to be any problem with the additional load.

I'll forward you the complete mail privately.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-15 10:35 -------
Should I rewrite it so that it fits in the Mail::SpamAssassin::Plugin package? 





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-02-21 16:20 -------
Karsten responded by email...

> > I was just googling around my name and spam, ran across SA's Bug 4770:
> > 
> >     http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770
> > 
> > First off, thanks for filing that ;-)
> > 
> > Second:  I've got a user base now, well, two other guys doing
> > spam-by-asn reporting.  One of them's got stats posted on ASN as well:
> > 
> >     http://spam.thegrebs.com/reports/spam_by_asn.pl
> >     http://spam.thegrebs.com/reports/spam_by_cidr.pl
> >     http://spam.thegrebs.com/reports/spam_by_provider.pl
> > 
> > ....specifics are a tad different from my results, but the overall
> > pattern is the same.
> > 
> > 
> > My own historical stats are posted here:
> > 
> >     http://linuxmafia.com/~karsten/monthly-asn-report
> >     http://linuxmafia.com/~karsten/monthly-cidr-report
> > 
> > ... with data from January, 2004 (with a couple of breaks) by month:
> > 
> >     http://linuxmafia.com/~karsten/monthly-asn-report-200401.txt
> >      .
> >      .
> >      .
> >     http://linuxmafia.com/~karsten/monthly-asn-report-200601.txt
> > 
> >     http://linuxmafia.com/~karsten/monthly-cidr-report200401.txt
> >      .
> >      .
> >      .
> >     http://linuxmafia.com/~karsten/monthly-cidr-report-200601.txt
> > 
> > The general rule has held, for _my_ sample (YMMV) that:
> > 
> >   2-5 ASNs account for 25% of all spam.
> >   11 - 30 ASNs account for 50% of all spam.
> > 
> > ... with the concentration actually increasing for the most part over
> > the study period.
> > 
> > Aggregating by CIDR gives spectacular aggregation -- 25% comes in at
> > about 15 CIDR blocks, but most of the top 60 or so CIDRs are very, very
> > spammy.
> > 
> > I think the really valuable place for this would be in the MTA itself
> > rather than spamassassin, but if there's built-in support, so much the
> > better.
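
To make the concentration figures in Karsten's mail concrete, here is a Python
sketch of how such a number could be computed from a list with one ASN entry
per spam message. asns_needed_for_share is a made-up name; this is not the code
behind his reports.

```python
from collections import Counter


def asns_needed_for_share(asn_per_spam, share):
    """Return how many of the top ASNs are needed to cover `share` of all spam."""
    counts = Counter(asn_per_spam)
    total = sum(counts.values())
    covered = 0
    # Walk the ASNs from most to least frequent, accumulating their messages
    for rank, (_asn, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= share * total:
            return rank
    return 0  # no messages at all
```

Calling it with share=0.25 and share=0.5 on a real corpus would give the "N
ASNs account for 25% / 50% of all spam" figures for that corpus.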




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2007-01-20 08:41 -------
(In reply to comment #30)
CLA completed, signed and submitted to secretary@apache.org and
legal-archive@apache.org (Cc: Justin).

As someone mentioned on the dev list, there is a superfluous copyright line in
the plugin as attached to this bug:

| # <@LICENSE>
| # Copyright 2006 dnswl.org, Matthias Leisi <ma...@leisi.net>

Justin, can you just remove it when you re-apply the patch or shall I attach a
revised version? 




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-15 05:33 -------
Matthias Leisi has written a plugin to do this:

http://matthias.leisi.net/archives/174-ASN-and-SpamAssassin.html




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2007-01-02 10:36 -------
so, just to clarify -- the asn.routeviews.org developers are happy for
(off-by-default) support to be included in SA?




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2006-12-15 10:47 -------
that'd be awesome, thanks ;)




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From jm@jmason.org  2007-01-21 08:03 -------
thanks!

I'll replace with the default ASF license block, as seen in the other .pm files,
if that's ok.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


matthias@leisi.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matthias@leisi.net







[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-15 09:45 -------
Created an attachment (id=3786)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3786&action=view)
Plugin to add _ASN_ and _ASNCIDR_ tags

The plugin jm referred to in comment #3; it has an Apache license (text copied
from one of the other sa source files) to make reuse easy. Code is pod'ed.

Since the asn.routeviews.org zone is handled by a single nameserver only (as I
write this: 128.223.61.18) usage should probably be limited to low-volume
sites. 





[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From matthias@leisi.net  2006-12-17 23:58 -------
Open issue (see "TODO" section in the POD): For some IP addresses, an AS
announces more than one network (more/less specific, eg a.b.c.d/20 and a.b.c.e/23). 

what's the preferred option to handle these more/less specific announcements?
Just add them to the _ASNCIDR_ tag (eg space separated)? Currently the last
answer wins.




[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5                          |P2
  Status Whiteboard|need resolution before mass-|pre-mass-check
                   |checks                      |







[Bug 4770] use ASN data as Bayes token

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770





------- Additional Comments From spamassassin@dostech.ca  2006-12-17 14:07 -------
(In reply to comment #14)
> (Another alternative might be to ask the guy who runs the zone if he'd be ok
> with its inclusion in this form, too.)

I would hope that we'd do that for any new DNS test that we include (that's not
a part of an already used combined zone).  At the very least it'd be nice to
give the operator a heads up as to why his traffic is suddenly increasing.


