Posted to users@spamassassin.apache.org by John Hardin <jh...@impsec.org> on 2009/04/29 17:18:45 UTC

419 emailBL?

On Wed, 29 Apr 2009, Jesse Thompson wrote:

> A word of caution.  Be very careful how you use the list.  The intended 
> usage for the list is to prevent (or monitor) local users from sending 
> email to the listed addresses.  The phishers frequently use compromised 
> end-user accounts to receive the phishing replies, so there is a high 
> risk of false positives, especially if you attempt to classify messages 
> containing one of these addresses as spam.

Thread fork!

Would it be useful to have a similar list for 419 fraud contact addresses?

Discuss...

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   You do not examine legislation in the light of the benefits it
   will convey if properly administered, but in the light of the
   wrongs it would do and the harms it would cause if improperly
   administered.                                  -- Lyndon B. Johnson
-----------------------------------------------------------------------
  9 days until the 64th anniversary of VE day

Re: 419 emailBL?

Posted by Henrik K <he...@hege.li>.
On Mon, May 04, 2009 at 10:51:14PM +0200, mouss wrote:
>
> That said, I am surprised because you defended the fact that the
> freemail plugin includes the list of freemail domains...

Think about it. There are maybe a few thousand freemail domains, and they
hardly change. Why would that require realtime updating? They can simply be
updated with sa-update. It's strange that someone would have to "defend" this.

> This wasn't intended as a list to download. I didn't even check the
> license. I was simply replying to your "I'm surprised there still hasn't
> been an emailBL around". or if you prefer: the idea of an "emailBL
> around" isn't new. note also that SARE has a ruleset with phone numbers
> and "snail mail" infos found in spam.

Ideas are one thing, but implementing this is actually simple. We are
already at the alpha stage, with a nearly finished plugin for SA, and are
harvesting lots of addresses. Results remain to be seen..


Re: 419 emailBL?

Posted by mouss <mo...@ml.netoyen.net>.
Henrik K wrote:
> On Sun, May 03, 2009 at 06:25:01PM +0200, mouss wrote:
>> I can't use a dnsbl on recipient addresses in postfix. This requires
>> additional code (exceptionally if the records are hashed...). MySQL on
>> the other hand is supported by many daemons. Sure, SA would need a mysql
>> access db plugin, but that would be beneficial for other things I think.
> 
> MySQL is not a global solution.
> 

it is for me since I use it. I don't see why I should load gigabytes into
rbldnsd when I can query mysql. but I agree that this is a personal
view. so let's leave it at that.

That said, I am surprised because you defended the fact that the
freemail plugin includes the list of freemail domains...

> Fixing up a postfix policyd is no problem and exim supports it out of the
> box, md5 is hardly an "exceptional" function.
> 

sure.

>>> Personally I'm only interested in "freemails", I don't know how feasible it
>>> would be to create a global email blacklist. 419/phishers are pretty much
>>> the only spam that's hard to catch. I'm surprised there still hasn't been an
>>> emailBL around, 
>>
>> http://www.419scam.org/419-bl.htm
> 
> Sorry I'm not interested in wgetting a humongous list, which happens also to
> be 2 days old, also no mention of anything about freshness. :)
> 

This wasn't intended as a list to download. I didn't even check the
license. I was simply replying to your "I'm surprised there still hasn't
been an emailBL around". or if you prefer: the idea of an "emailBL
around" isn't new. note also that SARE has a ruleset with phone numbers
and "snail mail" infos found in spam.

Re: 419 emailBL?

Posted by Henrik K <he...@hege.li>.
On Sun, May 03, 2009 at 06:25:01PM +0200, mouss wrote:
>
> I can't use a dnsbl on recipient addresses in postfix. This requires
> additional code (exceptionally if the records are hashed...). MySQL on
> the other hand is supported by many daemons. Sure, SA would need a mysql
> access db plugin, but that would be beneficial for other things I think.

MySQL is not a global solution.

Fixing up a postfix policyd is no problem and exim supports it out of the
box, md5 is hardly an "exceptional" function.
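
A minimal sketch of how a client might build the name to query in a hashed
email blocklist. The zone name emailbl.example.org is hypothetical (no zone is
named in this thread), and lowercasing the whole address before hashing is an
assumption about how such a list would normalise its entries.

```python
import hashlib

# Hypothetical zone; the thread does not name a real one.
ZONE = "emailbl.example.org"


def query_name(address: str, zone: str = ZONE) -> str:
    """Return the DNS name a client would look up for an email address."""
    # Assumed normalisation: trim whitespace and lowercase, so that
    # "Foo@Example.COM " and "foo@example.com" map to the same record.
    normalised = address.strip().lower()
    # An MD5 hex digest is 32 characters, which fits in one DNS label (max 63).
    digest = hashlib.md5(normalised.encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"
```

Any MTA policy daemon or plugin that can compute MD5 and issue an A or TXT
query could use a name built this way; the hash also avoids putting "@" and
other awkward characters into the query.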

> > Personally I'm only interested in "freemails", I don't know how feasible it
> > would be to create a global email blacklist. 419/phishers are pretty much
> > the only spam that's hard to catch. I'm surprised there still hasn't been an
> > emailBL around, 
> 
> 
> http://www.419scam.org/419-bl.htm

Sorry I'm not interested in wgetting a humongous list, which happens also to
be 2 days old, also no mention of anything about freshness. :)


Re: 419 emailBL?

Posted by mouss <mo...@ml.netoyen.net>.
Benny Pedersen wrote:
> On Sun, May 3, 2009 18:25, mouss wrote:
>> stock postfix. something I can't do with a dnsbl since there is no
>> reject_rhsbl_recipient...
> 

correction: There is no DNSBL check that acts on the full email address.
reject_rhsbl_recipient will look up only the domain part.

> http://www.docunext.com/blog/2006/12/07/sorbs-settings/

or simply

http://www.postfix.org/postconf.5.html#reject_rhsbl_recipient

Re: 419 emailBL?

Posted by Benny Pedersen <me...@junc.org>.
On Sun, May 3, 2009 18:25, mouss wrote:
> stock postfix. something I can't do with a dnsbl since there is no
> reject_rhsbl_recipient...

http://www.docunext.com/blog/2006/12/07/sorbs-settings/

-- 
http://localhost/ 100% uptime and 100% mirrored :)


Re: 419 emailBL?

Posted by mouss <mo...@ml.netoyen.net>.
Henrik K wrote:
> On Sun, May 03, 2009 at 03:14:22PM +0200, mouss wrote:
>> Henrik K wrote:
>>> On Sun, May 03, 2009 at 03:40:47AM +0200, mouss wrote:
>>>> with rsync or the like, you can simply add the addresses (no MD5, no
>>>> anything) to an access list that your MTA can use.
>>> You don't get free rsyncs for big players like uribl for reason (um, traffic
>>> etc?).
>> some DNSBLs are available via rsync.
>>
>> $ wc -l psbl.txt
>>  1494939 psbl.txt
>> $ ls -l psbl.txt
>> ... 20969353 ...
> 
> Like I said, no one is stopping offering it. It's up to the list or someone
> donating resources to such list. But the bigger/more popular the list,
> harder it is to create a reliable rsync-network that can handle hordes of
> clients checking stuff every 15 minutes.
> 
>>> If we had a big emailbl, obviously it would be impractical as well.
>>> You really want to be updated every 5-15 minutes, which DNS allows.
>>>
>> It is possible to use a mechanism similar to SA update:
>> - use DNS to see if there is an update
>> - if so, download changes since some recent version
> 
> See the DNS part? You already got answer there so why complicate things? ;)
> 

not the same.

1- here, you do one dns check every 5-15 minutes. the number has nothing
to do with the amount of mail you see.
2- and the query is not done while checking mail. it's asynchronous and
adds no latency to mail checking.
3- it requires no integration with MTA or whatever. I can use this with
stock postfix. something I can't do with a dnsbl since there is no
reject_rhsbl_recipient...



>>> Of course no one stops such list offering the plain text emails as plain
>>> file. But do you want potentially millions of emails in a file?
>>>
>> 1- I prefer that over latency
> 
> You can use rbldnsd, if the data is available.. I just meant why would you
> want to have a complicated setup, especially if you are going to use the
> data possibly on several levels (MTA, SA). Transferring files around and
> reloading daemons is silly.
> 

I can't use a dnsbl on recipient addresses in postfix. This requires
additional code (exceptionally if the records are hashed...). MySQL on
the other hand is supported by many daemons. Sure, SA would need a mysql
access db plugin, but that would be beneficial for other things I think.

(and with local data, you can support regular expressions [except for
the "simple" wildcard things]. AFAIK, rbldnsd doesn't support these).

>> - the disabled addresses do not need to be "shared" anymore.
> 
> I'm asking because I don't know: is that reality? Do you get confirmation
> from i.e. gmail that some account is disabled? From the list point of view
> it's simple enough to wait a month or so to see if the email is still found
> in spams. Reporting etc is another thing and not necessarily concern of the
> list.
> 

I have no evidence for email addresses, but fraud domains/subdomains get
disabled (except at "uncollaborative" sites or registrars. but there I
blacklist the whole domain...).

> Personally I'm only interested in "freemails", I don't know how feasible it
> would be to create a global email blacklist. 419/phishers are pretty much
> the only spam that's hard to catch. I'm surprised there still hasn't been an
> emailBL around, 


http://www.419scam.org/419-bl.htm


> but maybe this time it becomes reality.. at least to have
> some scoring in SA.
> 
>> I don't have a "fixed" opinion. I am just trying to see if using the
>> well-known dns hack (dnsbl) is the best choice.
> 
> DNS is simple and effective remote database for simple queries. Unless
> someone invents even better and easy to use global solution.
> 



Re: 419 emailBL?

Posted by Henrik K <he...@hege.li>.
On Sun, May 03, 2009 at 03:14:22PM +0200, mouss wrote:
> Henrik K wrote:
> > On Sun, May 03, 2009 at 03:40:47AM +0200, mouss wrote:
> >> with rsync or the like, you can simply add the addresses (no MD5, no
> >> anything) to an access list that your MTA can use.
> > 
> > You don't get free rsyncs for big players like uribl for reason (um, traffic
> > etc?).
> 
> some DNSBLs are available via rsync.
> 
> $ wc -l psbl.txt
>  1494939 psbl.txt
> $ ls -l psbl.txt
> ... 20969353 ...

Like I said, no one is stopping anyone from offering it. It's up to the list
or someone donating resources to such a list. But the bigger/more popular the
list, the harder it is to create a reliable rsync network that can handle
hordes of clients checking stuff every 15 minutes.

> > If we had a big emailbl, obviously it would be impractical as well.
> > You really want to be updated every 5-15 minutes, which DNS allows.
> > 
> 
> It is possible to use a mechanism similar to SA update:
> - use DNS to see if there is an update
> - if so, download changes since some recent version

See the DNS part? You already got answer there so why complicate things? ;)

> > Of course no one stops such list offering the plain text emails as plain
> > file. But do you want potentially millions of emails in a file?
> > 
> 
> 1- I prefer that over latency

You can use rbldnsd, if the data is available.. I just meant why would you
want to have a complicated setup, especially if you are going to use the
data possibly on several levels (MTA, SA). Transferring files around and
reloading daemons is silly.

> - the disabled addresses do not need to be "shared" anymore.

I'm asking because I don't know: is that reality? Do you get confirmation
from e.g. gmail that some account is disabled? From the list's point of view
it's simple enough to wait a month or so to see if the email is still found
in spam. Reporting etc. is another thing and not necessarily a concern of the
list.
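
The "wait a month or so" expiry could be sketched roughly as follows. The
30-day window and the last_seen mapping (address to the time it was last
found in spam) are assumptions made for illustration, not something any list
in this thread is known to use.

```python
from datetime import datetime, timedelta

# Assumed retention window, taken from "a month or so" above.
MAX_AGE = timedelta(days=30)


def expire(last_seen: dict[str, datetime], now: datetime,
           max_age: timedelta = MAX_AGE) -> dict[str, datetime]:
    """Keep only the addresses that were seen in spam within max_age of now."""
    return {addr: seen for addr, seen in last_seen.items()
            if now - seen <= max_age}
```

With this approach the list needs no confirmation from the mail provider: an
address simply drops off once it stops showing up in spam.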

Personally I'm only interested in "freemails", I don't know how feasible it
would be to create a global email blacklist. 419/phishers are pretty much
the only spam that's hard to catch. I'm surprised there still hasn't been an
emailBL around, but maybe this time it becomes reality.. at least to have
some scoring in SA.

> I don't have a "fixed" opinion. I am just trying to see if using the
> well-known dns hack (dnsbl) is the best choice.

DNS is a simple and effective remote database for simple queries. Unless
someone invents an even better and easier-to-use global solution.

Cheers,
Henrik

Re: 419 emailBL?

Posted by mouss <mo...@ml.netoyen.net>.
Henrik K wrote:
> On Sun, May 03, 2009 at 03:40:47AM +0200, mouss wrote:
>> with rsync or the like, you can simply add the addresses (no MD5, no
>> anything) to an access list that your MTA can use.
> 
> You don't get free rsyncs for big players like uribl for reason (um, traffic
> etc?).

some DNSBLs are available via rsync.

$ wc -l psbl.txt
 1494939 psbl.txt
$ ls -l psbl.txt
... 20969353 ...


> If we had a big emailbl, obviously it would be impractical as well.
> You really want to be updated every 5-15 minutes, which DNS allows.
> 

It is possible to use a mechanism similar to SA update:
- use DNS to see if there is an update
- if so, download changes since some recent version
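
The two steps above can be sketched as below. This is only an illustration:
the serial is assumed to be published as a plain integer in a DNS TXT record
(as sa-update does for channel versions), the TXT lookup and the download are
left to the caller, and the "+address" / "-address" delta format is invented
for this example.

```python
def needs_update(local_serial: int, published_txt: str) -> bool:
    """Compare the local serial with the one published in a DNS TXT record."""
    # TXT data often arrives wrapped in double quotes; strip them first.
    return int(published_txt.strip().strip('"')) > local_serial


def apply_delta(addresses: set[str], delta_lines: list[str]) -> set[str]:
    """Apply a delta in the assumed format: '+addr' adds, '-addr' removes."""
    updated = set(addresses)  # work on a copy, leave the input untouched
    for line in delta_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        op, addr = line[0], line[1:].strip().lower()
        if op == "+":
            updated.add(addr)
        elif op == "-":
            updated.discard(addr)
        else:
            raise ValueError(f"unrecognised delta line: {line!r}")
    return updated
```

The DNS check runs on a timer, independent of mail flow, so the number of
queries does not grow with the amount of mail handled.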


> Of course no one stops such list offering the plain text emails as plain
> file. But do you want potentially millions of emails in a file?
> 

1- I prefer that over latency
2- do we _now_ have millions of such addresses? if not, premature
optimization...


here is how I see things:

- criminals (AFF, phish, ...) use some email addresses
- these addresses get listed
- the addresses are reported to domains owners
- domain owners disable these addresses (if the domain owner is the
criminal, then the full domain can be listed, and/or it can be reported
to the registrar... etc.)
- the disabled addresses do not need to be "shared" anymore.
- ... etc


I don't have a "fixed" opinion. I am just trying to see if using the
well-known dns hack (dnsbl) is the best choice.


Re: 419 emailBL?

Posted by Henrik K <he...@hege.li>.
On Sun, May 03, 2009 at 03:40:47AM +0200, mouss wrote:
> 
> with rsync or the like, you can simply add the addresses (no MD5, no
> anything) to an access list that your MTA can use.

You don't get free rsyncs for big players like uribl for a reason (um, traffic
etc?). If we had a big emailbl, obviously it would be impractical as well.
You really want to be updated every 5-15 minutes, which DNS allows.

Of course no one stops such list offering the plain text emails as plain
file. But do you want potentially millions of emails in a file?


Re: [SA] 419 emailBL?

Posted by Adam Katz <an...@khopis.com>.
>> And if bandwidth at the server is a problem, would publishing the ruleset
>> updates via the Coral Cache network work?
> 
> Unfortunately, no.  In fact, they kind of suck as a CDN.  We
> originally were putting updates through there and would regularly have
> issues w/ 404s, corrupt or incomplete downloads, etc.
> 
> It may have improved since the 2005 or so timeframe when we started w/
> updates, but ...  Haven't checked in a while.

Still has the same issues.  I'll be removing them from my sa-update
channels mirror files very soon.

Re: 419 emailBL?

Posted by John Hardin <jh...@impsec.org>.
On Wed, 29 Apr 2009, Theo Van Dinter wrote:

> On Wed, Apr 29, 2009 at 8:06 PM, John Hardin <jh...@impsec.org> wrote:
>>> And 135k doesn't add up to a lot of bandwidth?
>> And if bandwidth at the server is a problem, would publishing the ruleset
>> updates via the Coral Cache network work?
>
> Unfortunately, no.  In fact, they kind of suck as a CDN.  We
> originally were putting updates through there and would regularly have
> issues w/ 404s, corrupt or incomplete downloads, etc.
>
> It may have improved since the 2005 or so timeframe when we started w/
> updates, but ...  Haven't checked in a while.

I've edited my MIRRORED.BY, we'll see how it goes...

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The real opiate of the masses isn't religion; it's the belief that
   somewhere there is a benefit that can be delivered without a
   corresponding cost.                       -- Tom of "Radio Free NJ"
-----------------------------------------------------------------------
  9 days until the 64th anniversary of VE day

Re: 419 emailBL?

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Apr 29, 2009 at 8:06 PM, John Hardin <jh...@impsec.org> wrote:
>> And 135k doesn't add up to a lot of bandwidth?
>
> ...so don't look for updates more than once every day or two.

Yeah, but I think the point was that a frequently changing ruleset
would be downloaded frequently.

> And if bandwidth at the server is a problem, would publishing the ruleset
> updates via the Coral Cache network work?

Unfortunately, no.  In fact, they kind of suck as a CDN.  We
originally were putting updates through there and would regularly have
issues w/ 404s, corrupt or incomplete downloads, etc.

It may have improved since the 2005 or so timeframe when we started w/
updates, but ...  Haven't checked in a while.

Re: 419 emailBL?

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Apr 29, 2009 at 7:56 PM, Adam Katz <an...@khopis.com> wrote:
>> I guess it depends what you mean by "enormous".  A sought rule update is 135k.
>
> And 135k doesn't add up to a lot of bandwidth?  I suppose it depends
> on the number of users, and I'm figuring worst-case scenario, e.g.
> when/if it ships enabled in the default SA install.

Well, it depends what you're measuring.  :)

The update itself isn't large; it's just 135k, which is the "not
enormous" bit.  135k in and of itself is a pretty tiny file, but I'm
not sure what "enormous" means in this context -- megs?  gigs?

The aggregate bandwidth could very well be large, depending on update
publish frequency, client update frequency, number of clients, client
bandwidth, etc.  From what I've seen, the standard SA updates w/ the
same ~130k size and the current number of users ... isn't a lot of
bandwidth.
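
To put rough numbers on the aggregate figure, here is a back-of-the-envelope
calculation. The 135k size comes from this thread (taken here as 135,000
bytes); the client count of 100,000 is purely hypothetical, and the model
assumes every client downloads the full file each time an update is published.

```python
def daily_bytes(update_size: int, clients: int, downloads_per_day: int) -> int:
    """Total bytes served per day if every client fetches every update."""
    return update_size * clients * downloads_per_day


SIZE = 135 * 1000   # 135k, as quoted in the thread
CLIENTS = 100_000   # hypothetical number of installations

once_a_day = daily_bytes(SIZE, CLIENTS, 1)     # one published update per day
every_15_min = daily_bytes(SIZE, CLIENTS, 96)  # 96 published updates per day

print(f"daily updates:     {once_a_day / 1e9:.1f} GB/day")
print(f"15-minute updates: {every_15_min / 1e9:.1f} GB/day")
```

Under these assumptions the jump from daily to 15-minute publishing is a
factor of 96, which is why publish frequency matters far more than the size
of any single update.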

There are some pretty standard ways to deal with this issue though, such as:

a) have lots of mirrors, same idea as your P2P idea though less
dynamic  (oh, that was another thought I had ... stop short of using
torrents since they're resource heavy and instead make our own P2P
protocol doing a dynamic http/mirrored.by system)

b) split the channel into a frequent / not frequent channel (or stable
/ testing, or split based on content, or ...) for patterns which don't
change often, there's no reason to keep sending them out.  same idea I
mentioned before.

c) shrink or hold update size steady in face of updates.  hard.

d) make updates less frequently.  defeats the purpose?  clearly every
15m is different than every day is different than weekly ...


To be perfectly honest, I really don't worry about the "omg, update
bandwidth" issue right now.  I worry that there aren't enough updates
right now.  The only auto-generated one, sought, is daily, and the
manual ones now are more than weekly on average.  I don't know if
sought could even be produced faster, you need a certain amount of
incoming ham and spam to sample and produce test rules, and enough
diversity of mails to test against to avoid "obvious" bad rules...

Re: 419 emailBL?

Posted by John Hardin <jh...@impsec.org>.
On Wed, 29 Apr 2009, Adam Katz wrote:

> Theo Van Dinter wrote:
>> On Wed, Apr 29, 2009 at 6:24 PM, Adam Katz <an...@khopis.com> wrote:
>>> The mechanism for sa-update is brilliant, but
>>> doesn't lend itself to enormous indices of frequently-changing rulesets.
>>
>> I guess it depends what you mean by "enormous".  A sought rule update is 135k.
>
> And 135k doesn't add up to a lot of bandwidth?

...so don't look for updates more than once every day or two.

And if bandwidth at the server is a problem, would publishing the ruleset 
updates via the Coral Cache network work?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   A superior gunman is one who uses his superior judgment to keep
   himself out of situations that would require the use of his
   superior skills.
-----------------------------------------------------------------------
  9 days until the 64th anniversary of VE day

Re: 419 emailBL?

Posted by Adam Katz <an...@khopis.com>.
Theo Van Dinter wrote:
> On Wed, Apr 29, 2009 at 6:24 PM, Adam Katz <an...@khopis.com> wrote:
>> The mechanism for sa-update is brilliant, but
>> doesn't lend itself to enormous indices of frequently-changing rulesets.
> 
> I guess it depends what you mean by "enormous".  A sought rule update is 135k.

And 135k doesn't add up to a lot of bandwidth?  I suppose it depends
on the number of users, and I'm figuring worst-case scenario, e.g.
when/if it ships enabled in the default SA install.

> The likelihood is, imo, that you would probably split up your updates
> into multiple channels before they really got out of control in size.
> For example, you could do something like a weekly, daily, and
> sub-daily channel, and move rules appropriately between them.  Yes, a
> little more of a PITA for clients, but how much churn do you really
> expect?

How about hierarchical channel support, e.g. a channel's MIRRORED.BY
file is merely itself a sa-update-channels file.

>> Justin:  Perhaps sa-update could support [version].torrent in addition
>> to [version].tar.gz on each mirror?  (This doesn't touch the current
>> DNS-based version/announce system.)  Channels hosted for versions of
>> SA after the supporting release (e.g. 0.4.3.[channel] and "higher")
>> would be allowed to host only the torrent file.
> 
> I had actually thought about doing a P2P sa-update so as to better
> withstand DoS issues, skip the need for a mirrored.by file, etc.  But
> the main issue is that most channel updates are rather small, and so
> therefore the downloads are rather fast.  Compared to doing a torrent,
> which takes relatively a long time to get setup, and just as you
> start, you're done.  Also, it means clients are serving data, which
> makes the "quick sa-update and move on" more of a procedure and you
> have to worry about remote connectivity, etc, etc.
> 
> In the end it didn't seem worthwhile beyond the security aspect, so I
> didn't move beyond the "thinking about" stage.
> 
> (and yes, I know I'm not Justin. ;))

You're close enough on the SA development order.  For BT, I was
actually envisioning much larger rulesets with sought merely heralding
a future with lots of large auto-generated rulesets, but perhaps it
doesn't scale at the right point.  I think I'm trying to squeeze too
much :-p

-- 
Adam Katz
khopesh on irc://irc.freenode.net/#spamassassin
http://khopesh.com/Anti-spam

Re: [SA] 419 emailBL?

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Apr 29, 2009 at 6:24 PM, Adam Katz <an...@khopis.com> wrote:
> The mechanism for sa-update is brilliant, but
> doesn't lend itself to enormous indices of frequently-changing rulesets.

I guess it depends what you mean by "enormous".  A sought rule update is 135k.

The likelihood is, imo, that you would probably split up your updates
into multiple channels before they really got out of control in size.
For example, you could do something like a weekly, daily, and
sub-daily channel, and move rules appropriately between them.  Yes, a
little more of a PITA for clients, but how much churn do you really
expect?
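
One way to picture "move rules appropriately between them" is to pick a
channel from how often a rule changes. The thresholds and the function below
are assumptions for illustration only; sa-update has no such mechanism built
in.

```python
def pick_channel(mean_hours_between_changes: float) -> str:
    """Assign a rule to a channel based on how often it changes."""
    if mean_hours_between_changes < 24:
        return "sub-daily"   # changes more than once a day
    if mean_hours_between_changes < 24 * 7:
        return "daily"       # changes at least once a week
    return "weekly"          # stable rules
```

Stable rules then stop being re-sent with every fast-moving update, which
keeps the frequently fetched channel small.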

> Justin:  Perhaps sa-update could support [version].torrent in addition
> to [version].tar.gz on each mirror?  (This doesn't touch the current
> DNS-based version/announce system.)  Channels hosted for versions of
> SA after the supporting release (e.g. 0.4.3.[channel] and "higher")
> would be allowed to host only the torrent file.

I had actually thought about doing a P2P sa-update so as to better
withstand DoS issues, skip the need for a mirrored.by file, etc.  But
the main issue is that most channel updates are rather small, and so
therefore the downloads are rather fast.  A torrent, by comparison,
takes a relatively long time to set up, and just as you
start, you're done.  Also, it means clients are serving data, which
makes the "quick sa-update and move on" more of a procedure and you
have to worry about remote connectivity, etc, etc.

In the end it didn't seem worthwhile beyond the security aspect, so I
didn't move beyond the "thinking about" stage.


(and yes, I know I'm not Justin. ;))

Re: 419 emailBL?

Posted by Mike Cardwell <sp...@lists.grepular.com>.
mouss wrote:

>>> Is the best way to do this - not via DNS.
>> Depends what you're trying to achieve. I thought the objective was a
>> block list of email addresses that could be queried via the DNS by any
>> application... Your suggestion doesn't really capture the requirements.
> and what is the benefit of using DNS? why not rsync/svn/wget/... ?
> 
>> In this particular example, the list should be used for preventing your
>> users sending emails *to* those addresses. Many organisations rightly or
>> wrongly don't perform spam filtering on their outgoing relays so
>> spamassassin is a bit over the top when you can just use another dns
>> based bl.
>
> with rsync or the like, you can simply add the addresses (no MD5, no
> anything) to an access list that your MTA can use.

It sounds like you're asking me what the benefit of distributing a block 
list via the DNS is? If yes, type "dnsbl" into google. If not, please 
clarify ...

-- 
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)

Re: 419 emailBL?

Posted by mouss <mo...@ml.netoyen.net>.
Mike Cardwell wrote:
> Steve Freegard wrote:
> [snip]
>>
>> Is the best way to do this - not via DNS.
> 
> Depends what you're trying to achieve. I thought the objective was a
> block list of email addresses that could be queried via the DNS by any
> application... Your suggestion doesn't really capture the requirements.
> 

and what is the benefit of using DNS? why not rsync/svn/wget/... ?


> In this particular example, the list should be used for preventing your
> users sending emails *to* those addresses. Many organisations rightly or
> wrongly don't perform spam filtering on their outgoing relays so
> spamassassin is a bit over the top when you can just use another dns
> based bl.
> 

with rsync or the like, you can simply add the addresses (no MD5, no
anything) to an access list that your MTA can use.
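
As a rough sketch of that approach, a plain list of addresses could be turned
into a Postfix access(5) table like this. The REJECT text is an assumption,
and fetching the list (rsync or otherwise) is left out.

```python
def to_access_map(addresses,
                  action: str = "REJECT Recipient address is listed") -> str:
    """Render addresses as lines of a Postfix access(5) lookup table."""
    # De-duplicate, drop empty entries, and lowercase for consistent lookups.
    cleaned = sorted({a.strip().lower() for a in addresses if a.strip()})
    # Each line is "pattern<whitespace>action".
    return "".join(f"{addr}\t{action}\n" for addr in cleaned)
```

The output would then be compiled with postmap and referenced from a
check_recipient_access restriction; the file location is up to the local
setup.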

Re: [SA] 419 emailBL?

Posted by Mike Cardwell <sp...@lists.grepular.com>.
Adam Katz wrote:

>>>> For listing both emails and uri's it would be useful if you could add
>>>> regular expressions. [...]
> 
> Steve Freegard responded:
>>> Yuck; if you want to do stuff using regexp then:
>>>
>>> uri RULE_NAME /<regexp>/
>>> score RULE_NAME nn.nnn
>>>
>>> Is the best way to do this - not via DNS.
> 
> Mike Cardwell defended:
>> Depends what you're trying to achieve. I thought the objective was a
>> block list of email addresses that could be queried via the DNS by any
>> application... Your suggestion doesn't really capture the requirements.
>>
>> In this particular example, the list should be used for preventing your
>> users sending emails *to* those addresses. Many organisations rightly or
>> wrongly don't perform spam filtering on their outgoing relays so
>> spamassassin is a bit over the top when you can just use another dns
>> based bl.
> 
> If by "any application" you mean "any application that can handle
> full-blown perl regular expressions" ... your regex examples are
> nontrivial, so you're already pretty much catering to SA anyway.

You completely misunderstood what I was suggesting. On the server side I 
shove this in my list:

^foo-\d+@example\.com$

Then when the client looks up foo-5@example.com I return a positive 
result. The client needs no regex capability.
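
A minimal sketch of that server-side matching, using the pattern above. The
PatternList class is hypothetical, and it assumes the server receives the
plain address in the query: a pattern cannot be matched against an MD5 hash.

```python
import re


class PatternList:
    """Server-side list of regular expressions matched against lookups."""

    def __init__(self, patterns):
        # Compile once at load time; lookups then only run the matches.
        self._compiled = [re.compile(p) for p in patterns]

    def is_listed(self, address: str) -> bool:
        """Return True if any pattern matches the queried address."""
        return any(p.search(address) for p in self._compiled)


blocklist = PatternList([r"^foo-\d+@example\.com$"])
print(blocklist.is_listed("foo-5@example.com"))  # True
print(blocklist.is_listed("bar@example.com"))    # False
```

The client only ever sees a yes/no answer, so it needs nothing beyond the
ability to make an ordinary lookup.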

-- 
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)

Re: [SA] 419 emailBL?

Posted by Adam Katz <an...@khopis.com>.
Mike Cardwell wrote:
>>> For listing both emails and uri's it would be useful if you could add
>>> regular expressions. [...]

Steve Freegard responded:
>> Yuck; if you want to do stuff using regexp then:
>>
>> uri RULE_NAME /<regexp>/
>> score RULE_NAME nn.nnn
>>
>> Is the best way to do this - not via DNS.

Mike Cardwell defended:
> Depends what you're trying to achieve. I thought the objective was a
> block list of email addresses that could be queried via the DNS by any
> application... Your suggestion doesn't really capture the requirements.
> 
> In this particular example, the list should be used for preventing your
> users sending emails *to* those addresses. Many organisations rightly or
> wrongly don't perform spam filtering on their outgoing relays so
> spamassassin is a bit over the top when you can just use another dns
> based bl.

If by "any application" you mean "any application that can handle
full-blown perl regular expressions" ... your regex examples are
nontrivial, so you're already pretty much catering to SA anyway.

There's also the question of handling quotes and other forbidden
characters in the TXT field, plus its length limit.  Once that's all
solved, the question of feasibility and efficiency still looms.

Given the options of putting that kind of thing in (A) DNS or (B)
sa-channels, I'd lean towards (B) on the way to (C) something else:

I'm sure Justin Mason (for his sought channel) has thought long and
hard about this.  The mechanism for sa-update is brilliant, but
doesn't lend itself to enormous indices of frequently-changing
rulesets.  Even if it were revised to enable a diff/patch system (hint
hint), it would still fail to distribute the remaining load.

Justin:  Perhaps sa-update could support [version].torrent in addition
to [version].tar.gz on each mirror?  (This doesn't touch the current
DNS-based version/announce system.)  Channels hosted for versions of
SA after the supporting release (e.g. 0.4.3.[channel] and "higher")
would be allowed to host only the torrent file.

Either the self-healing nature of BT would implement the diffing
portion for free, or SA's BT client would merely choose which files in
the torrent to download (assuming there are perl-based clients that
support that... libtorrent does, but that's C-based), as it would
contain full.cf, [n-1].diff, [n-2].diff, [n-3].diff, and [last release
yesterday].diff (or the like).

... this is similar to my proposal for a distributed Blue Frog rehash,
http://khopesh.com/wiki/Ending_spam

-- 
Adam Katz
khopesh on irc://irc.freenode.net/#spamassassin
http://khopesh.com/Anti-spam

Re: 419 emailBL?

Posted by Mike Cardwell <sp...@lists.grepular.com>.
Steve Freegard wrote:

>> For listing both emails and uri's it would be useful if you could add
>> regular expressions. I'm not sure how you'd serve such an RBL though
>> without writing your own custom software or modifying an existing dns
>> server. Eg, it would be nice if you could add entries like this to the rbl:
>>
>> ^(?i)https?://[a-z]+\.example\.com/unsubscribe\.cgi\?id=\d+$
>>
>> And:
>>
>> ^(?i)customer-service-[A-Z]\d+@example\.(?:com|co\.uk)$
>>
> 
> Yuck; if you want to do stuff using regexp then:
> 
> uri RULE_NAME /<regexp>/
> score RULE_NAME nn.nnn
> 
> Is the best way to do this - not via DNS.

Depends what you're trying to achieve. I thought the objective was a 
block list of email addresses that could be queried via the DNS by any 
application... Your suggestion doesn't really capture the requirements.

In this particular example, the list should be used for preventing your 
users sending emails *to* those addresses. Many organisations rightly or 
wrongly don't perform spam filtering on their outgoing relays so 
spamassassin is a bit over the top when you can just use another dns 
based bl.

-- 
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)

Re: 419 emailBL?

Posted by Steve Freegard <st...@stevefreegard.com>.
Mike Cardwell wrote:
> Steve Freegard wrote:
> 
>>>> A word of caution.  Be very careful how you use the list.  The
>>>> intended usage for the list is to prevent (or monitor) local users
>>>> from sending email to the listed addresses.  The phishers frequently
>>>> use compromised end-user accounts to receive the phishing replies, so
>>>> there is a high risk of false positives, especially if you attempt to
>>>> classify messages containing one of these addresses as spam.
>>> Thread fork!
>>>
>>> Would it be useful to have a similar list for 419 fraud contact
>>> addresses?
>>>
>>> Discuss...
>>
>> That was always my intention - there are a couple of us looking at
>> several methods of automatically listing e-mail addresses present in the
>> body of spam or the Reply-To header to specifically target stuff that
>> often slips through with low scores.
>>
>> I'm also looking at listing URIs that are impossible to list in the
>> traditional URIBLs  e.g. groups.yahoo.com/groupname/message/1
> 
> For listing both email addresses and URIs, it would be useful if you
> could add regular expressions. I'm not sure how you'd serve such an
> RBL, though, without writing your own custom software or modifying an
> existing DNS server. E.g., it would be nice if you could add entries
> like this to the RBL:
> 
> ^(?i)https?://[a-z]+\.example\.com/unsubscribe\.cgi\?id=\d+$
> 
> And:
> 
> ^(?i)customer-service-[A-Z]\d+@example\.(?:com|co\.uk)$
> 

Yuck; if you want to do stuff using regexp then:

uri RULE_NAME /<regexp>/
score RULE_NAME nn.nnn

Is the best way to do this - not via DNS.
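
For anyone maintaining many such patterns, the rule pair above could be
generated rather than typed by hand. The Python sketch below is only an
illustration: the helper name sa_uri_rule, the rule name, and the score
are made up, and it assumes the slashes in the input pattern are not
already escaped.

```python
def sa_uri_rule(name, pattern, score, flags="i"):
    """Format a regex as a SpamAssassin `uri` rule plus its score line.

    Assumption: any "/" in `pattern` is unescaped, so each one is
    escaped here to keep the /.../ delimiters intact.
    """
    escaped = pattern.replace("/", "\\/")
    return f"uri {name} /{escaped}/{flags}\nscore {name} {score:.3f}"

# Hypothetical rule name and score, using one of the patterns from this thread.
rule = sa_uri_rule(
    "LOCAL_EXAMPLE_UNSUB",
    r"^https?://[a-z]+\.example\.com/unsubscribe\.cgi\?id=\d+$",
    2.5,
)
print(rule)
```

The output is plain text that can be dropped into a local .cf file and
checked with spamassassin --lint before use.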

Regards,
Steve.

Re: 419 emailBL?

Posted by Mike Cardwell <sp...@lists.grepular.com>.
Steve Freegard wrote:

>>> A word of caution.  Be very careful how you use the list.  The
>>> intended usage for the list is to prevent (or monitor) local users
>>> from sending email to the listed addresses.  The phishers frequently
>>> use compromised end-user accounts to receive the phishing replies, so
>>> there is a high risk of false positives, especially if you attempt to
>>> classify messages containing one of these addresses as spam.
>> Thread fork!
>>
>> Would it be useful to have a similar list for 419 fraud contact addresses?
>>
>> Discuss...
> 
> That was always my intention - there are a couple of us looking at
> several methods of automatically listing e-mail addresses present in the
> body of spam or the Reply-To header to specifically target stuff that
> often slips through with low scores.
> 
> I'm also looking at listing URIs that are impossible to list in the
> traditional URIBLs  e.g. groups.yahoo.com/groupname/message/1

For listing both email addresses and URIs, it would be useful if you
could add regular expressions. I'm not sure how you'd serve such an
RBL, though, without writing your own custom software or modifying an
existing DNS server. E.g., it would be nice if you could add entries
like this to the RBL:

^(?i)https?://[a-z]+\.example\.com/unsubscribe\.cgi\?id=\d+$

And:

^(?i)customer-service-[A-Z]\d+@example\.(?:com|co\.uk)$
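
To show what entries like these would match, here is a small Python
sketch that checks a value against both patterns. Python's re module
expects a global inline flag such as (?i) at the very start of a
pattern, so the sketch passes re.IGNORECASE instead of writing ^(?i).
The function name matches_listed_pattern is hypothetical.

```python
import re

# The two example entries from above, with case-insensitivity passed
# as a flag rather than as an inline (?i) after the ^ anchor.
URI_PATTERN = re.compile(
    r"^https?://[a-z]+\.example\.com/unsubscribe\.cgi\?id=\d+$",
    re.IGNORECASE,
)
ADDR_PATTERN = re.compile(
    r"^customer-service-[A-Z]\d+@example\.(?:com|co\.uk)$",
    re.IGNORECASE,
)

def matches_listed_pattern(value):
    """Return True if `value` matches either example entry."""
    return bool(URI_PATTERN.search(value) or ADDR_PATTERN.search(value))
```

A lookup service built this way would have to test every query against
every pattern, which is part of why a stock DNS server cannot serve it.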

-- 
Mike Cardwell
(https://secure.grepular.com/) (http://perlcv.com/)

Re: 419 emailBL?

Posted by Steve Freegard <st...@stevefreegard.com>.
John Hardin wrote:
> On Wed, 29 Apr 2009, Jesse Thompson wrote:
> 
>> A word of caution.  Be very careful how you use the list.  The
>> intended usage for the list is to prevent (or monitor) local users
>> from sending email to the listed addresses.  The phishers frequently
>> use compromised end-user accounts to receive the phishing replies, so
>> there is a high risk of false positives, especially if you attempt to
>> classify messages containing one of these addresses as spam.
> 
> Thread fork!
> 
> Would it be useful to have a similar list for 419 fraud contact addresses?
> 
> Discuss...
> 

That was always my intention - there are a couple of us looking at
several methods of automatically listing e-mail addresses present in the
body of spam or the Reply-To header to specifically target stuff that
often slips through with low scores.
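
One way the collection step might look, sketched in Python with only
the standard library. This is an illustration under stated assumptions,
not the method actually being developed: the function name
candidate_addresses and the deliberately loose address regex are made
up for the example, and only text/plain parts are scanned.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Deliberately loose pattern; good enough to pull candidates from body text.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def candidate_addresses(raw_message):
    """Collect lowercased addresses from Reply-To and text/plain body parts."""
    msg = message_from_string(raw_message)
    found = set()
    # Addresses in any Reply-To header.
    for _name, addr in getaddresses(msg.get_all("Reply-To", [])):
        if addr:
            found.add(addr.lower())
    # Addresses mentioned in plain-text body parts.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            text = payload.decode(charset, errors="replace")
            found.update(m.lower() for m in ADDRESS_RE.findall(text))
    return found
```

Given the caution quoted above about compromised end-user accounts,
anything collected this way would presumably need review before being
listed.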

I'm also looking at listing URIs that are impossible to list in the
traditional URIBLs  e.g. groups.yahoo.com/groupname/message/1

Cheers,
Steve.