Posted to users@spamassassin.apache.org by Marc Perkel <ma...@perkel.com> on 2008/09/22 23:49:52 UTC

Trying out a new concept

I don't know how this will work but I'm building the data now. For those 
of you who are familiar with Day old bread lists to detect new domains, 
as you know there's a lag time in the data and they often don't have 
data from all the registries. So - here's a different solution.

What I'm thinking is to accumulate every domain name that interacts with 
my system and storing it in a list. Eventually after a week or so I 
should have a good list. Then the idea is to do a lookup to see if a new 
domain is NOT on the list. This will catch all really new domains, but 
will have some false positives. But - if it is mixed with other 
conditionals it might be a good way to detect and block spam from or 
linking to tasting domains.

Thoughts?
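
A minimal sketch of the "seen list" idea described above, in Python. The class name, the one-week warm-up, and the in-memory dict are illustrative assumptions, not details from the post; a real deployment would need persistent storage and some way to extract domains from mail traffic.

```python
import time


class SeenDomains:
    """Tracks the first time each domain was observed by this system."""

    def __init__(self, warmup_seconds=7 * 24 * 3600):
        # Lookups are not trusted until the list has accumulated for
        # roughly a week (the warm-up length is an assumption).
        self.first_seen = {}
        self.started = None
        self.warmup_seconds = warmup_seconds

    def record(self, domain, now=None):
        """Add a domain to the list, keeping only its earliest sighting."""
        now = time.time() if now is None else now
        if self.started is None:
            self.started = now
        self.first_seen.setdefault(domain.lower().rstrip("."), now)

    def is_unfamiliar(self, domain, now=None):
        """True if the domain is NOT on the list; None while warming up."""
        now = time.time() if now is None else now
        if self.started is None or now - self.started < self.warmup_seconds:
            return None
        return domain.lower().rstrip(".") not in self.first_seen
```

Returning None during the warm-up keeps "no data yet" distinct from "not on the list", which matters because every domain looks new on day one.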


RE: Trying out a new concept

Posted by Jeff Moss <jm...@Huffmancorp.com>.
This will actually work.  I've been involved in a university experiment doing this for over a year now.  Simply put, trying to create a list of new spammer domains is a "count to infinity" problem.  Creating a list of old domains is not.
 
  Jeff Moss





Re: Trying out a new concept

Posted by Matt Kettler <mk...@verizon.net>.
Ken A wrote:
> Marc Perkel wrote:
>>
>>
>> Ken A wrote:
>>> Marc Perkel wrote:
>>>> I don't know how this will work but I'm building the data now. For
>>>> those of you who are familiar with Day old bread lists to detect
>>>> new domains, as you know there's a lag time in the data and they
>>>> often don't have data from all the registries. So - here's a
>>>> different solution.
>>>>
>>>> What I'm thinking is to accumulate every domain name that interacts
>>>> with my system and storing it in a list. Eventually after a week or
>>>> so I should have a good list. Then the idea is to do a lookup to
>>>> see if a new domain is NOT on the list. This will catch all really
>>>> new domains, but will have some false positives. But - if it is
>>>> mixed with other conditionals it might be a good way to detect and
>>>> block spam from or linking to tasting domains.
>>>>
>>>> Thoughts?
>>>>
>>>
>>> How will you keep your list from being easily polluted?
>>>
>>> Ken
>>
>> I'm not sure what you mean. The idea is to detect what's NOT on the
>> list. And also to track new entries for a week or so. I'm just in the
>> data accumulation stage. I only have one day of data. But the idea is
>> to detect new domains.
>>
>
> Never mind. You've since explained that you only plan to add new
> domains to your list if the domains are urls in known spam that you
> detect using other methods. Please don't call it DOB, since it's
> 'unseen' domains you are talking about.
>
> In your initial email, the only condition to be on the list was
> 'interacting with your system', which was very vague.
>

I'd agree, it's not DOB. But I don't think Marc intended you to believe
it was exactly DOB. He just wanted you to start there so he could
explain his concept better. (This is a common tactic he uses, one which
often backfires on him as many people don't read his entire email). If
you didn't read his post closely, well, that happens, but don't accuse
him of calling it DOB. He was clearly doing a compare/contrast between
the two, not equating them.

In general, it seems more like a large-scale version of the "seen" database
generated by most greylist systems. It may have some DOB-like behaviors,
but it's not going to exactly be like a DOB system. That said, in some
ways, non-listing in this system could be used for some of the
applications that DOB is used for.

Personally, I might use a list like this to enforce longer greylist
durations in my milter-greylist config, and add smallish scores to
messages (~0.5) in SA and see how it proves out long-term.
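
A rough Python sketch of what "adding a smallish score" means in an additive scoring filter. The rule names, the weights other than the ~0.5 mentioned above, and the cutoff are all hypothetical; this is not SpamAssassin configuration, only an illustration of why a weak signal cannot tag a message on its own.

```python
# Hypothetical rule weights; only the ~0.5 for the unseen-domain test
# comes from the discussion above.
WEIGHTS = {"UNSEEN_DOMAIN": 0.5, "NO_QUIT": 2.0, "URIBL_HIT": 3.0}
THRESHOLD = 5.0  # assumed cutoff at which a message is tagged as spam


def total_score(hits, weights=WEIGHTS):
    """Sum the weights of the rules that fired for one message."""
    return sum(weights[name] for name in hits)


def is_spam(hits, weights=WEIGHTS, threshold=THRESHOLD):
    """A message is tagged only when the combined score reaches the cutoff."""
    return total_score(hits, weights) >= threshold
```

With these numbers, a message that only trips the unseen-domain test scores 0.5 and passes, while the same test combined with the other two pushes the total over the cutoff.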





Re: Trying out a new concept

Posted by Ken A <ka...@pacific.net>.
Marc Perkel wrote:
> 
> 
> Ken A wrote:
>> Marc Perkel wrote:
>>> I don't know how this will work but I'm building the data now. For 
>>> those of you who are familiar with Day old bread lists to detect new 
>>> domains, as you know there's a lag time in the data and they often 
>>> don't have data from all the registries. So - here's a different 
>>> solution.
>>>
>>> What I'm thinking is to accumulate every domain name that interacts 
>>> with my system and storing it in a list. Eventually after a week or 
>>> so I should have a good list. Then the idea is to do a lookup to see 
>>> if a new domain is NOT on the list. This will catch all really new 
>>> domains, but will have some false positives. But - if it is mixed 
>>> with other conditionals it might be a good way to detect and block 
>>> spam from or linking to tasting domains.
>>>
>>> Thoughts?
>>>
>>
>> How will you keep your list from being easily polluted?
>>
>> Ken
> 
> I'm not sure what you mean. The idea is to detect what's NOT on the
> list. And also to track new entries for a week or so. I'm just in the 
> data accumulation stage. I only have one day of data. But the idea is to 
> detect new domains.
> 

Never mind. You've since explained that you only plan to add new domains
to your list if the domains are urls in known spam that you detect using 
other methods. Please don't call it DOB, since it's 'unseen' domains you 
are talking about.

In your initial email, the only condition to be on the list was 
'interacting with your system', which was very vague.

Good luck,
Ken
-- 
Ken Anderson
Pacific.Net


Re: Trying out a new concept

Posted by Matthias Leisi <ma...@leisi.net>.
Karl Pearson schrieb:

> So, what about doing a whois query and 'grep' for the setup date? You

Good luck with parsing the myriad of output formats from the different
whois services. And good luck going after those that do not publish a
setup date (like eg the .de ccTLD).
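
To illustrate the parsing problem, here is a small Python sketch that looks for a registration date in whois text. The three field labels and date formats are only examples of the variation involved and are assumptions for illustration; real servers use many more, and some publish no date at all, in which case the function returns None.

```python
import re
from datetime import datetime

# Illustrative (pattern, date format) pairs; not an exhaustive list.
PATTERNS = [
    (re.compile(r"^\s*Creation Date:\s*(\d{4}-\d{2}-\d{2})", re.I | re.M), "%Y-%m-%d"),
    (re.compile(r"^\s*created:\s*(\d{4}-\d{2}-\d{2})", re.I | re.M), "%Y-%m-%d"),
    (re.compile(r"^\s*Registered on:\s*(\d{2}-[A-Za-z]{3}-\d{4})", re.I | re.M), "%d-%b-%Y"),
]


def creation_date(whois_text):
    """Return the registration date found in whois_text, or None."""
    for pattern, fmt in PATTERNS:
        match = pattern.search(whois_text)
        if match:
            return datetime.strptime(match.group(1), fmt).date()
    return None
```

Every new output format needs another entry in the table, which is the maintenance burden described above.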

-- Matthias


Re: Trying out a new concept

Posted by Duane Hill <d....@yournetplus.com>.
On Mon, 22 Sep 2008, Karl Pearson wrote:

> On Mon, 22 Sep 2008, Marc Perkel wrote:
>
>> 
>> 
>> McDonald, Dan wrote:
>>> On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
>>> 
>>>> Ken A wrote:
>>>> 
>>>>> Marc Perkel wrote:
>>>>> 
>>>>>> I don't know how this will work but I'm building the data now. For 
>>>>>> those of you who are familiar with Day old bread lists to detect new 
>>>>>> domains, as you know there's a lag time in the data and they often 
>>>>>> don't have data from all the registries. So - here's a different 
>>>>>> solution.
>>>>>> 
>>>>>> What I'm thinking is to accumulate every domain name that interacts 
>>>>>> with my system and storing it in a list. Eventually after a week or so 
>>>>>> I should have a good list. Then the idea is to do a lookup to see if a 
>>>>>> new domain is NOT on the list. This will catch all really new domains, 
>>>>>> but will have some false positives. But - if it is mixed with other 
>>>>>> conditionals it might be a good way to detect and block spam from or 
>>>>>> linking to tasting domains.
>>> 
>>> So, If for years I send mail to hundreds of people in my county, but
>>> never anything to your spamtraps or your legitimate mail, and then one
>>> day I decide to send you a single piece of mail, you will blacklist me
>>> as DOB?
>> 
>> No - that's not how it works. Being a stranger to the list doesn't get you 
>> blacklisted. It's just a factor that when combined with other factors 
>> indicates it's spam. And generally URI spam. I'm just using this as a way 
>> to discover new domains by what's not on a list as opposed to what is on a 
>> list.
>> 
>> And I don't yet know if it will work. I'm still building the list. I just 
>> wanted to throw the concept out there and see if it sparks innovation. It 
>> might turn out to be a dead end.
>
> So, what about doing a whois query and 'grep' for the setup date? You 
> theoretically could then just append that date to the domain name, and have 
> something to cross-reference...

Most whois servers have restrictions on high-volume queries via 
automation. I've been blocked for doing whois queries via a Perl script 
for domains on our server just to verify if a domain has moved away 
without notifying us. Although it is for a relatively short period of time,
it is a nuisance.
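
One way to reduce the chance of being blocked is to space automated queries out. The Python sketch below enforces a minimum delay between calls; the class name and the interval are assumptions, and the limits that whois servers actually apply are not stated in this thread.

```python
import time


class Throttle:
    """Enforces a minimum delay between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the class can be tested
        self.sleep = sleep
        self.last = None

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

A script would call wait() on one shared Throttle before each whois query, so lookups for a long list of domains never go out faster than the chosen interval.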

-d

Re: Trying out a new concept

Posted by Karl Pearson <ka...@ourldsfamily.com>.
On Mon, 22 Sep 2008, Marc Perkel wrote:

>
>
> McDonald, Dan wrote:
>> On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
>> 
>>> Ken A wrote:
>>> 
>>>> Marc Perkel wrote:
>>>> 
>>>>> I don't know how this will work but I'm building the data now. For those 
>>>>> of you who are familiar with Day old bread lists to detect new domains, 
>>>>> as you know there's a lag time in the data and they often don't have 
>>>>> data from all the registries. So - here's a different solution.
>>>>> 
>>>>> What I'm thinking is to accumulate every domain name that interacts with 
>>>>> my system and storing it in a list. Eventually after a week or so I 
>>>>> should have a good list. Then the idea is to do a lookup to see if a new 
>>>>> domain is NOT on the list. This will catch all really new domains, but 
>>>>> will have some false positives. But - if it is mixed with other 
>>>>> conditionals it might be a good way to detect and block spam from or 
>>>>> linking to tasting domains.
>>>>>
>>>>> 
>> 
>> So, If for years I send mail to hundreds of people in my county, but
>> never anything to your spamtraps or your legitimate mail, and then one
>> day I decide to send you a single piece of mail, you will blacklist me
>> as DOB?
>>
>> 
>
> No - that's not how it works. Being a stranger to the list doesn't get you 
> blacklisted. It's just a factor that when combined with other factors 
> indicates it's spam. And generally URI spam. I'm just using this as a way to 
> discover new domains by what's not on a list as opposed to what is on a list.
>
> And I don't yet know if it will work. I'm still building the list. I just 
> wanted to throw the concept out there and see if it sparks innovation. It 
> might turn out to be a dead end.
>
>

So, what about doing a whois query and 'grep' for the setup date? You 
theoretically could then just append that date to the domain name, and 
have something to cross-reference...

---
      _/  _/      _/      _/_/_/       ____________   __o
     _/ _/       _/      _/    _/     ____________  _-\\<._
    _/_/        _/      _/_/_/                     (_)/ (_)
   _/ _/       _/      _/           ......................
  _/   _/ arl _/_/_/  _/ earson    KarlP@ourldsfamily.com
---
http://consulting.ourldsfamily.com
---


Re: Trying out a new concept

Posted by Matthias Leisi <ma...@leisi.net>.

Marc Perkel schrieb:

> And I don't yet know if it will work. I'm still building the list. I
> just wanted to throw the concept out there and see if it sparks
> innovation. It might turn out to be a dead end.

I don't think this is really innovative (my own recollection goes
back to an experimental DNS server I wrote for that in late 2005 [1],
and I certainly wasn't the first).

It turned out to be not that useful for me, so I dropped the project
(not only for me, also according to data referenced in [2], which of
course may be obsolete by now).

Maybe patterns have changed in the meantime, and it certainly makes
sense to test this again.

-- Matthias

[1]
http://matthias.leisi.net/archives/129-New-version-of-Domain-Age-DNS-Server.html
[2]
http://matthias.leisi.net/archives/128-Time-To-Live-for-spamvertized-domains.html

Re: Trying out a new concept

Posted by Marc Perkel <ma...@perkel.com>.

McDonald, Dan wrote:
> On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
>   
>> Ken A wrote:
>>     
>>> Marc Perkel wrote:
>>>       
>>>> I don't know how this will work but I'm building the data now. For 
>>>> those of you who are familiar with Day old bread lists to detect new 
>>>> domains, as you know there's a lag time in the data and they often 
>>>> don't have data from all the registries. So - here's a different 
>>>> solution.
>>>>
>>>> What I'm thinking is to accumulate every domain name that interacts 
>>>> with my system and storing it in a list. Eventually after a week or 
>>>> so I should have a good list. Then the idea is to do a lookup to see 
>>>> if a new domain is NOT on the list. This will catch all really new 
>>>> domains, but will have some false positives. But - if it is mixed 
>>>> with other conditionals it might be a good way to detect and block 
>>>> spam from or linking to tasting domains.
>>>>
>>>>         
>
> So, If for years I send mail to hundreds of people in my county, but
> never anything to your spamtraps or your legitimate mail, and then one
> day I decide to send you a single piece of mail, you will blacklist me
> as DOB?
>
>   

No - that's not how it works. Being a stranger to the list doesn't get 
you blacklisted. It's just a factor that when combined with other 
factors indicates it's spam. And generally URI spam. I'm just using this 
as a way to discover new domains by what's not on a list as opposed to 
what is on a list.

And I don't yet know if it will work. I'm still building the list. I 
just wanted to throw the concept out there and see if it sparks 
innovation. It might turn out to be a dead end.


Re: Trying out a new concept

Posted by "McDonald, Dan" <Da...@austinenergy.com>.
On Mon, 2008-09-22 at 18:17 -0500, Curtis LaMasters wrote:
> Daniel,  I think you're missing the point, or I'm completely lost but I
> believe the point of the list is to tag domains with a registration
> date of a week or less when sending mail to you (prevent spam from
> newly registered domains).  I may be off but that's the way I
> understand DOB.

Right, but Mr. Perkel wants to recreate the data ex nihilo.

-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com


Re: Trying out a new concept

Posted by Rob McEwen <ro...@invaluement.com>.
Blaine Fleming wrote:
> John Hardin wrote:
>> Why is it so flippin' difficult to get a feed of newly-registered 
>> domain names?
> Because the TLDs hate giving people access to the data and certainly 
> won't provide a feed without a bunch of cash involved.  Even worse, 
> all the ccTLDs pretty much refuse to even talk to you about access to 
> the zones.  This is why I started processing all the TLDs I was able 
> to obtain access to.  There is lag but the most it could be is about 
> 24 hours and that assumes they register a new domain immediately after 
> the TLD dumps the zone.
>
> Honestly, on my system I have less than 0.01% hits against a list of 
> domains registered in the last five days so I've always considered the 
> list a failure.  However, several others are reporting excellent hit 
> rates on it.  I think it is because the test is so far after 
> everything else though

To some extent, I like the concept. But I think the results are going to 
be somewhat limited because the sneakiest of spammers often allow their 
domains to "age" a bit for the very reason that "age of domain" is a 
common metric in the evaluation of domain reputation. Snowshoe spammers 
in particular have caught onto this fact in recent years/months. 
Therefore, the tendency will be for DOB lists to catch spam that was 
already well-caught, such as botnet-sent spams. (matching up with what 
Blaine said). Also, Marc is wise to consider combining this with other 
metrics because it is not that uncommon for some large and legit 
organization to blast out an e-mail to their members discussing some new 
web site which uses a domain name just bought a few days ago.

But, as someone else said, such a list might be effective for scoring 1 
point, or something like that. I'd be interested in putting such a list 
to use in my own spam filtering in such a manner.

-- 
Rob McEwen
http://dnsbl.invaluement.com/
rob@invaluement.com
+1 (478) 475-9032



Re: Trying out a new concept

Posted by Blaine Fleming <gr...@digital-z.com>.
SM wrote:
>
> Even if your traffic patterns are different, the hit rates shouldn't 
> be that low.  There would be a difference if your MTA uses a DNSBL to 
> reject or if you apply other pre-content filtering techniques.

It's not a matter of different traffic patterns as much as a matter of 
when I do the tests.  Incoming mail that is accepted is subjected to 
many tests before it is even checked against the new domains list.  If I 
put it closer to the front of the tests it would probably hit higher but 
I've never had much need to do so.

--Blaine

Re: Trying out a new concept

Posted by SM <sm...@resistor.net>.
Hi Blaine,
At 17:00 22-09-2008, Blaine Fleming wrote:
>Honestly, on my system I have less than 0.01% hits against a list of 
>domains registered in the last five days so I've always considered 
>the list a failure.  However, several others are reporting excellent 
>hit rates on it.  I think it is because the test is so far after 
>everything else though.

Even if your traffic patterns are different, the hit rates shouldn't 
be that low.  There would be a difference if your MTA uses a DNSBL to 
reject or if you apply other pre-content filtering techniques.

Regards,
-sm 


Re: Trying out a new concept

Posted by John Hardin <jh...@impsec.org>.
On Mon, 2008-09-22 at 17:13 -0700, Marc Perkel wrote:

> Where I'm getting hits is on spam bots that link to these new domains. 
> Spambots are easy to detect because they never use the QUIT command to 
> close the connection. So if a spambot message links to an "unfamiliar"
> domain (a domain NOT on my list) then that domain goes into my URIBL 
> list which I'm going to ship off to the folks at SURBL, which will 
> trickle down to you all here.
> 
> That is the plan - if it works. And it will get the offenders listed 
> quickly.

Best of luck with that. It will be interesting to see how it turns out.

-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Obama? McCain? I'm so sick of our elections always being
  "choose the lesser of two evils."
-----------------------------------------------------------------------
 43 days until the Presidential Election


Re: Trying out a new concept

Posted by Ken A <ka...@pacific.net>.
Marc Perkel wrote:
> 
> 
> Blaine Fleming wrote:
>> John Hardin wrote:
>>> Why is it so flippin' difficult to get a feed of newly-registered 
>>> domain names?
>>
>> Because the TLDs hate giving people access to the data and certainly 
>> won't provide a feed without a bunch of cash involved.  Even worse, 
>> all the ccTLDs pretty much refuse to even talk to you about access to 
>> the zones.  This is why I started processing all the TLDs I was able 
>> to obtain access to.  There is lag but the most it could be is about 
>> 24 hours and that assumes they register a new domain immediately after 
>> the TLD dumps the zone.
>>
>> Honestly, on my system I have less than 0.01% hits against a list of 
>> domains registered in the last five days so I've always considered the 
>> list a failure.  However, several others are reporting excellent hit 
>> rates on it.  I think it is because the test is so far after 
>> everything else though.
>>
>> --Blaine
>>
> 
> Thanks Blaine,
> 
> John, the problem is that even if you have access to the data you have 
> to compare gigabytes to the previous day so there's a big delay in even
> producing the lists. So my experiment is not to figure out how to get 
> them listed, but detect them from not being listed. I'm also NOT testing 
> this with SA. I'm using Exim rules and combining it with other sins to 
> produce an RBL list that those of you using SA can use.
> 
> Where I'm getting hits is on spam bots that link to these new domains. 
> Spambots are easy to detect because they never use the QUIT command to 
> close the connection. So if a spambot message links to an "unfamiliar"
> domain (a domain NOT on my list) then that domain goes into my URIBL 
> list which I'm going to ship off to the folks at SURBL, which will 
> trickle down to you all here.

Is this data coming from connections to your free tempfail MX service?
Ken



> 
> That is the plan - if it works. And it will get the offenders listed 
> quickly.
> 
> 


-- 
Ken Anderson
Pacific.Net


Re: Trying out a new concept

Posted by Marc Perkel <ma...@perkel.com>.

Blaine Fleming wrote:
> John Hardin wrote:
>> Why is it so flippin' difficult to get a feed of newly-registered 
>> domain names?
>
> Because the TLDs hate giving people access to the data and certainly 
> won't provide a feed without a bunch of cash involved.  Even worse, 
> all the ccTLDs pretty much refuse to even talk to you about access to 
> the zones.  This is why I started processing all the TLDs I was able 
> to obtain access to.  There is lag but the most it could be is about 
> 24 hours and that assumes they register a new domain immediately after 
> the TLD dumps the zone.
>
> Honestly, on my system I have less than 0.01% hits against a list of 
> domains registered in the last five days so I've always considered the 
> list a failure.  However, several others are reporting excellent hit 
> rates on it.  I think it is because the test is so far after 
> everything else though.
>
> --Blaine
>

Thanks Blaine,

John, the problem is that even if you have access to the data you have 
to compare gigabytes to the previous day so there's a big delay in even
producing the lists. So my experiment is not to figure out how to get 
them listed, but detect them from not being listed. I'm also NOT testing 
this with SA. I'm using Exim rules and combining it with other sins to 
produce an RBL list that those of you using SA can use.

Where I'm getting hits is on spam bots that link to these new domains. 
Spambots are easy to detect because they never use the QUIT command to 
close the connection. So if a spambot message links to an "unfamiliar"
domain (a domain NOT on my list) then that domain goes into my URIBL 
list which I'm going to ship off to the folks at SURBL, which will 
trickle down to you all here.

That is the plan - if it works. And it will get the offenders listed 
quickly.
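
The rule described above can be sketched in Python as follows. The function and parameter names are hypothetical, and the Exim side (detecting a missing QUIT, extracting linked domains) is assumed to have happened elsewhere; this only shows how the two conditions combine.

```python
def uribl_candidates(linked_domains, sent_quit, seen_domains):
    """Return the domains to list for one message.

    linked_domains: domains found in URLs in the message body
    sent_quit: whether the client closed the SMTP session with QUIT
    seen_domains: set of lowercase domains already on the "seen" list
    """
    if sent_quit:
        # A client that closes the session properly is not treated as a
        # spambot here, so nothing it links to is listed.
        return set()
    # Only domains NOT on the seen list become listing candidates.
    return {d.lower() for d in linked_domains} - seen_domains
```

Neither condition lists a domain by itself: an unfamiliar domain from a well-behaved client is ignored, and so is a familiar domain linked by a spambot.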



RE: Trying out a new concept

Posted by "McDonald, Dan" <Da...@austinenergy.com>.
Sorry for the top-post, I'm using a brain-damaged web-mailer...

Actually, I think it is the uribl_gold list that is the real day-old-bread list.  You have to subscribe to a datafeed service to get the gold list.


-----Original Message-----
From: John Hardin [mailto:jhardin@impsec.org]
Sent: Mon 22-Sep-08 20:45
To: Blaine Fleming
Cc: users@spamassassin.apache.org
Subject: Re: Trying out a new concept
 
On Mon, 2008-09-22 at 18:26 -0600, Blaine Fleming wrote:
> John Hardin wrote:
> >
> >> This is why I started processing all the TLDs I was able to obtain 
> >> access to.  There is lag but the most it could be is about 24 hours 
> >> and that assumes they register a new domain immediately after the TLD 
> >> dumps the zone.
> >
> > Does your data allow mapping domain name to registrar? If so, you 
> > might want to try implementing a URIBL for the Evil Registrars as has 
> > been discussed from time to time on the list...
> >
> 
> I've thought about doing that but it seems redundant since URIBL already 
> does.  At least they seem to have it published on their site so I'm 
> pretty sure it's included in their zones too.

...now that you mention it:

red.uribl.com - This list contains domains that actively show up in mail
flow, are not listed on URIBL black, and are either very young (domain
age via whois), or use whois privacy features to protect their identity.

-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Obama? McCain? I'm so sick of our elections always being
  "choose the lesser of two evils."
-----------------------------------------------------------------------
 43 days until the Presidential Election



Re: Trying out a new concept

Posted by John Hardin <jh...@impsec.org>.
On Mon, 2008-09-22 at 18:26 -0600, Blaine Fleming wrote:
> John Hardin wrote:
> >
> >> This is why I started processing all the TLDs I was able to obtain 
> >> access to.  There is lag but the most it could be is about 24 hours 
> >> and that assumes they register a new domain immediately after the TLD 
> >> dumps the zone.
> >
> > Does your data allow mapping domain name to registrar? If so, you 
> > might want to try implementing a URIBL for the Evil Registrars as has 
> > been discussed from time to time on the list...
> >
> 
> I've thought about doing that but it seems redundant since URIBL already 
> does.  At least they seem to have it published on their site so I'm 
> pretty sure it's included in their zones too.

...now that you mention it:

red.uribl.com - This list contains domains that actively show up in mail
flow, are not listed on URIBL black, and are either very young (domain
age via whois), or use whois privacy features to protect their identity.

-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Obama? McCain? I'm so sick of our elections always being
  "choose the lesser of two evils."
-----------------------------------------------------------------------
 43 days until the Presidential Election


Re: Trying out a new concept

Posted by Blaine Fleming <gr...@digital-z.com>.
John Hardin wrote:
>
>> This is why I started processing all the TLDs I was able to obtain 
>> access to.  There is lag but the most it could be is about 24 hours 
>> and that assumes they register a new domain immediately after the TLD 
>> dumps the zone.
>
> Does your data allow mapping domain name to registrar? If so, you 
> might want to try implementing a URIBL for the Evil Registrars as has 
> been discussed from time to time on the list...
>

I've thought about doing that but it seems redundant since URIBL already 
does.  At least they seem to have it published on their site so I'm 
pretty sure it's included in their zones too.

--Blaine


Re: Trying out a new concept

Posted by John Hardin <jh...@impsec.org>.
On Mon, 22 Sep 2008, Blaine Fleming wrote:

> John Hardin wrote:
>>  Why is it so flippin' difficult to get a feed of newly-registered domain
>>  names?
>
> Because the TLDs hate giving people access to the data and certainly 
> won't provide a feed without a bunch of cash involved.  Even worse, all 
> the ccTLDs pretty much refuse to even talk to you about access to the 
> zones.

Note to self: remember, the answer to any question beginning "why" is 
always "money". :)

> This is why I started processing all the TLDs I was able to obtain 
> access to.  There is lag but the most it could be is about 24 hours and 
> that assumes they register a new domain immediately after the TLD dumps 
> the zone.

Does your data allow mapping domain name to registrar? If so, you might 
want to try implementing a URIBL for the Evil Registrars as has been 
discussed from time to time on the list...

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Those in the media have donated to Obama at a 100:1 ratio compared
  to McCain. Are we to believe that this bias does not in any way
  taint their coverage of the campaign?
-----------------------------------------------------------------------
  43 days until the Presidential Election

Re: Trying out a new concept

Posted by Blaine Fleming <gr...@digital-z.com>.
John Hardin wrote:
> Why is it so flippin' difficult to get a feed of newly-registered 
> domain names?

Because the TLDs hate giving people access to the data and certainly 
won't provide a feed without a bunch of cash involved.  Even worse, all 
the ccTLDs pretty much refuse to even talk to you about access to the 
zones.  This is why I started processing all the TLDs I was able to 
obtain access to.  There is lag but the most it could be is about 24 
hours and that assumes they register a new domain immediately after the 
TLD dumps the zone.
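
As an illustration of the zone-comparison step, the Python sketch below yields the names that appear in today's listing but not in yesterday's. It assumes each zone dump has already been reduced to a sorted, duplicate-free sequence of domain names (that preprocessing is not shown and its details are an assumption); because it streams both inputs, neither listing has to fit in memory.

```python
def newly_added(yesterday_sorted, today_sorted):
    """Yield names present in today's sorted listing but not yesterday's."""
    old_iter = iter(yesterday_sorted)
    old = next(old_iter, None)
    for name in today_sorted:
        # Advance yesterday's listing until it catches up with `name`.
        while old is not None and old < name:
            old = next(old_iter, None)
        if old != name:
            yield name
```

Both inputs must be sorted with the same ordering. Names that disappeared since yesterday are skipped silently, since only new registrations are of interest here.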

Honestly, on my system I have less than 0.01% hits against a list of 
domains registered in the last five days so I've always considered the 
list a failure.  However, several others are reporting excellent hit 
rates on it.  I think it is because the test is so far after everything 
else though.

--Blaine




Re: Trying out a new concept

Posted by John Hardin <jh...@impsec.org>.
On Mon, 22 Sep 2008, Curtis LaMasters wrote:

> Daniel, I think you're missing the point, or I'm completely lost but I
> believe the point of the list is to tag domains with a registration date 
> of a week or less when sending mail to you (prevent spam from newly 
> registered domains).

Marc didn't say anything about registration dates. It sounds like he's 
trying to avoid depending on registrar data, which to me makes his 
solution extremely non-portable.

Why is it so flippin' difficult to get a feed of newly-registered domain 
names?

> On Mon, Sep 22, 2008 at 5:52 PM, McDonald, Dan <
> Dan.McDonald@austinenergy.com> wrote:
>
>> On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
>>>
>>> Ken A wrote:
>>>> Marc Perkel wrote:
>>>>> I don't know how this will work but I'm building the data now. For
>>>>> those of you who are familiar with Day old bread lists to detect new
>>>>> domains, as you know there's a lag time in the data and they often
>>>>> don't have data from all the registries. So - here's a different
>>>>> solution.
>>>>>
>>>>> What I'm thinking is to accumulate every domain name that interacts
>>>>> with my system and storing it in a list. Eventually after a week or
>>>>> so I should have a good list. Then the idea is to do a lookup to see
>>>>> if a new domain is NOT on the list. This will catch all really new
>>>>> domains, but will have some false positives. But - if it is mixed
>>>>> with other conditionals it might be a good way to detect and block
>>>>> spam from or linking to tasting domains.
>>
>> So, If for years I send mail to hundreds of people in my county, but
>> never anything to your spamtraps or your legitimate mail, and then one
>> day I decide to send you a single piece of mail, you will blacklist me
>> as DOB?

I wouldn't say "blacklist", I'd say "add a point to the SA score".

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79

Re: Trying out a new concept

Posted by Curtis LaMasters <cu...@gmail.com>.
Daniel, I think you're missing the point, or I'm completely lost, but I
believe the point of the list is to tag domains with a registration date of
a week or less when sending mail to you (prevent spam from newly registered
domains).  I may be off but that's the way I understand DOB.

Curtis LaMasters
http://www.curtis-lamasters.com
http://www.builtnetworks.com


On Mon, Sep 22, 2008 at 5:52 PM, McDonald, Dan <
Dan.McDonald@austinenergy.com> wrote:

> On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
> >
> > Ken A wrote:
> > > Marc Perkel wrote:
> > >> I don't know how this will work but I'm building the data now. For
> > >> those of you who are familiar with Day old bread lists to detect new
> > >> domains, as you know there's a lag time in the data and they often
> > >> don't have data from all the registries. So - here's a different
> > >> solution.
> > >>
> > >> What I'm thinking is to accumulate every domain name that interacts
> > >> with my system and storing it in a list. Eventually after a week or
> > >> so I should have a good list. Then the idea is to do a lookup to see
> > >> if a new domain is NOT on the list. This will catch all really new
> > >> domains, but will have some false positives. But - if it is mixed
> > >> with other conditionals it might be a good way to detect and block
> > >> spam from or linking to tasting domains.
> > >>
>
> So, if for years I send mail to hundreds of people in my county, but
> never anything to your spamtraps or your legitimate mail, and then one
> day I decide to send you a single piece of mail, you will blacklist me
> as DOB?
>
>
> --
> Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
> Austin Energy
> http://www.austinenergy.com
>
>

Re: Trying out a new concept

Posted by "McDonald, Dan" <Da...@austinenergy.com>.
On Mon, 2008-09-22 at 15:44 -0700, Marc Perkel wrote:
> 
> Ken A wrote:
> > Marc Perkel wrote:
> >> I don't know how this will work but I'm building the data now. For 
> >> those of you who are familiar with Day old bread lists to detect new 
> >> domains, as you know there's a lag time in the data and they often 
> >> don't have data from all the registries. So - here's a different 
> >> solution.
> >>
> >> What I'm thinking is to accumulate every domain name that interacts 
> >> with my system and storing it in a list. Eventually after a week or 
> >> so I should have a good list. Then the idea is to do a lookup to see 
> >> if a new domain is NOT on the list. This will catch all really new 
> >> domains, but will have some false positives. But - if it is mixed 
> >> with other conditionals it might be a good way to detect and block 
> >> spam from or linking to tasting domains.
> >>

So, if for years I send mail to hundreds of people in my county, but
never anything to your spamtraps or your legitimate mail, and then one
day I decide to send you a single piece of mail, you will blacklist me
as DOB?


-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com


Re: Trying out a new concept

Posted by Marc Perkel <ma...@perkel.com>.

Ken A wrote:
> Marc Perkel wrote:
>> I don't know how this will work but I'm building the data now. For 
>> those of you who are familiar with Day old bread lists to detect new 
>> domains, as you know there's a lag time in the data and they often 
>> don't have data from all the registries. So - here's a different 
>> solution.
>>
>> What I'm thinking is to accumulate every domain name that interacts 
>> with my system and storing it in a list. Eventually after a week or 
>> so I should have a good list. Then the idea is to do a lookup to see 
>> if a new domain is NOT on the list. This will catch all really new 
>> domains, but will have some false positives. But - if it is mixed 
>> with other conditionals it might be a good way to detect and block 
>> spam from or linking to tasting domains.
>>
>> Thoughts?
>>
>
> How will you keep your list from being easily polluted?
>
> Ken

I'm not sure what you mean. The idea is to detect what's NOT on the 
list. And also to track new entries for a week or so. I'm just in the 
data accumulation stage. I only have one day of data. But the idea is to 
detect new domains.
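
A minimal sketch of the lookup described above, assuming an in-memory
table of first-seen timestamps. The class name SeenDomains, the method
observe and the exact seven-day window are hypothetical choices for
illustration, not anything from a real system:

```python
import time

# "A week or so"; the exact window is an assumption.
WEEK_SECONDS = 7 * 24 * 60 * 60


class SeenDomains:
    """Hypothetical list of domains with the time each was first seen."""

    def __init__(self, new_window=WEEK_SECONDS):
        self.first_seen = {}  # normalized domain -> first-seen timestamp
        self.new_window = new_window

    def observe(self, domain, now=None):
        """Record a domain and report whether it still counts as new.

        A domain is new if it was NOT on the list before this call, or
        if its first sighting is more recent than the tracking window.
        """
        now = time.time() if now is None else now
        key = domain.strip().rstrip(".").lower()
        # setdefault stores `now` only when the domain is not yet listed.
        first = self.first_seen.setdefault(key, now)
        return (now - first) < self.new_window
```

Note that while the list is still being accumulated every domain looks
new, so the result is only meaningful once at least one full window of
data has been collected, and even then it is a signal to combine with
other tests rather than a verdict.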


Re: Trying out a new concept

Posted by Ken A <ka...@pacific.net>.
Marc Perkel wrote:
> I don't know how this will work but I'm building the data now. For those 
> of you who are familiar with Day old bread lists to detect new domains, 
> as you know there's a lag time in the data and they often don't have 
> data from all the registries. So - here's a different solution.
> 
> What I'm thinking is to accumulate every domain name that interacts with 
> my system and storing it in a list. Eventually after a week or so I 
> should have a good list. Then the idea is to do a lookup to see if a new 
> domain is NOT on the list. This will catch all really new domains, but 
> will have some false positives. But - if it is mixed with other 
> conditionals it might be a good way to detect and block spam from or 
> linking to tasting domains.
> 
> Thoughts?
> 

How will you keep your list from being easily polluted?

Ken

-- 
Ken Anderson
Pacific.Net


Re: Trying out a new concept

Posted by Paweł Sasin <ha...@wp-sa.pl>.
> I don't know how this will work but I'm building the data now. For
> those of you who are familiar with Day old bread lists to detect new
> domains, as you know there's a lag time in the data and they often
> don't have data from all the registries. So - here's a different
> solution.
> 
> What I'm thinking is to accumulate every domain name that interacts
> with my system and storing it in a list. Eventually after a week or
> so I should have a good list. Then the idea is to do a lookup to see
> if a new domain is NOT on the list. This will catch all really new
> domains, but will have some false positives. But - if it is mixed
> with other conditionals it might be a good way to detect and block
> spam from or linking to tasting domains.

If you use the AWL (auto-whitelist), you already have the list. Just
scan the AWL DB for domain names.

The AWL holds even more precise data than you want to gather, and we
could use it the same way. If we assume that a new sender deserves less
trust than one we've already seen, then just add to the score (e.g.
+1.0) of any message whose sender is not in the AWL DB.
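
A rough sketch of that scoring step, assuming the sender addresses have
already been exported from the AWL DB into a plain list (how to read
the DB itself is left out here). The function names and the 1.0 default
are hypothetical, for illustration only:

```python
def domain_of(address):
    """Return the lowercased domain part of an address, or None."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        return None  # not something that looks like user@domain
    return domain.rstrip(".").lower()


def known_domains(addresses):
    """Build the set of domains from previously seen sender addresses."""
    return {d for d in map(domain_of, addresses) if d is not None}


def unseen_sender_score(sender, known, penalty=1.0):
    """Return `penalty` if the sender's domain has never been seen.

    Unparseable senders get 0.0 so that this check stays neutral
    and leaves them to other tests.
    """
    domain = domain_of(sender)
    if domain is None:
        return 0.0
    return 0.0 if domain in known else penalty
```

This compares domains to match the rest of the thread; comparing full
sender addresses instead would only need a set of addresses in place of
the set of domains.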

-- 
Paweł Sasin

"WIRTUALNA POLSKA" Spolka Akcyjna (joint-stock company), with its
registered office in Gdansk at ul. Traugutta 115 C, entered in the
National Court Register - Register of Entrepreneurs kept by the
District Court Gdansk - Polnoc in Gdansk under KRS number 0000068548,
with share capital of 67,980,024.00 zlotys paid in full and Tax
Identification Number 957-07-51-216.