Posted to sysadmins@spamassassin.apache.org by Henrik K <he...@hege.li> on 2019/06/25 12:33:52 UTC

Number of SpamAssassin installations

Has anyone calculated the weekly or monthly unique IPs seen in some mirror
logs?  I'm curious how many active SA installations there actually are.  I
realize it would only be a ballpark figure.


Re: Number of SpamAssassin installations

Posted by Jari Fredriksson <ja...@iki.fi>.

> Jari Fredriksson <ja...@iki.fi> wrote on 28.6.2019 at 10.18:
> 
> 
> 
>> Kevin A. McGrail <km...@apache.org> wrote on 28.6.2019 at 1.43:
>> 
>> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>>> 
>>> Here is mine for 6 last months. This is sa-update.bitwell.fi
>>> 
>>> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
>>> 1700178 /tmp/unique-combined.txt
>>> 
>>> Br. jarif
>> 
>> What command did you run to get that so I run the same and I'll get my
>> numbers.
> 
> I have two machines for this and the command was unique for each of them, as the token for IP varies in the logs of them. But this is more stock
> 
> # grep -w "pound:" /var/log/messages*|awk '{print $7;}'|sort|uniq >/tmp/unique-as-updaters-www.txt
> 
> br. jarif

I use the pound reverse proxy in front of my nginx, so I took the data from its log. The command will of course be different for other logs, such as an nginx or Apache access log.




Re: Number of SpamAssassin installations

Posted by Jari Fredriksson <ja...@iki.fi>.

> Kevin A. McGrail <km...@apache.org> wrote on 28.6.2019 at 1.43:
> 
> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>> 
>> Here is mine for 6 last months. This is sa-update.bitwell.fi
>> 
>> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
>> 1700178 /tmp/unique-combined.txt
>> 
>> Br. jarif
> 
> What command did you run to get that so I run the same and I'll get my
> numbers.

I have two machines for this, and the command was different on each of them because the field that holds the client IP differs between their logs. This one is the more stock of the two:

# grep -w "pound:" /var/log/messages*|awk '{print $7;}'|sort|uniq >/tmp/unique-as-updaters-www.txt

br. jarif 

Re: Number of SpamAssassin installations

Posted by Henrik K <he...@hege.li>.
On Sat, Jun 29, 2019 at 12:07:29AM +0300, Henrik K wrote:
>
> Here's sa-update.spamassassin.org, about a months duration, weight=10 should
> guarantee quite accurate number for that.
> 
> cat sa-update-access_log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww|sa-update)' | awk '{print $1}' | sort -u | wc -l
> 1138747
> (unique C-classes from those: 352857)
> 
> Here's some interesting User-Agent's from those that use LWP, unique IP count:
> 
>  128978 sa-update/svn917659/3.3.1
>   88423 sa-update/svn917659/3.3.2
>   20717 sa-update/svn1652181/3.4.1
>    7315 sa-update/3.4.2 / svn1840377/3.4.2
>    6282 sa-update/svn1475932/3.4.0
>    1553 sa-update/svnunknown/3.4.2
>....
> Amazing to see some 3.1 there too. Hopefully most are just some useless
> boxes with cron left running.

Here are the most recent stats for roughly the last month.  They might be
interesting in light of the latest SHA-1 debacle.

1127974 unique IPs
355623 unique C-classes

Below are some of the User-Agents seen.  The list is built from unique
IP/User-Agent pairs to better reflect the number of users.

It's nice to see 3.4.3 on top, yet worrying to see all those unpatched
Red Hat derivatives.

 296712 sa-update/3.4.3 / svn1869639/3.4.3
 203254 curl/7.29.0 redhat/centos7?
 117014 curl/7.19.7 redhat/centos6
 114180 sa-update/svn917659/3.3.1 redhat/centos6?
  82072 sa-update/svn917659/3.3.2 redhat/centos6 fedora/atomic?
  14047 sa-update/svn1652181/3.4.1
  13311 sa-update/3.4.2 / svn1840377/3.4.2
   7038 curl/7.15.5 redhat/centos5?
   5827 sa-update/svn1475932/3.4.0
   2241 sa-update/svnunknown/3.4.2
    594 sa-update/3.4.4 / svn1869639/3.4.4
    370 sa-update/svn507100/3.1.8
    277 sa-update/svn897929/3.3.0
    232 sa-update/svn917659/3.4.2
    211 sa-update/3.4.4-rc1 / svn1869639/3.4.4
    190 sa-update/svnunknown/3.4.3
    144 sa-update/svn607589/3.2.4
...dropped rest
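
The command behind the list above is not shown in the thread.  A minimal
sketch of one way to count unique IP/User-Agent pairs, assuming the mirror
writes the usual "combined" access log format (request, referer and
User-Agent each in double quotes).  ua_counts is a hypothetical helper name,
and unlike the egrep used elsewhere in the thread it filters only on tar.gz
requests, not on User-Agent:

```shell
#!/bin/sh
# Sketch only: assumes the "combined" access log format, where splitting a
# line on double quotes gives the request in field 2 and the User-Agent in
# field 6.  ua_counts is a hypothetical name, not from the thread.
ua_counts() {
    awk -F'"' '$2 ~ /tar\.gz/ {
        split($1, head, " ")        # head[1] is the client IP
        print head[1] "\t" $6       # one IP<TAB>User-Agent pair per request
    }' |
        sort -u |                   # keep each IP/User-Agent pair once
        cut -f2 |                   # drop the IP, keep the User-Agent
        sort | uniq -c | sort -rn   # count pairs per User-Agent, largest first
}

# Usage, with the log file names used in the thread:
#   cat sa-update-access_log* | ua_counts | head -20
```

Counting pairs rather than raw requests keeps a host that polls many times a
day from inflating its User-Agent's total.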


Re: Number of SpamAssassin installations

Posted by Henrik K <he...@hege.li>.
On Fri, Jun 28, 2019 at 10:09:56AM +0300, Henrik K wrote:
> On Thu, Jun 27, 2019 at 06:43:42PM -0400, Kevin A. McGrail wrote:
> > On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
> > >
> > > Here is mine for 6 last months. This is sa-update.bitwell.fi
> > >
> > > jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> > > 1700178 /tmp/unique-combined.txt
> > >
> > > Br. jarif
> > 
> > What command did you run to get that so I run the same and I'll get my
> > numbers.
> 
> I would check tar.gz downloads and correct user-agents to ignore bots etc.
> 
> zcat access*log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww)' | awk '{print $1}' | sort -u | wc -l

Here's sa-update.spamassassin.org, covering about a month; weight=10 should
guarantee a fairly accurate number for that.

cat sa-update-access_log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww|sa-update)' | awk '{print $1}' | sort -u | wc -l
1138747
(unique C-classes from those: 352857)
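
The C-class figure is given without a command.  A minimal sketch of one way
to derive it, assuming "C-class" means the first three octets of an IPv4
address (a /24).  count_c_classes is a hypothetical helper name, not
something from the thread:

```shell
#!/bin/sh
# Sketch only: reads one address per line on stdin and prints the number of
# distinct /24 networks.  Lines that are not dotted-quad IPv4 (for example
# IPv6 addresses) are skipped.  count_c_classes is a hypothetical name.
count_c_classes() {
    awk -F. 'NF == 4 { print $1 "." $2 "." $3 }' | sort -u | wc -l | tr -d ' '
}

# Small demonstration with documentation-range addresses: two of the three
# IPv4 addresses share 192.0.2.x, so this prints 2.
printf '192.0.2.1\n192.0.2.77\n198.51.100.5\n2001:db8::1\n' | count_c_classes
```

Fed with the unique-IP list from the pipeline above (everything up to and
including sort -u), this would give a figure of the same kind as the one
quoted.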

Here are some interesting User-Agents from the clients that use LWP, with unique IP counts:

 128978 sa-update/svn917659/3.3.1
  88423 sa-update/svn917659/3.3.2
  20717 sa-update/svn1652181/3.4.1
   7315 sa-update/3.4.2 / svn1840377/3.4.2
   6282 sa-update/svn1475932/3.4.0
   1553 sa-update/svnunknown/3.4.2
    411 sa-update/svn507100/3.1.8
    346 sa-update/svn897929/3.3.0
    206 sa-update/svn917659/3.4.2
     68 sa-update/svn540384/3.2.1
     58 sa-update/svn607589/3.2.4
     37 sa-update/svn917659/3.4.0
     32 sa-update/svn607589/3.2.5
     20 sa-update/svn910278/3.4.0
      8 sa-update/svn540384/3.2.3
      7 sa-update/svn607589/3.3.2
      5 sa-update/svn540384/3.3.2
      4 sa-update/3.4.2 / svn1854476/3.4.2
      3 sa-update/svn540384/3.4.1
      3 sa-update/svn454083/3.1.7
      2 sa-update/svn897929/3.3.2
      2 sa-update/svn882245/3.3.0
      2 sa-update/svn540384/3.3.1
      1 sa-update/svn815500/3.3.0
      1 sa-update/svn540384/3.4.0
      1 sa-update/svn540384/3.3.0
      1 sa-update/svn540384/3.2.2
      1 sa-update/svn523403/3.2.0
      1 sa-update/svn1028810/3.4.0
      1 sa-update/4.0.0-r1854477 / svn1861181/4.0.0
      1 sa-update/4.0.0-r1854477 / svn1860877/4.0.0

Amazing to see some 3.1 there too. Hopefully most are just some useless
boxes with cron left running.


Re: Number of SpamAssassin installations

Posted by Henrik K <he...@hege.li>.
On Thu, Jun 27, 2019 at 06:43:42PM -0400, Kevin A. McGrail wrote:
> On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
> >
> > Here is mine for 6 last months. This is sa-update.bitwell.fi
> >
> > jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> > 1700178 /tmp/unique-combined.txt
> >
> > Br. jarif
> 
> What command did you run to get that so I run the same and I'll get my
> numbers.

I would count only tar.gz downloads with the expected user agents, to ignore bots etc.

zcat access*log* | egrep 'tar\.gz .*"(curl|Wget|fetch|libwww)' | awk '{print $1}' | sort -u | wc -l
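
Since the original question asked about weekly or monthly figures, the same
idea can be split by month.  A rough sketch, assuming the timestamp is the
fourth whitespace-separated field in the form [25/Jun/2019:12:33:52, as in
the common and combined log formats.  unique_ips_per_month is a hypothetical
helper name, and unlike the command above it does not filter on User-Agent:

```shell
#!/bin/sh
# Sketch only: prints "Mon/YYYY count" with the number of unique client IPs
# that fetched a tar.gz in each month.  Assumes field 1 is the client IP and
# field 4 looks like [25/Jun/2019:12:33:52 with a two-digit day.
# unique_ips_per_month is a hypothetical name, not from the thread.
unique_ips_per_month() {
    awk '/tar\.gz/ { print substr($4, 5, 8), $1 }' |   # "Jun/2019 192.0.2.1"
        sort -u |                                      # one line per month/IP pair
        awk '{ n[$1]++ } END { for (m in n) print m, n[m] }' |
        LC_ALL=C sort                                  # alphabetical, not chronological
}

# Usage, with the file pattern from the command above:
#   zcat access*log* | unique_ips_per_month
```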


Re: Number of SpamAssassin installations

Posted by "Kevin A. McGrail" <km...@apache.org>.
On 6/27/2019 4:40 AM, Jari Fredriksson wrote:
>
> Here is mine for 6 last months. This is sa-update.bitwell.fi
>
> jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
> 1700178 /tmp/unique-combined.txt
>
> Br. jarif

What command did you run to get that?  I'll run the same one and get my
numbers.

-- 
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


Re: Number of SpamAssassin installations

Posted by Jari Fredriksson <ja...@iki.fi>.

> Kevin A. McGrail <km...@apache.org> wrote on 26.6.2019 at 19.09:
> 
> Agreed.  I think David just had a simple command he ran on his logs or
> the project's mirror.  I can get you access to the mirror the project
> runs if you want to look at it.
> 

Here is mine for the last 6 months. This is sa-update.bitwell.fi

jarif@gauntlet ~ $ wc -l /tmp/unique-combined.txt 
1700178 /tmp/unique-combined.txt

Br. jarif

Re: Number of SpamAssassin installations

Posted by "Kevin A. McGrail" <km...@apache.org>.
Agreed.  I think David just had a simple command he ran on his logs or
the project's mirror.  I can get you access to the mirror the project
runs if you want to look at it.

On 6/26/2019 8:51 AM, Henrik K wrote:
> It's just simple awk/grep, no need for fancy scripts.. :-)
>
> One month from single mirror should be enough to get a ballpark, weight can be
> calculated to it.
>




Re: Number of SpamAssassin installations

Posted by Henrik K <he...@hege.li>.
It's just simple awk/grep, no need for fancy scripts.. :-)

One month from a single mirror should be enough to get a ballpark; the
mirror's weight can be factored into it.

On Wed, Jun 26, 2019 at 08:41:32AM -0400, Kevin A. McGrail wrote:
> I seem to remember that yes, David Jones wrote a log parser for some
> information on this but there is no centralization of logs.  Each mirror
> would have to run and report.

Re: Number of SpamAssassin installations

Posted by "Kevin A. McGrail" <km...@apache.org>.
I seem to remember that yes, David Jones wrote a log parser for some
information on this but there is no centralization of logs.  Each mirror
would have to run and report.

On 6/25/2019 8:33 AM, Henrik K wrote:
> Has someone calculated weekly/monthly unique IPs seen in some mirror logs? 
> I'm curious how many active SA installations there actually are?  I realize
> it's just a ballpark figure..
>
