Posted to dev@spamassassin.apache.org by bu...@spamassassin.apache.org on 2022/11/25 14:30:49 UTC

[Bug 8075] New: Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

            Bug ID: 8075
           Summary: Request for .site, .online, and .fun to be dropped
                    from SpamAssassin's suspicious TLD list
           Product: Spamassassin
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Rules
          Assignee: dev@spamassassin.apache.org
          Reporter: nehav@radix.email
  Target Milestone: Undefined

Created attachment 5857
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5857&action=edit
Daily abuse feed for the concerned tlds from Spamhaus and Surbl

Issue: .online, .site, and .fun are listed in SpamAssassin's suspicious TLD
list. This leads to important business emails being marked as spam and not
being delivered to customers.

Details: I request that .online, .site, and .fun be dropped from SpamAssassin's
suspicious/untrustworthy TLD list. The abuse rates on these three TLDs have
been significantly low in the past couple of months. (Please see the attached
Spamhaus statistics. These are direct feeds we receive from Spamhaus every day.
It is evident from the data that all three TLDs have shown considerable and
continuous improvement in their rankings and abuse rate). However, for the last
couple of years, several clients using these TLDs have complained that their
emails are not getting delivered to their customers. One of the potential
reasons behind this could be that many of these emails are being marked as spam
by Apache SpamAssassin, leading to some profound business loss.

Can SpamAssassin please set up a test rule for .site, .online and .fun
individually to check their S/O (spam hits/overall hits)? Also, it would be
great if the TLDs were dropped from the list should the S/O for each TLD turn
out to be lower than the overall S/O of the aggregate rule.
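
For illustration, the S/O comparison requested above can be sketched as
follows. This is only a minimal sketch: the hit counts are invented
placeholders (not mass-check results), and the function names
spam_over_overall and should_drop are hypothetical, not part of SpamAssassin.

```python
def spam_over_overall(spam_hits: int, ham_hits: int) -> float:
    """Return S/O: spam hits divided by all hits (spam + ham) for one rule."""
    total = spam_hits + ham_hits
    if total == 0:
        raise ValueError("rule has no hits; S/O is undefined")
    return spam_hits / total


def should_drop(tld_so: float, aggregate_so: float) -> bool:
    """Proposed criterion: drop a TLD whose own S/O is below the aggregate rule's."""
    return tld_so < aggregate_so


# Placeholder counts, invented purely for illustration.
aggregate = spam_over_overall(spam_hits=900, ham_hits=100)
candidate = spam_over_overall(spam_hits=60, ham_hits=40)
print(f"aggregate S/O={aggregate:.2f}, candidate S/O={candidate:.2f}, "
      f"drop={should_drop(candidate, aggregate)}")
```

A per-TLD test rule would supply the real spam and ham hit counts; the
comparison itself is just the ratio above.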

A similar bug had been raised for .space TLD in Bug 7953, where after the test,
it was found that S/O for .space was lower, and thus, the TLD was eventually
dropped from the suspicious TLD list. Please see bug 7953, comment 8. I would
be happy to provide other supporting data/information to help with the tests.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

--- Comment #7 from nehav <ne...@radix.email> ---
Thanks, Bill. This is helpful. Would you be able to provide us with a sample
set of either .online spammy domain names or emails or even email headers,
basically anything that can help us do some RCA?


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

--- Comment #3 from Bill Cole <bi...@apache.org> ---
(In reply to nehav from comment #0)
> Created attachment 5857 [details]
> Daily abuse feed for the concerned tlds from Spamhaus and Surbl
> 
> Issue: .online, .site, and .fun are listed in SpamAssassin's suspicious TLD
> list. This leads to important business emails being marked as spam and not
> being delivered to customers.

A claim which is not supported by any evidence that you have provided. Please
note that to the best of the knowledge of the SpamAssassin Project Management
Committee, none of the very large mailbox providers (MS, Oath (Yahoo/AOL),
Google, GMX, etc.) use SpamAssassin to classify spam. If you have a problem
with any of them, there is nothing we can do to help.

It is NOT a "false positive" for a non-spam message to match rules with
derogatory (arithmetically positive) scores. It is also not a "false negative"
for spam messages to match some complimentary (negative score) rules. By
design, all messages should hit some of both. What matters is the final score.
The default and recommended threshold for SA is a score of 5.0, and our rule QA
and rescoring system is built on the premise of that threshold. 

> Details: I request that .online, .site, and .fun be dropped from
> SpamAssassin's suspicious/untrustworthy TLD list. The abuse rates on these
> three TLDs have been significantly low in the past couple of months. (Please
> see the attached Spamhaus statistics. These are direct feeds we receive from
> Spamhaus every day. It is evident from the data that all three TLDs have
> shown considerable and continuous improvement in their rankings and abuse
> rate). 

Irrelevant. TLDs are included in our list of suspicious TLDs based on how much
of the mail using them is spam, according to the mass-check logs scored by our
contributors. The three metrics from Spamhaus in that spreadsheet are not
defined in any way (and so are literally meaningless to me), but in the past I
know that they have tracked TLDs based on how much of the overall spam flow
uses them, which is not at all relevant to a filtering tool like SpamAssassin.

> However, for the last couple of years, several clients using these
> TLDs have complained that their emails are not getting delivered to their
> customers. One of the potential reasons behind this could be that many of
> these emails are being marked as spam by Apache SpamAssassin, leading to
> some profound business loss.

Hypothetical. It is literally impossible for a SpamAssassin instance with
default scores and threshold to mark a message as spam solely on the use of a
suspicious TLD. Obviously it is possible for a message to exceed the threshold
score by less than the value of the TLD rules, but there are always other
factors contributing to that score.
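
To make the scoring point concrete, here is a minimal sketch of how a final
score is compared to the threshold. The rule names and scores below are
invented for illustration and are not real SpamAssassin rules or default
scores; the sketch also assumes a message counts as spam when its total is at
or above the threshold.

```python
THRESHOLD = 5.0  # default and recommended threshold, as stated above

# Hypothetical rule names and scores, invented for illustration only.
EXAMPLE_SCORES = {
    "EXAMPLE_SUSPICIOUS_TLD": 2.0,
    "EXAMPLE_BAD_HEADER": 2.5,
    "EXAMPLE_URI_BLOCKLISTED": 1.5,
    "EXAMPLE_GOOD_AUTH": -1.0,
}


def total_score(hits):
    """Sum the scores of every rule a message matched."""
    return sum(EXAMPLE_SCORES[name] for name in hits)


def is_spam(hits):
    """Assumption: a total at or above the threshold is classified as spam."""
    return total_score(hits) >= THRESHOLD


# A TLD hit on its own stays well under the threshold.
print(is_spam(["EXAMPLE_SUSPICIOUS_TLD"]))
# Positive and negative rules together; still under the threshold.
print(is_spam(["EXAMPLE_SUSPICIOUS_TLD", "EXAMPLE_BAD_HEADER", "EXAMPLE_GOOD_AUTH"]))
# The TLD rule tips the balance only when other rules already contribute.
print(is_spam(["EXAMPLE_SUSPICIOUS_TLD", "EXAMPLE_BAD_HEADER", "EXAMPLE_URI_BLOCKLISTED"]))
print(is_spam(["EXAMPLE_BAD_HEADER", "EXAMPLE_URI_BLOCKLISTED"]))
```

The last two calls show the case described above: the message crosses the
threshold by less than the value of the TLD rule, but only because other
rules already account for most of the score.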

> Can SpamAssassin please set up a test rule for .site, .online and .fun
> individually to check their S/O (spam hits/overall hits)? Also, it would be
> great if the TLDs were dropped from the list should the S/O for each TLD
> turn out to be lower than the overall S/O of the aggregate rule.

Sidney has added the .site and .fun rules; the .online rule (T_SCC_TLD_ONLINE)
has been there for a while and consistently has S/O scores around 95%. That one
is a keeper, at least for now.

So far (2 days, insufficient for a decision) it is looking like .fun is still
all spam, while .site is mostly not. Unless the next few days show very
different results, I anticipate that we will be removing .site but leaving .fun
for now. 

> A similar bug had been raised for .space TLD in Bug 7953, where after the
> test, it was found that S/O for .space was lower, and thus, the TLD was
> eventually dropped from the suspicious TLD list. Please see bug 7953,
> comment 8. I would be happy to provide other supporting data/information to
> help with the tests.

Real "ham" messages that score over 5 on a default-score SA instance would
substantially enhance my concern with the concept of 'suspicious TLDs.' I know
that linking the URL or email domain TLD to a spam score should not work at all
in an ideal world, but in fact it is a helpful tactic according to the data we
have. 

Obviously we don't want TLDs listed which no longer correlate to mail
spamminess. It is not the purpose of our rules to incur punishment, only to
make the best guess possible about whether a message is spam. 

I will look again at the stats after a week which does not include a US
holiday.


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

Kevin A. McGrail <km...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@apache.org

--- Comment #2 from Kevin A. McGrail <km...@apache.org> ---
Also, do you have FPs this is causing?  Many of those suspicious TLDs rules are
based on real spam and very lax security handling at the registration points.


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |billcole@apache.org,
                   |                            |sidney@sidney.com

--- Comment #1 from Sidney Markowitz <si...@sidney.com> ---
I see that Bill Cole still has a test rule T_SCC_TLD_ONLINE defined so we can
look at the statistics for just .online over the recent past.

I have just added similar rules T_SCC_TLD_SITE and T_SCC_TLD_FUN for the .site
and .fun TLDs, in my sandbox.

Actually analyzing the statistics is outside my realm of expertise. Bill, could
you take a look at the stats for those three rules after enough time has passed
to gather them and make a judgment as to whether they should or should not be
dropped from the suspicious TLD list?


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

--- Comment #4 from Kevin A. McGrail <km...@apache.org> ---
Well put, Bill, especially the fact that it is never about punishment, simply
working with data to categorize correctly.


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

--- Comment #6 from Bill Cole <bi...@apache.org> ---
(In reply to nehav from comment #5)
> Hello Bill, I want to follow up on the ticket. Would you mind sharing the
> outcome of the test results for .online, .site and .fun? 

.online and .fun are still consistently running above 95% spam. 

.site is an interesting case. It is NOT consistent from day to day or across
all data sources, and the noise pattern implies that a broader sample would
make it even less useful. I have removed it from the 'suspicious' TLD lists. 

I have also removed .xyz, another one that I have been watching which seems to
have gotten much less spammy in recent months. 

The relevant change in SVN is r1907241. 

> We are worried that
> .online might still be seeing high volumes of spam. My team and I are still
> figuring out what we can do to control it. 

Perversely, the most likely way for the stats we use to improve would be if
more entirely organic, unquestionably non-spam mail used the TLD. 

> Right now, we are in the middle
> of performing a few tests ourselves. We are trying to determine the 'use &
> throw' type of domains by cross-checking MX record hits for the freshly
> registered domains. Second, we are trying to find the quantum of traffic on
> different kinds of mail servers for .online domains. For this, we would
> consider reputable servers such as google & outlook to be 'good' servers
> which can be left as it is and everything else as 'others', which would then
> be manually evaluated. We will also check hits on gibberish domains. 
> 
> I would really appreciate more suggestions/ideas on how we can better track
> and reduce spam for .online.

I think that everyone helping maintain SA understands that this is a hard
problem, and that we don't have any brilliant insights on how to solve it. 

You may find useful ideas from the broader audience of our users' mailing list
(See https://cwiki.apache.org/confluence/display/SPAMASSASSIN/mailinglists)  or
from the wider email community in a forum like the MailOp list at 
https://list.mailop.org/listinfo/mailop


[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

--- Comment #8 from Bill Cole <bi...@apache.org> ---
(In reply to nehav from comment #7)
> Thanks, Bill. This is helpful. Would you be able to provide us with a sample
> set of either .online spammy domain names or emails or even email headers,
> basically anything that can help us do some RCA?

Sadly, no. Our stats are generated by sites that use SA doing mass scans of
their mail and submitting their results. The SA project never sees the actual
mail, just the results.


Re: [Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by Michael Peddemors <mi...@linuxmagic.com>.
On 2023-02-03 04:24, bugzilla-daemon@spamassassin.apache.org wrote:
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075
> 
> nehav <ne...@radix.email> changed:
> 
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                   CC|                            |nehav@radix.email
> 
> --- Comment #5 from nehav <ne...@radix.email> ---
> Hello Bill, I want to follow up on the ticket. Would you mind sharing the
> outcome of the test results for .online, .site and .fun? We are worried that
> .online might still be seeing high volumes of spam. My team and I are still
> figuring out what we can do to control it. Right now, we are in the middle of
> performing a few tests ourselves. We are trying to determine the 'use & throw'
> type of domains by cross-checking MX record hits for the freshly registered
> domains. Second, we are trying to find the quantum of traffic on different
> kinds of mail servers for .online domains. For this, we would consider
> reputable servers such as google & outlook to be 'good' servers which can be
> left as it is and everything else as 'others', which would then be manually
> evaluated. We will also check hits on gibberish domains.
> 
> I would really appreciate more suggestions/ideas on how we can better track and
> reduce spam for .online.
> 

If they are on Digital Ocean and spamming, with a .online domain.. 
Pretty easy to catch..

134.209.25.157	x2	zero7.correiossrecompensass2023cc.online
138.197.149.86	x1	aspenpost.online
139.59.186.181	x4	zero8.correiossrecompensass2023cc.online
143.110.172.252	x3	zero10.correiossrecompensass2023cc.online
143.198.203.61	x1	vegagiveaways.online
165.22.105.49	x1	sparklesurveys.online
165.22.110.143	x1	surveycompany.online
165.22.126.196	x3	zero5.correiossrecompensass2023cc.online
167.172.53.189	x3	zero2.correiossrecompensass2023cc.online
167.172.53.21	x4	zero3.correiossrecompensass2023cc.online
178.128.171.246	x5	zero6.correiossrecompensass2023cc.online
178.62.65.212	x3	zero1.correiossrecompensass2023cc.online
178.62.7.247	x1	zero4.correiossrecompensass2023cc.online
188.166.175.52	x1	zero9.correiossrecompensass2023cc.online

Talk about zero day detection.. ;) These guys have been at it for months 
now, you think they would give up..
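
As a rough illustration of why the list above is easy to catch, the hits can
be grouped by registered domain. This is only a sketch: it copies the list
above as input, and it naively treats the last two labels of a hostname as
the registered domain (a real tool would consult the Public Suffix List). The
function names are hypothetical.

```python
from collections import Counter

# The list above, as "IP  xCOUNT  hostname" lines.
REPORT = """\
134.209.25.157 x2 zero7.correiossrecompensass2023cc.online
138.197.149.86 x1 aspenpost.online
139.59.186.181 x4 zero8.correiossrecompensass2023cc.online
143.110.172.252 x3 zero10.correiossrecompensass2023cc.online
143.198.203.61 x1 vegagiveaways.online
165.22.105.49 x1 sparklesurveys.online
165.22.110.143 x1 surveycompany.online
165.22.126.196 x3 zero5.correiossrecompensass2023cc.online
167.172.53.189 x3 zero2.correiossrecompensass2023cc.online
167.172.53.21 x4 zero3.correiossrecompensass2023cc.online
178.128.171.246 x5 zero6.correiossrecompensass2023cc.online
178.62.65.212 x3 zero1.correiossrecompensass2023cc.online
178.62.7.247 x1 zero4.correiossrecompensass2023cc.online
188.166.175.52 x1 zero9.correiossrecompensass2023cc.online
"""


def registered_domain(hostname):
    """Naive assumption: the registered domain is the last two labels."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def hits_by_domain(report):
    """Sum the xN hit counts per registered domain."""
    totals = Counter()
    for line in report.splitlines():
        if not line.strip():
            continue
        _ip, count, hostname = line.split()
        totals[registered_domain(hostname)] += int(count.lstrip("x"))
    return totals


totals = hits_by_domain(REPORT)
for domain, hits in totals.most_common():
    print(f"{hits:3d}  {domain}")
```

One registered domain accounts for nearly all of the hits, spread across
numbered subdomains on different IPs.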

Just saying.. Here's an idea.. contribute a patch to SA, where if the 
domain is .online and is spamming, it can auto submit a report to you.


-- 
"Catch the Magic of Linux..."
------------------------------------------------------------------------
Michael Peddemors, President/CEO LinuxMagic Inc.
Visit us at http://www.linuxmagic.com @linuxmagic
A Wizard IT Company - For More Info http://www.wizard.ca
"LinuxMagic" a Registered TradeMark of Wizard Tower TechnoServices Ltd.
------------------------------------------------------------------------
604-682-0300 Beautiful British Columbia, Canada



[Bug 8075] Request for .site, .online, and .fun to be dropped from SpamAssassin's suspicious TLD list

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8075

nehav <ne...@radix.email> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nehav@radix.email

--- Comment #5 from nehav <ne...@radix.email> ---
Hello Bill, I want to follow up on the ticket. Would you mind sharing the
outcome of the test results for .online, .site and .fun? We are worried that
.online might still be seeing high volumes of spam. My team and I are still
figuring out what we can do to control it. Right now, we are in the middle of
performing a few tests ourselves. We are trying to determine the 'use & throw'
type of domains by cross-checking MX record hits for the freshly registered
domains. Second, we are trying to find the quantum of traffic on different
kinds of mail servers for .online domains. For this, we would consider
reputable servers such as google & outlook to be 'good' servers which can be
left as it is and everything else as 'others', which would then be manually
evaluated. We will also check hits on gibberish domains. 

I would really appreciate more suggestions/ideas on how we can better track and
reduce spam for .online.
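
The 'good' versus 'others' split described above might be sketched roughly
like this. It is only an illustration: the MX hostnames are assumed to have
been resolved already (no DNS lookups happen here), the suffix list is an
assumed starting point, the sample domains are made up, and the separate
"no_mx" bucket is an addition for domains with no MX record at all.

```python
# Assumed suffixes for 'good' providers; extend as needed.
GOOD_MX_SUFFIXES = (".google.com", ".outlook.com")


def classify_mx(mx_host: str) -> str:
    """Label one MX hostname as 'good' or 'others' by its suffix."""
    host = mx_host.lower().rstrip(".")
    return "good" if host.endswith(GOOD_MX_SUFFIXES) else "others"


def triage(domains_to_mx: dict[str, list[str]]) -> dict[str, list[str]]:
    """Bucket domains: all-good MX, anything else (manual review), or no MX."""
    buckets = {"good": [], "others": [], "no_mx": []}
    for domain, mx_hosts in domains_to_mx.items():
        if not mx_hosts:
            buckets["no_mx"].append(domain)
        elif all(classify_mx(h) == "good" for h in mx_hosts):
            buckets["good"].append(domain)
        else:
            buckets["others"].append(domain)
    return buckets


# Made-up sample input: domain -> already-resolved MX hostnames.
sample = {
    "alpha-example.online": ["aspmx.l.google.com."],
    "beta-example.online": ["mail.beta-example.online."],
    "gamma-example.online": [],
}
result = triage(sample)
print(result)
```

Only the 'others' bucket would then need the manual evaluation mentioned
above.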

