Posted to users@spamassassin.apache.org by "Eric A. Hall" <eh...@ehsco.com> on 2008/02/17 18:50:51 UTC

SVN notifications killing spamassassin

I sometimes get SVN notifications that contain lists of files and their
status. The filenames will often get picked up by the URI matching
algorithm, and each one ends up being processed through numerous lookups
(URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
with hundreds of file lists, which in turn causes spamassassin to go into
never-never land while it thinks about the hundreds of "URI" matches.

For example,

  A    fpo/reports/perl/nagios_notifications1.pl.bak
  A    foo/reports/perl/nagios_outages1.pl
  A    foo/reports/perl/GWIR.pm

nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
will be determined as a URI for .pm domain, and so forth. The only way to
get these messages through is to disable spamassassin...
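
To illustrate the failure mode, here is a minimal Python sketch of a
deliberately naive schemeless matcher. It is not SpamAssassin's actual
URI code; the pattern, the SAMPLE_CCTLDS set, and the function name are
all assumptions made up for the illustration. It only shows how a
filename ending in a country-code TLD can look like a hostname.

```python
import re

# Hypothetical sample of country-code TLDs (assumption for this sketch):
# .pl is Poland, .pm is Saint Pierre and Miquelon.
SAMPLE_CCTLDS = {"pl", "pm"}

# Hypothetical, deliberately loose pattern: any run of dotted labels.
TOKEN = re.compile(r"[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)+")


def naive_uri_candidates(text):
    """Return dotted tokens whose final label is in SAMPLE_CCTLDS."""
    hits = []
    for match in TOKEN.finditer(text):
        token = match.group(0)
        last_label = token.rsplit(".", 1)[1].lower()
        if last_label in SAMPLE_CCTLDS:
            hits.append(token)
    return hits


svn_listing = """\
  A    fpo/reports/perl/nagios_notifications1.pl.bak
  A    foo/reports/perl/nagios_outages1.pl
  A    foo/reports/perl/GWIR.pm
"""

# Each hit would then be handed to every per-URI lookup.
print(naive_uri_candidates(svn_listing))
```

With hundreds of such lines in one message, every hit costs one round
of lookups, which is where the time goes.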

I've updated to 3.2.4 just now and it still has the same problem.

I'm guessing the URI analyzer needs to be smarter.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

Re: SVN notifications killing spamassassin

Posted by Matt Kettler <mk...@verizon.net>.
Philip Prindeville wrote:
> Eric A. Hall wrote:
>>
>> I'm guessing the URI analyzer needs to be smarter.
>>   
>
> That's strangely appropriate to the issue I had with "calthurs.com".
>
> It would be nice if this checker had an option to enforce checking 
> only of well-formed URLs (i.e. not "anything that might conceivably 
> be munged into a URL by the most ignorant of UAs")... something 
> requiring a protocol name (ftp:, http:, tftp:, etc.), a domain name, 
> and a path name (even if it's just slash). 

See the discussion at:

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5780


Re: SVN notifications killing spamassassin

Posted by Philip Prindeville <ph...@redfish-solutions.com>.
Eric A. Hall wrote:
> I sometimes get SVN notifications that contain lists of files and their
> status. The filenames will often get picked up by the URI matching
> algorithm, and each one ends up being processed through numerous lookups
> (URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
> with hundreds of file lists, which in turn causes spamassassin to go into
> never-never land while it thinks about the hundreds of "URI" matches.
>
> For example,
>
>   A    fpo/reports/perl/nagios_notifications1.pl.bak
>   A    foo/reports/perl/nagios_outages1.pl
>   A    foo/reports/perl/GWIR.pm
>
> nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
> will be determined as a URI for .pm domain, and so forth. The only way to
> get these messages through is to disable spamassassin...
>
> I've updated to 3.2.4 just now and it still has the same problem.
>
> I'm guessing the URI analyzer needs to be smarter.
>   

That's strangely appropriate to the issue I had with "calthurs.com".

It would be nice if this checker had an option to enforce checking only 
of well-formed URLs (i.e. not "anything that might conceivably be 
munged into a URL by the most ignorant of UAs")... something requiring 
a protocol name (ftp:, http:, tftp:, etc.), a domain name, and a path 
name (even if it's just slash).

Or at the very least, to score complete URLs higher than just domain 
names alone.
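
As a rough sketch of what such an option might look like (in Python,
not SpamAssassin's Perl, and not an existing SpamAssassin setting), the
check below accepts a candidate only if it has a known scheme, a host,
and a path of at least "/". The scheme list, the two weights, and the
"strict" switch are assumptions invented for the illustration.

```python
from urllib.parse import urlsplit

# Assumed values for this sketch only.
ALLOWED_SCHEMES = {"http", "https", "ftp", "tftp"}
FULL_URL_WEIGHT = 1.0    # complete URL: scheme + host + path
BARE_NAME_WEIGHT = 0.25  # bare domain-like token, non-strict mode only


def is_well_formed(candidate):
    """True if candidate has an allowed scheme, a host, and a path."""
    try:
        parts = urlsplit(candidate)
        host = parts.hostname
    except ValueError:
        return False
    return (
        parts.scheme in ALLOWED_SCHEMES
        and bool(host)
        and parts.path.startswith("/")
    )


def weight(candidate, strict=True):
    """Weight a candidate: full URLs count most, bare names less or not at all."""
    if is_well_formed(candidate):
        return FULL_URL_WEIGHT
    return 0.0 if strict else BARE_NAME_WEIGHT


for candidate in ("http://www.ehsco.com/", "GWIR.pm", "calthurs.com"):
    print(candidate, weight(candidate), weight(candidate, strict=False))
```

In strict mode the SVN filenames never reach the per-URI lookups at all;
in non-strict mode they are still looked up but count for less than a
complete URL.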

-Philip