Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/06/20 21:43:28 UTC

[Bug 4411] New: $permsgstatus->get_uri_list too aggressive?

http://bugzilla.spamassassin.org/show_bug.cgi?id=4411

           Summary: $permsgstatus->get_uri_list too aggressive?
           Product: Spamassassin
           Version: unspecified
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P4
         Component: Plugins
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: ehall@ehsco.com


I'm adding support for URI processing to my LDAPfilter plugin, and have noticed
that a URI of http://username:password@www.example.com:nn/ will get parsed and
returned as all of the following variations:

  mailto:username:password@www.example.com

  http://username:password@www.example.com:nn/

  http://www.example.com:nn/

The mailto and the last http URIs should not be returned since they are not
accurate representations of the input URI.

I can kind of see why you'd want to return something like the last URI for your
reduced view mechanism, but such a view would theoretically be provided without
the port number, so I'm not sure if that's the intention or not.

If you need/want to keep all of these in the outputs, can we get a separate
array or parameter that just returns the normalized URIs?

Thanks



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


------- Additional Comments From felicity@apache.org  2005-06-20 12:51 -------
Subject: Re:   New: $permsgstatus->get_uri_list too aggressive?

On Mon, Jun 20, 2005 at 12:43:28PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> that a URI of http://username:password@www.example.com:nn/ will get parsed and
> 
>   mailto:username:password@www.example.com
>   http://username:password@www.example.com:nn/
>   http://www.example.com:nn/
> 
> The mailto and the last http URIs should not be returned since they are not
> accurate representations of the input URI.

The last one is.  It's the actual URI without the login information, which is
exactly what we want.  The mailto is incorrect.

> I can kind of see why you'd want to return something like the last URI for your
> reduced view mechanism, but such a view would theoretically be provided without
> the port number, so I'm not sure if that's the intention or not.

The idea is to remove the "obfuscation" of URIs so the rules don't have to
account for everything like username and passwords.  We do want to catch port
numbers.
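
To illustrate the kind of cleanup described above (drop the login information,
keep the port), here is a minimal sketch in Python. It is not SpamAssassin's
actual code; the function name strip_userinfo and the port 8080 (standing in
for the "nn" placeholder) are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(uri: str) -> str:
    """Return the URI with any user:password@ prefix removed from the authority.

    Hypothetical helper for illustration only; the port is kept as-is.
    """
    parts = urlsplit(uri)
    # Everything after the last '@' in the authority is host[:port].
    # If there is no '@', rpartition leaves the whole netloc in slot [2].
    hostport = parts.netloc.rpartition("@")[2]
    return urlunsplit(
        (parts.scheme, hostport, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(strip_userinfo("http://username:password@www.example.com:8080/"))
    # prints: http://www.example.com:8080/
```

A rule written against the stripped form then only has to match the host and
port, not every possible username/password prefix.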

> If you need/want to keep all of these in the outputs, can we get a separate
> array or parameter that just returns the normalized URIs?

This is possible in 3.1: don't call get_uri_list, call get_uri_detail_list,
which lets you see the raw URIs as well as the "canonified" versions
separately.






ehall@ehsco.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From ehall@ehsco.com  2005-06-22 09:42 -------
I thought I'd changed this yesterday, but Bugzilla doesn't show the mods. I'm
changing the status to 'fixed' since it's been incorporated into 3.1. However,
this doesn't really help me, since I'm probably not going to add any
dependencies on 3.1 until 3.1.1 comes out (to give people time to move).

On a related issue, however, I'm wondering what kind of forward strategy there
is for unencoded i18n domain names. Right now, a raw IDN like
http://www.brav�.nu causes some breakage. URIDNSBL ignores these entirely, and I
also need to add similar code that ignores URIs with 8-bit characters. But at
some point, it would probably be a good idea for get_uri_list to provide these
in two forms--once in the original unencoded form, and also in the normalized
(xn--) format. Is this work in 3.1 also, or is this something that hasn't been
dealt with yet?
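
As a rough illustration of the "two forms" idea, the sketch below returns a
host name both as written and in its ASCII (xn--) form, using Python's built-in
"idna" codec (which implements the older IDNA 2003 rules). The function names
and the example domain are assumptions for the example, not anything in
SpamAssassin.

```python
from urllib.parse import urlsplit


def has_8bit(text: str) -> bool:
    """True if the text contains any character outside 7-bit ASCII."""
    return any(ord(ch) > 127 for ch in text)


def host_forms(uri: str) -> tuple[str, str]:
    """Return (original host, ASCII-compatible xn-- host) for a URI.

    Hypothetical helper for illustration; relies on the stdlib "idna" codec.
    """
    host = urlsplit(uri).hostname or ""
    # Each non-ASCII label is converted to its punycode "xn--" form;
    # labels that are already ASCII pass through unchanged.
    ascii_host = host.encode("idna").decode("ascii")
    return host, ascii_host


if __name__ == "__main__":
    uri = "http://www.bücher.example/"
    print(has_8bit(uri))    # prints: True
    print(host_forms(uri))  # prints: ('www.bücher.example', 'www.xn--bcher-kva.example')
```

A plugin could match on the ASCII form for DNS lookups while still keeping the
original form around for reporting.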


