You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by Theo Van Dinter <fe...@kluge.net> on 2004/04/19 18:00:54 UTC

Re: svn commit: rev 10103 - incubator/spamassassin/trunk/lib/Mail/SpamAssassin

On Mon, Apr 19, 2004 at 03:37:13PM -0000, quinlan@apache.org wrote:
> I think this method is still going to miss encoded redirects, though.
> +    if (m{^https?://.+?(https?://.+)$}) {
> +      push(@uris, $1);
> +    }

It shouldn't.  The code above it deals with encoding, then that if
statement strips out the top level redirect and puts the remaining URL
back in the queue to be passed through the loop again.

However, your modification to the code stripped out the "$nuri =~"
which kinda breaks things. :(  (r10104 committed to fix it)

-- 
Randomly Generated Tagline:
EACH year, as they pay their taxes, many Americans conduct a tiny
 mental debate. "Why should I have to turn over such a huge fraction of my
 hard-earned money to the government?" And then, a moment later: "Oh, yeah:
 schools, roads, national security - blah, blah, blah. Sign the check."
 - http://www.nytimes.com/2003/07/31/technology/circuits/31stat.html

Re: svn commit: rev 10103 - incubator/spamassassin/trunk/lib/Mail/SpamAssassin

Posted by Theo Van Dinter <fe...@kluge.net>.
On Mon, Apr 19, 2004 at 10:48:41AM -0700, Justin Mason wrote:
> I just checked in a fix to another regression --
> http://redir1.com/x?http://redir2.com/y?http://payload.com/
> would not be caught with the latest code.
> 
> Any chance of a t script to test it?

Ummm.  The code did that.

> -    if ($nuri =~ m{^https?://.+?(https?://.+)$}) {
> +    while ($nuri =~ s{^https?://.+?(https?://.+)$}{$1}s) {
>        push(@uris, $1);
>      }
>    }

NOOOO!  That code is very specifically meant to *NOT* loop like that.
It's supposed to work like this:

foreach(@uris) { 
  if ( redirect ) {
    # note, full URI is already in the uris array
    take off the redirect (ie: http://foo.com?http://bar.com ->
      http://bar.com), add to uris array -- the foreach will hit this
      at the end
  }
}

It takes advantage of the fact that:

@t=(1..3); foreach(@t){ print "$_\n"; push(@t,4..5) if ($_ == 3); }

results in:

1
2
3
4
5

this way each "unredirected" URI gets processed the same as the others.

>      while ($nuri =~ s{^https?://.+?(https?://.+)$}{$1}s) {
> -      push(@uris, $1);
> +      push(@uris, $_);

OMG NOOOO!   $_ isn't valid in that loop, so now you're putting some
random string into the array.


<takes_code_personally>
I appreciate people trying to make this part of the code better, but I
put a good amount of thought and testing into what that little section
is supposed to do.  So far it's been disabled, reenabled and rewritten,
modified to do stuff it shouldn't, then all-together broken...  in less
than a day.
</takes_code_personally>


Anyway, I think I've fixed the issue I've been having with bad URI
results, so I'll check back in a working version of this code when
I've finished.

-- 
Randomly Generated Tagline:
on't want it now, I bloody well want it RIGHT now!