Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2004/06/10 22:46:28 UTC

Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

On Thu, 2004-06-10 at 18:45, Pierre THIERRY wrote:
> The easiest way to deal with URL encoding seems to me to add an option
> for it:

-1.  We don't need a knob for this.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

Posted by Pierre THIERRY <pi...@moine-fou.org>.
> Maybe this clears things out

Sure. I thought the IRI thing was a separate problem, so I didn't read
the thread on it carefully...

Lightly,
le Moine Fou
-- 
pierre.thierry@moine-fou.org
OpenPGP 0xD9D50D8A

Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Fri, 11 Jun 2004, Pierre THIERRY wrote:

> No need of my 'knob' if we have a magical way of knowing when and what
> to encode without any hint from the user. I'm pretty impatient of
> discovering the guessing algorithm...
>
I don't know what guessing algorithm you are referring to. I'm not talking
about guessing anything. You should read up on IRIs. There is a draft for
an RFC by Martin Dürst. I don't have the URL handy, but you should be able
to find it via W3C or IETF.

As I explained earlier, to support IRIs, we just need to %-escape bytes
between 0x80 and 0xff. We already have the input in UTF-8. One concern is
that the IRI draft specifies that, when converting from other encodings to
UTF-8, Unicode Normalization Form C is to be used. I don't know if
apr-iconv can guarantee this. In any case, we can't do this normalization
in the IRI -> URI function, since the spec also says that if the input is
already in a Unicode encoding (which it is if the user uses a UTF-8
locale, for example), we shall not do any normalization. But this isn't
about guessing, so let's leave it for now. So, IRI support is
straightforward.
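
To make the %-escaping step concrete, here is a minimal sketch in Python
(not Subversion's C code, and not taken from any patch). The function name
iri_to_uri is hypothetical. It assumes the input is already a decoded
string that can be encoded as UTF-8, and it performs no Unicode
normalization, in line with the point above.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-escape every UTF-8 byte in the range 0x80-0xff.

    Hypothetical helper for illustration only. ASCII bytes, including
    any existing %XX escapes, pass through untouched.
    """
    out = []
    for byte in iri.encode("utf-8"):
        if byte >= 0x80:
            # Non-ASCII byte: emit it as %XX with uppercase hex digits.
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)


if __name__ == "__main__":
    print(iri_to_uri("http://example.com/caf\u00e9"))
```

Because plain ASCII input comes back unchanged, an existing URI is not
altered by this step.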

Then, I also proposed to automatically escape some characters that are
always illegal in a URI. For example, space is illegal, so if we
encounter a space, we can just encode it as %20. Then, I suggested that we
can be smarter than that. For example, when we are in the path
component, [] are *not* reserved (they are in the earlier parts, for IPv6
addresses). This is not guessing, since we can always parse the URI. But it
is more complex, and may seem less predictable to the user, so whether
this is desirable can be debated further.
Maybe this clears things out, or I (and Greg) have missed something.
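
A rough sketch of the component-aware escaping described above, again in
Python rather than Subversion's C. The names escape_url, ALWAYS_ILLEGAL
and PATH_ONLY are hypothetical, and the tables are deliberately tiny:
space is escaped everywhere after the authority, while [ and ] are escaped
only in the path, so that an IPv6 literal in the host part is left alone.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: a minimal table; a real one would list more characters.
ALWAYS_ILLEGAL = {" ": "%20"}
# Brackets carry meaning only in the authority (IPv6 literals).
PATH_ONLY = {"[": "%5B", "]": "%5D"}


def _escape(text: str, table: dict) -> str:
    """Replace each character found in the table with its %XX escape."""
    return "".join(table.get(ch, ch) for ch in text)


def escape_url(url: str) -> str:
    """Escape illegal characters per component, without any guessing.

    '%' itself is never touched, so existing escapes are preserved.
    """
    parts = urlsplit(url)
    path = _escape(parts.path, {**ALWAYS_ILLEGAL, **PATH_ONLY})
    query = _escape(parts.query, ALWAYS_ILLEGAL)
    fragment = _escape(parts.fragment, ALWAYS_ILLEGAL)
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    print(escape_url("http://[::1]/a[1] b"))
```

The decision is made from the parsed structure of the URL alone, which is
why no hint from the user is needed.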

Regards,
//Peter



Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2004-06-10 at 19:19, Pierre THIERRY wrote:
> No need of my 'knob' if we have a magical way of knowing when and what
> to encode without any hint from the user. I'm pretty impatient of
> discovering the guessing algorithm... 

Web browsers do this without a knob; we can do so as well.  Impatience
is not a reason to complicate the UI.



Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

Posted by Pierre THIERRY <pi...@moine-fou.org>.
> -1.  We don't need a knob for this.

No need of my 'knob' if we have a magical way of knowing when and what
to encode without any hint from the user. I'm pretty impatient of
discovering the guessing algorithm... 

Curiously,
le Moine Fou
-- 
pierre.thierry@moine-fou.org
OpenPGP 0xD9D50D8A

Re: RFC: URL encoding switch and option (was Re: URI encoding URLs on the cmdline?)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Thu, 10 Jun 2004, Greg Hudson wrote:

> On Thu, 2004-06-10 at 18:45, Pierre THIERRY wrote:
> > The easiest way to deal with URL encoding seems to me to add an option
> > for it:
>
> -1.  We don't need a knob for this.
>
>
Exactly. A URI is an IRI, so there is no backwards compatibility problem.
I hadn't read up on IRIs when I wrote the thing about compatibility.

//Peter
