Posted to dev@httpd.apache.org by "Roy T. Fielding" <fi...@kiwi.ics.uci.edu> on 1998/03/03 02:21:37 UTC

Re: URIs again

I moved into an on-campus apartment on Sunday. The good news is that it
has a direct 10Mb/s connection to the Internet.  The bad news is that my
new bed won't arrive until Thursday (ouch).

>Roy, RFC2068 3.2.1 defines absoluteURI like so: 
>
>    absoluteURI    = scheme ":" *( uchar | reserved )
>
>That is, it's opaque.  Nowhere else in the document does it actually
>define what comes after the scheme.  It's obvious the intention is
>this:
>
>    absoluteURI    = scheme ":" relativeURI
>
>at least as far as parsing proxy requests go.

Yep.  The two are syntactically equivalent.

>Also interesting is the relativeURI definition:
>
>          relativeURI    = net_path | abs_path | rel_path
>
>          net_path       = "//" net_loc [ abs_path ]
>          abs_path       = "/" rel_path
>          rel_path       = [ path ] [ ";" params ] [ "?" query ]
>
>But no meaning is given to net_loc, other than in 5.1.2 where
>it defines Host: as the contents of net_loc.  Consider:
>
>    GET //hostname/path HTTP/1.1
>    Host: hostname
>
>It's a relativeURI, and so Apache doesn't proxy it or do vhosting
>on it.  Apache won't serve /path, it will essentially fail to serve
>//hostname/path.  Apache will also improperly handle URIs such as
>"//abc/../def".  In general right now I'm confused about URLs without
>schemes.

Ummm, those are not allowed in HTTP.  See the definition of Request-URI.
It should be treated as an abs_path.
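
To make that rule concrete, here is a minimal sketch of classifying a request target under the RFC 2068 section 5.1.2 production (Request-URI = "*" | absoluteURI | abs_path). The function name and the returned labels are hypothetical, chosen only for illustration, and the scheme pattern assumes the usual "letter first" convention. It is not Apache's parser.

```python
import re

# Assumed scheme syntax: a letter followed by letters, digits, "+", "-", ".".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_request_uri(target: str) -> str:
    """Return 'asterisk', 'abs_path', 'absoluteURI', or 'invalid'."""
    if target == "*":
        return "asterisk"
    if target.startswith("/"):
        # This also covers "//hostname/path": net_path is not one of the
        # Request-URI forms, so the leading "//" is simply part of the path
        # and no hostname is extracted from it.
        return "abs_path"
    if _SCHEME.match(target):
        return "absoluteURI"
    return "invalid"
```

Under this reading, "GET //hostname/path HTTP/1.1" is handled as an abs_path request, and the host comes from the Host: header rather than from the request line.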

>Another tidbit... are these valid Location response headers?
>
>    Location: http:foobar.gif
>    Location: http:/foobar.gif

Yes, but they are invalid http URLs.  The http URL does require net_path.
I know that libwww-based browsers will treat those as relative (due to
a stupid loophole in the original specs/code that assumed that all URLs
would have the net_loc (server) component).

>They seem to be given what I read in draft-fielding-url-syntax-09, but
>I know that some clients choke on them (lynx in particular).  RFC2068
>requires an absoluteURI response, and those fit that description since
>absoluteURI is opaque.

draft-fielding-uri-syntax-01 is the current spec (02 tomorrow), but both
disallow the "http:foobar.gif is relative" interpretation.

>And in a related manner, if a document's base url is
>"http://abc/def/foo.html", can it refer to "https:blah.gif"?  (and expect
>to get "https://abc/def/blah.gif")

No, that will fail (except maybe in Navigator, which has a random parser).
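
The strict interpretation in the last two answers can be sketched as follows: any reference that begins with a scheme is absolute and is never merged with the base. The helper name resolve_strict is hypothetical, the scheme pattern is an assumption, and scheme-less references are simply handed to Python's urllib.parse.urljoin. This is an illustration, not the algorithm text from either draft.

```python
import re
from urllib.parse import urljoin

# Assumed scheme syntax: a letter followed by letters, digits, "+", "-", ".".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def resolve_strict(base: str, ref: str) -> str:
    """Resolve ref against base, treating any ref with a scheme as absolute."""
    if _SCHEME.match(ref):
        # "http:foobar.gif" and "https:blah.gif" carry a scheme, so they are
        # returned unchanged instead of inheriting the base's host and path.
        return ref
    # Only scheme-less references are resolved relative to the base.
    return urljoin(base, ref)
```

With the base "http://abc/def/foo.html", the reference "https:blah.gif" comes back unchanged rather than becoming "https://abc/def/blah.gif", while a plain "blah.gif" resolves to "http://abc/def/blah.gif".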

....Roy

Re: URIs again

Posted by Dean Gaudet <dg...@arctic.org>.
On Mon, 2 Mar 1998, Roy T. Fielding wrote:

> >    GET //hostname/path HTTP/1.1
> >    Host: hostname
>
> Ummm, those are not allowed in HTTP.  See the definition of Request-URI.
> It should be treated as an abs_path.

Ah, OK, cool.  Our new parser accepts these and breaks them into hostname and
path, because it's based on your url-syntax-09 regexes... that will need
to be fixed.  I'll probably look at uri-syntax-02 on the weekend (unless
someone beats me to it).
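
For illustration, a generic URI-splitting regex shows why a regex-based parser pulls a hostname out of "//hostname/path". The pattern below is the one published later in RFC 3986 Appendix B; that the url-syntax-09 regexes behaved the same way on this input is an assumption, and split_generic is a hypothetical helper name.

```python
import re

# Generic URI-splitting regex as given in RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_generic(uri: str) -> dict:
    """Split a URI reference into its five generic components."""
    # Every group in the pattern is optional, so the match never fails.
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }
```

Applied to "//hostname/path", this yields no scheme, the authority "hostname", and the path "/path". A server following the Request-URI definition would need to check for a leading "/" and treat the whole string as an abs_path before applying any such generic split.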

Dean