Posted to user@couchdb.apache.org by Sam Stainsby <sa...@sustainablesoftware.com.au> on 2012/10/29 08:05:59 UTC
literal '+' in URL when creating a database
Hi all,
Shouldn't this succeed (assuming appropriate permissions):
curl -X PUT 'http://localhost:5984/aaa+bbb'
Instead, I get the "Only lowercase characters (a-z), digits (0-9), and
any of the characters _, $, (, ), +, -, and / are allowed ..." error.
I understand that '+' has special significance in the query part of a
URL, but not the path part, so I think the above should work. I've found
with the latest Dispatch library (0.9.3) that dispatch doesn't encode the
'+', which from what I've read since seems to still be a legal URL. On
the other hand, couch seems to require it to be encoded, so the
following *does* succeed:
curl -X PUT 'http://localhost:5984/aaa%2bbbb'
resulting in a database named 'aaa+bbb'.
I've checked (with wireshark) that the first query does indeed send the
literal '+' character: PUT /aaa+bbb ...
Cheers,
Sam Stainsby.
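Until the server accepts a literal '+' in the path, a client can percent-encode the database name itself, as the second curl command above does by hand. A minimal sketch in Python using only the standard library; the helper name db_url and the base URL are assumptions for illustration and are not part of CouchDB or Dispatch:

```python
from urllib.parse import quote


def db_url(base, db_name):
    """Build a database URL, percent-encoding the name as one path segment.

    safe="" makes quote() escape every reserved character, including
    "+" and "/", so the server cannot misread the name.
    """
    return base.rstrip("/") + "/" + quote(db_name, safe="")


print(db_url("http://localhost:5984", "aaa+bbb"))
# prints http://localhost:5984/aaa%2Bbbb
```

Python emits %2B in upper case; hex digits in a percent-escape are case-insensitive, so this is equivalent to the %2b used in the curl example.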
Re: literal '+' in URL when creating a database
Posted by Sam Stainsby <sa...@sustainablesoftware.com.au>.
On Mon, 29 Oct 2012 15:29:31 -0700, Mark Hahn wrote:
> I always encode the entire url. That can't cause a problem, can it?
Best you have a look at this I think:
http://www.lunatech-research.com/archives/2009/02/03/what-every-web-developer-must-know-about-url-encoding
-- Sam.
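To make the answer to Mark's question concrete: escaping the entire URL as one string also escapes the delimiters that give it structure. The sketch below, in Python with only the standard library, contrasts that with escaping each path segment separately. The function name encode_path_segments is hypothetical, and the sketch assumes the input path has not already been percent-encoded (otherwise it would be encoded twice).

```python
from urllib.parse import quote, urlsplit, urlunsplit

url = "http://localhost:5984/aaa+bbb"

# Escaping the whole string destroys "://", ":" and "/" as delimiters.
whole = quote(url, safe="")
print(whole)
# prints http%3A%2F%2Flocalhost%3A5984%2Faaa%2Bbbb


def encode_path_segments(raw_url):
    """Percent-encode each path segment, leaving the other components alone."""
    parts = urlsplit(raw_url)
    segments = [quote(seg, safe="") for seg in parts.path.split("/")]
    return urlunsplit(
        (parts.scheme, parts.netloc, "/".join(segments), parts.query, parts.fragment)
    )


print(encode_path_segments(url))
# prints http://localhost:5984/aaa%2Bbbb
```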
Re: literal '+' in URL when creating a database
Posted by Mark Hahn <ma...@hahnca.com>.
I always encode the entire url. That can't cause a problem, can it?
On Mon, Oct 29, 2012 at 3:27 PM, Sam Stainsby <
sam@sustainablesoftware.com.au> wrote:
> On Mon, 29 Oct 2012 15:18:16 -0700, Mark Hahn wrote:
>
> >> "URL encoding" is applied homogeneously over all parts of the URL.
> >
> > What if there was a slash or hash character? I don't see how you can
> > avoid escaping the whole url.
>
> Sorry, what I mean is that different parts of the URL have subtly
> different encoding rules.
>
> -- Sam.
>
>
Re: literal '+' in URL when creating a database
Posted by Sam Stainsby <sa...@sustainablesoftware.com.au>.
On Mon, 29 Oct 2012 15:18:16 -0700, Mark Hahn wrote:
>> "URL encoding" is applied homogeneously over all parts of the URL.
>
> What if there was a slash or hash character? I don't see how you can
> avoid escaping the whole url.
Sorry, what I mean is that different parts of the URL have subtly
different encoding rules.
-- Sam.
Re: literal '+' in URL when creating a database
Posted by Mark Hahn <ma...@hahnca.com>.
> "URL encoding" is applied homogeneously over all parts of the URL.
What if there was a slash or hash character? I don't see how you can avoid
escaping the whole url.
On Mon, Oct 29, 2012 at 3:13 PM, Sam Stainsby <
sam@sustainablesoftware.com.au> wrote:
> On Mon, 29 Oct 2012 17:07:37 +0000, Robert Newson wrote:
>
> > It's because we call mochiweb_util:unquote(Path), which replaces the +
> > with a space.
>
> What I've read is that there seems to be a widespread misconception that
> "URL encoding" is applied homogeneously over all parts of the URL. Even
> some major libraries get it wrong --- or have misleading names at least.
>
> I've reported this now:
> https://issues.apache.org/jira/browse/COUCHDB-1580
>
> -- Sam.
>
>
Re: literal '+' in URL when creating a database
Posted by Sam Stainsby <sa...@sustainablesoftware.com.au>.
On Mon, 29 Oct 2012 17:07:37 +0000, Robert Newson wrote:
> It's because we call mochiweb_util:unquote(Path), which replaces the +
> with a space.
What I've read is that there seems to be a widespread misconception that
"URL encoding" is applied homogeneously over all parts of the URL. Even
some major libraries get it wrong --- or have misleading names at least.
I've reported this now:
https://issues.apache.org/jira/browse/COUCHDB-1580
-- Sam.
Re: literal '+' in URL when creating a database
Posted by Robert Newson <rn...@apache.org>.
It's because we call mochiweb_util:unquote(Path), which replaces the +
with a space.
B.
On 29 October 2012 16:48, Jens Alfke <je...@couchbase.com> wrote:
>
> On Oct 29, 2012, at 1:26 AM, Sam Stainsby <sa...@sustainablesoftware.com.au> wrote:
>
> How couch encodes that as a file name in an OS would be internal to
> couch, so if couch is using query string encoding for the file name, that
> may be a good choice for OS portability. However, my understanding is
> that '+' representing a space in a URL is only valid for the *query* part
> of a URL.
>
> Agreed — it should not be necessary to URL-encode “+” signs in the path portion of a URL. Your URL refers to the database named “aaa+bbb”, not “aaa bbb”, so the request should have succeeded. This sounds like a bug in CouchDB.
>
> —Jens
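The effect Robert describes can be reproduced with Python's standard library, used here only as an analogy: urllib.parse.unquote_plus applies query-style decoding (turning "+" into a space), while urllib.parse.unquote applies plain percent-decoding. This is not the Erlang code path. The ALLOWED pattern is an assumption built from the character list in the error message quoted in this thread, not CouchDB's actual validation.

```python
import re
from urllib.parse import unquote, unquote_plus

# Character set copied from the quoted error message; the pattern itself
# is an illustration, not CouchDB's real name check.
ALLOWED = re.compile(r"[a-z0-9_$()+/-]+")


def name_allowed(name):
    return ALLOWED.fullmatch(name) is not None


raw_path = "aaa+bbb"

query_style = unquote_plus(raw_path)  # "+" is turned into a space
path_style = unquote(raw_path)        # "+" is left untouched

print(repr(query_style), name_allowed(query_style))
# prints 'aaa bbb' False
print(repr(path_style), name_allowed(path_style))
# prints 'aaa+bbb' True

# The escaped form survives query-style decoding, which is why the
# %2b request succeeds.
print(repr(unquote_plus("aaa%2bbbb")))
# prints 'aaa+bbb'
```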
Re: literal '+' in URL when creating a database
Posted by Jens Alfke <je...@couchbase.com>.
On Oct 29, 2012, at 1:26 AM, Sam Stainsby <sa...@sustainablesoftware.com.au> wrote:
> How couch encodes that as a file name in an OS would be internal to
> couch, so if couch is using query string encoding for the file name, that
> may be a good choice for OS portability. However, my understanding is
> that '+' representing a space in a URL is only valid for the *query* part
> of a URL.
Agreed — it should not be necessary to URL-encode “+” signs in the path portion of a URL. Your URL refers to the database named “aaa+bbb”, not “aaa bbb”, so the request should have succeeded. This sounds like a bug in CouchDB.
—Jens
Re: literal '+' in URL when creating a database
Posted by Sam Stainsby <sa...@sustainablesoftware.com.au>.
On Mon, 29 Oct 2012 08:23:09 +0100, Benoit Chesneau wrote:
> On Mon, Oct 29, 2012 at 8:05 AM, Sam Stainsby
>> I understand that '+' has special significance in the query part of a
>> URL, but not the path part, so I think the above should work.
> '+' would mean space on the file system if I recall correctly, which
> could be problematic on some platforms.
Hi Benoit,
How couch encodes that as a file name in an OS would be internal to
couch, so if couch is using query string encoding for the file name, that
may be a good choice for OS portability. However, my understanding is
that '+' representing a space in a URL is only valid for the *query* part
of a URL.
"Within the query string, the plus sign is reserved as shorthand notation
for a space. Therefore, real plus signs must be encoded. This method was
used to make query URIs easier to pass in systems which did not allow
spaces." (http://www.w3.org/Addressing/URL/4_URI_Recommentations.html)
"For HTTP URLs, a space in a path fragment part has to be encoded to
"%20" (not, absolutely not "+"), while the "+" character in the path
fragment part can be left unencoded."
http://www.lunatech-research.com/archives/2009/02/03/what-every-web-developer-must-know-about-url-encoding
Cheers,
Sam.
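The two rules quoted above can be seen side by side in Python's standard library, where urllib.parse.quote follows the path convention and urllib.parse.quote_plus follows the query-string (form) convention. A small sketch; the sample string "a b+c" is made up for illustration:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

name = "a b+c"

# Path rules: space becomes %20, and "+" may stay literal.
as_path = quote(name, safe="+")
print(as_path)
# prints a%20b+c

# Query-string rules: space becomes "+", so a real "+" must be escaped.
as_query = quote_plus(name)
print(as_query)
# prints a+b%2Bc

# Each form round-trips only through its matching decoder.
print(unquote(as_path) == name, unquote_plus(as_query) == name)
# prints True True

# Decoding a path with the query-style decoder corrupts the "+".
print(repr(unquote_plus(as_path)))
# prints 'a b c'
```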
Re: literal '+' in URL when creating a database
Posted by Benoit Chesneau <bc...@gmail.com>.
On Mon, Oct 29, 2012 at 8:05 AM, Sam Stainsby
<sa...@sustainablesoftware.com.au> wrote:
> Hi all,
>
> Shouldn't this succeed (assuming appropriate permissions):
>
> curl -X PUT 'http://localhost:5984/aaa+bbb'
>
> Instead, I get the "Only lowercase characters (a-z), digits (0-9), and
> any of the characters _, $, (, ), +, -, and / are allowed ..." error.
>
> I understand that '+' has special significance in the query part of a
> URL, but not the path part, so I think the above should work. I've found
> with the latest Dispatch library (0.9.3) that dispatch doesn't encode the
> '+', which from what I've read since seems to still be a legal URL. On
> the other hand, couch seems to require it to be encoded, so the
> following *does* succeed:
>
> curl -X PUT 'http://localhost:5984/aaa%2bbbb'
>
> resulting in a database named 'aaa+bbb'.
>
> I've checked (with wireshark) that the first query does indeed send the
> literal '+' character: PUT /aaa+bbb ...
>
> Cheers,
> Sam Stainsby.
>
'+' would mean space on the file system if I recall correctly, which
could be problematic on some platforms.
- benoit