Posted to dev@hc.apache.org by bu...@apache.org on 2006/02/09 12:17:45 UTC

DO NOT REPLY [Bug 38588] New: - Documentation on problematic URL Character Sets might be in error

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=38588>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=38588

           Summary: Documentation on problematic URL Character Sets might be
                    in error
           Product: HttpClient
           Version: 3.0 Final
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: HttpCommon
        AssignedTo: httpclient-dev@jakarta.apache.org
        ReportedBy: apache@stolsvik.com


The page "docs/charencodings.html" states that only ASCII characters may be
used in a URL, and that for non-ASCII characters there is "no way to reliably
encode them".

However, newer RFCs have been published since the one cited, RFC 1738.

Here are some relevant links:
  http://www.w3.org/International/O-URL-code.html
  http://www.ietf.org/rfc/rfc2396.txt
    (read chap 2.1)
  http://www.ietf.org/rfc/rfc2718.txt
    (read chap 2.2.5)

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org


DO NOT REPLY [Bug 38588] - Documentation on problematic URL Character Sets might be in error

Posted by bu...@apache.org.

------- Additional Comments From apache@stolsvik.com  2006-02-09 13:00 -------
Re: "What the docs want to say is merely that it is not possible to URI-encode
non-ASCII characters in an unambiguous way."

I agree with this. However, unless I use HttpClient as a general-purpose,
end-user-operated browser, which I hardly think is the most common use case, I
will know which servers I talk to and can adjust my use of HttpClient
accordingly. In many use cases for HttpClient the user even has control over
both endpoints.
  In these scenarios there is in fact a defined and specified, consistent way
to send full Unicode over URLs. I therefore disagree with the rather strong
wording in the doc, as most new servers (notably both of Apache's, I believe?)
will employ this UTF-8 assumption.
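
To illustrate the UTF-8 convention when both endpoints are known to follow it,
here is a minimal sketch using only the JDK's java.net.URI (not HttpClient
itself). The class name, the host example.org, and the sample path and query
are made up for illustration. It relies on the documented behaviour of
java.net.URI: toASCIIString() escapes non-ASCII characters as UTF-8 octets,
and getPath()/getQuery() decode escaped octets as UTF-8.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class Utf8UriDemo {

    // Builds a URI for the hypothetical host example.org and returns its
    // ASCII-only form, with non-ASCII characters percent-encoded as UTF-8.
    static String toAscii(String path, String query) {
        try {
            return new URI("http", "example.org", path, query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "/s\u00f8k" and "q=bl\u00e5b\u00e6r" contain non-ASCII letters.
        String wire = toAscii("/s\u00f8k", "q=bl\u00e5b\u00e6r");
        System.out.println(wire);

        // A receiver that also assumes UTF-8 recovers the original text.
        URI parsed = URI.create(wire);
        System.out.println(parsed.getPath().equals("/s\u00f8k"));
        System.out.println(parsed.getQuery().equals("q=bl\u00e5b\u00e6r"));
    }
}
```

This only round-trips because sender and receiver share the UTF-8 assumption;
it says nothing about servers that expect a different encoding.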

However, before looking into this more closely for this ER, I really
thought that it was common for newer browsers to encode the URL parameters with
the same encoding they use for the body part, and that this in turn would
typically be determined by the encoding the server sent in its response (the
browser would thus in effect adjust to the server, where the server picks one
of the browser's accepted encodings). But the specs I mentioned basically say
"use UTF-8 always".



DO NOT REPLY [Bug 38588] - Documentation on problematic URL Character Sets might be in error

Posted by bu...@apache.org.

------- Additional Comments From odi@odi.ch  2006-02-09 12:32 -------
Endre,

What the docs want to say is merely that it is not possible to URI-encode
non-ASCII characters in an unambiguous way. Server and client must always agree on
a common character encoding. Arguably this has been "defined" in later RFCs as
UTF-8. But this definition is useless in the general case. If I have no
information about the internal workings of an HTTP server, I have no way to find
out which encoding it expects.
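
The ambiguity can be shown with a small sketch, assuming only the JDK's
URLEncoder and URLDecoder (which implement form-style encoding, so a space
becomes "+"). The class and method names are made up for illustration. The
same text yields different escapes depending on the charset, and a receiver
that guesses the wrong charset silently gets different characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlCharsetAmbiguity {

    // Percent-encodes value using the named charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decodes percent-escapes, interpreting the octets in the named charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        String asLatin1 = encode(original, "ISO-8859-1");
        String asUtf8 = encode(original, "UTF-8");
        System.out.println(asLatin1); // caf%E9
        System.out.println(asUtf8);   // caf%C3%A9

        // The client sent UTF-8, but the server assumes ISO-8859-1:
        // the two octets C3 A9 are read as two separate characters.
        String misread = decode(asUtf8, "ISO-8859-1");
        System.out.println(misread.equals(original)); // false
    }
}
```

Nothing in the escaped URL itself indicates which charset was used, which is
why client and server have to agree on one out of band.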

Our docs are really not very clear and are actually a bit wrong. Feel free to
submit a patch (against the XML file in xdocs).



DO NOT REPLY [Bug 38588] - Documentation on problematic URL Character Sets might be in error

Posted by bu...@apache.org.

olegk@apache.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
   Target Milestone|---                         |3.1 Final




------- Additional Comments From olegk@apache.org  2006-02-09 23:34 -------
We will happily accept a patch against SVN trunk, if you care to contribute one.

Oleg
