Posted to httpclient-users@hc.apache.org by yonik <ys...@gmail.com> on 2007/12/22 00:26:20 UTC

application/x-www-form-urlencoded standard

Using HttpClient 3.1, it looks like PostMethod parameters are encoded in the
body using the URI convention of percent-encoding the UTF-8 bytes.

The standard that specifies URI encoding is
http://www.ietf.org/rfc/rfc3986.txt
Does anyone have a pointer that specifies that
application/x-www-form-urlencoded should be handled in the same manner?

As an example, given a parameter of "q","h\u00e9llo"
HttpClient will generate a POST body with q=h%C3%A9llo
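
For reference, the same body can be reproduced without HttpClient. This is
only a minimal sketch using the JDK's java.net.URLEncoder, not HttpClient's
own code path; the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of HttpClient.
public class FormBodyUtf8Example {

    // Converts the value to bytes in the given charset, then percent-encodes them.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String body = "q=" + encode("h\u00e9llo", "UTF-8");
        System.out.println(body); // prints q=h%C3%A9llo
    }
}
```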

-Yonik
-- 
View this message in context: http://www.nabble.com/application-x-www-form-urlencoded-standard-tp14464212p14464212.html
Sent from the HttpClient-User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Re: application/x-www-form-urlencoded standard

Posted by yonik <ys...@gmail.com>.

Roland Weber wrote:
> 
> Hello Yonik,
> 
>> The standard that specifies URI encoding is
>> http://www.ietf.org/rfc/rfc3986.txt
>> Does anyone have a pointer that specifies that
>> application/x-www-form-urlencoded should be handled in the same manner?
> 
> http://www.w3.org/TR/html401/interact/forms.html#form-content-type
> 

Thanks... that standard doesn't specifically address how to handle Unicode
(and references the older http://www.ietf.org/rfc/rfc1738.txt which also
doesn't handle it).
One *could* make the logical leap that, since rfc3986 updates rfc1738, the
two-step encoding in section 2.5 of rfc3986 should now apply (encode using
UTF-8 first, then percent encode those individual octets).

However, one could also make the case that that hack only applies to URIs,
since a URI has no place to specify a character encoding.  Since we *can*
specify character sets for a POST body, it could also make sense to simply
leave \u00e9 alone (encode it per the declared charset).
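
To make the two readings concrete, here is a rough sketch that percent-encodes
the same value once via UTF-8 and once via ISO-8859-1. The choice of
ISO-8859-1 as the "declared charset" is just an assumption for the example, and
the class is hypothetical; it uses the JDK's URLEncoder, not HttpClient.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical comparison class; ISO-8859-1 is an assumed declared charset.
public class FormBodyCharsetComparison {

    // Percent-encodes the bytes of the value in the given charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String value = "h\u00e9llo";
        // Reading 1: always UTF-8 first, as in section 2.5 of rfc3986.
        System.out.println("q=" + encode(value, "UTF-8"));      // prints q=h%C3%A9llo
        // Reading 2: use the charset declared for the POST body.
        System.out.println("q=" + encode(value, "ISO-8859-1")); // prints q=h%E9llo
    }
}
```

A server that guesses the wrong reading decodes a different string, which is
why it matters that the charset is either fixed or declared.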

It would be nice if the standards were actually updated to spell it out.

-Yonik

from http://www.w3.org/TR/html401/interact/forms.html#form-content-type
'''
application/x-www-form-urlencoded  

This is the default content type. Forms submitted with this content type
must be encoded as follows:

   1. Control names and values are escaped. Space characters are replaced by
`+', and then reserved characters are escaped as described in [RFC1738],
section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent
sign and two hexadecimal digits representing the ASCII code of the
character. Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
   2. The control names/values are listed in the order they appear in the
document. The name is separated from the value by `=' and name/value pairs
are separated from each other by `&'.
'''
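
The two rules quoted above can be sketched roughly as follows. This is an
illustration only: it assumes UTF-8 for the byte conversion (which the quoted
text does not specify), relies on the JDK's URLEncoder for the escaping, and
the class and method names are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the quoted HTML 4.01 rules; UTF-8 is an assumption.
public class Html401FormRules {

    // Rule 1: escape a name or value (space becomes '+', the rest '%HH').
    static String escape(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Rule 2: list pairs in order, '=' between name and value, '&' between pairs.
    static String encodeForm(Map<String, String> controls) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : controls.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(escape(e.getKey())).append('=').append(escape(e.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> controls = new LinkedHashMap<>(); // keeps insertion order
        controls.put("name", "a b");
        controls.put("note", "x&y=z");
        controls.put("text", "one\r\ntwo");
        System.out.println(encodeForm(controls));
        // prints name=a+b&note=x%26y%3Dz&text=one%0D%0Atwo
    }
}
```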






Re: application/x-www-form-urlencoded standard

Posted by Roland Weber <os...@dubioso.net>.
Hello Yonik,

> The standard that specifies URI encoding is
> http://www.ietf.org/rfc/rfc3986.txt
> Does anyone have a pointer that specifies that
> application/x-www-form-urlencoded should be handled in the same manner?

http://www.w3.org/TR/html401/interact/forms.html#form-content-type

I'll add it to the references in the Application Design FAQ:
http://wiki.apache.org/jakarta-httpclient/FrequentlyAskedApplicationDesignQuestions

cheers,
  Roland
