Posted to users@tomcat.apache.org by Garret Wilson <ga...@globalmentor.com> on 2019/02/01 15:23:38 UTC

Re: distinction between resource charset and format octet decoding

Good morning, I'm just getting to the editing. I'm going to list some 
thoughts I have as I go through this, so you can verify things:

  * The servlet spec links are way out of date. I'll update them.
  * "There /is no default encoding for URIs/ specified anywhere, which
    is why there is a lot of confusion when it comes to decoding these
    values." Sheesh, this is ancient. I'll correct it as per
    https://tools.ietf.org/html/rfc3986#section-2.5 .
  * "Most of the web uses ISO-8859-1 as the default for query strings."
    Is this still true?! In light of the above, I would think it is not
    true, but I wanted to ask, as you know better about what you've seen
    "in the wild".
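
To make the question concrete, here is a minimal sketch of why the choice of charset matters when decoding percent-encoded octets. It is only an illustration, not Tomcat's actual decoding path: the class name PercentDecodeDemo and the sample value "caf%C3%A9" are made up for this example, and it assumes Java 10 or later for the URLDecoder.decode(String, Charset) overload. Note that URLDecoder follows form-encoding rules, so it also turns "+" into a space.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes the same percent-encoded octets
// under two different charsets to show how the result differs.
public class PercentDecodeDemo {

    // Percent-decodes the input, then interprets the resulting octets
    // in the given charset (form-encoding rules: '+' becomes a space).
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        // %C3 %A9 are the two UTF-8 octets for U+00E9 (e with acute accent).
        String encoded = "caf%C3%A9";

        // Read as UTF-8, the two octets form one character: "caf\u00e9".
        String asUtf8 = decode(encoded, StandardCharsets.UTF_8);

        // Read as ISO-8859-1, each octet is its own character:
        // "caf\u00c3\u00a9", the classic mojibake.
        String asLatin1 = decode(encoded, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.length());   // 4 characters
        System.out.println(asLatin1.length()); // 5 characters
    }
}
```

The same octets yield a four-character string under UTF-8 and a five-character string under ISO-8859-1, which is exactly the ambiguity the old wording was describing.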

Garret


Re: distinction between resource charset and format octet decoding

Posted by Garret Wilson <ga...@globalmentor.com>.
On 2/1/2019 9:38 AM, Christopher Schultz wrote:
>> Amazing. A close reading of RFC 3986 reveals that there is no
>> clear mandate for UTF-8 in existing URI schemes, even though
>> recommended for new schemes. Anyway, everyone seems to have settled
>> on UTF-8 (Tomcat included), so I'll try to indicate that.
> Wait... are you saying that _it's the Wild West out there?_ ;)
>
> Yes. The web is indeed held together with duct-tape and baling wire.
> It's amazing that it works as well as it does.


Hahaha. I'm /so/ happy someone agrees with me! Here's to improving 
things with a little JB Weld once in a while. (That's what my 
grandparents used on the farm when the baling wire and duct tape 
couldn't handle it.)

Garret


Re: distinction between resource charset and format octet decoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Garret,

On 2/1/19 11:08, Garret Wilson wrote:
> On 2/1/2019 7:23 AM, Garret Wilson wrote:
>> … * "There /is no default encoding for URIs/ specified anywhere,
>> which is why there is a lot of confusion when it comes to
>> decoding these values." Sheesh, this is ancient. I'll correct
>> it as per https://tools.ietf.org/html/rfc3986#section-2.5 .
> 
> 
> Amazing. A close reading of RFC 3986 reveals that there is no
> clear mandate for UTF-8 in existing URI schemes, even though
> recommended for new schemes. Anyway, everyone seems to have settled
> on UTF-8 (Tomcat included), so I'll try to indicate that.

Wait... are you saying that _it's the Wild West out there?_ ;)

Yes. The web is indeed held together with duct-tape and baling wire.
It's amazing that it works as well as it does.

-chris

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: distinction between resource charset and format octet decoding

Posted by Garret Wilson <ga...@globalmentor.com>.
On 2/1/2019 7:23 AM, Garret Wilson wrote:
> …
>  * "There /is no default encoding for URIs/ specified anywhere, which
>    is why there is a lot of confusion when it comes to decoding these
>    values." Sheesh, this is ancient. I'll correct it as per
>    https://tools.ietf.org/html/rfc3986#section-2.5 .


Amazing. A close reading of RFC 3986 reveals that there is no clear 
mandate for UTF-8 in existing URI schemes, even though recommended for 
new schemes. Anyway, everyone seems to have settled on UTF-8 (Tomcat 
included), so I'll try to indicate that.

Garret

