Posted to dev@hc.apache.org by "Michael Osipov (Jira)" <ji...@apache.org> on 2022/11/08 12:56:00 UTC

[jira] [Comment Edited] (HTTPCLIENT-2244) default response encoding is US.ASCII but it should be ISO-8859-1

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630400#comment-17630400 ] 

Michael Osipov edited comment on HTTPCLIENT-2244 at 11/8/22 12:55 PM:
----------------------------------------------------------------------

Why not UTF-8, then? It is the default text encoding by now.


was (Author: michael-o):
Why then not UTF-8? Which is the default text encoding by now?

> default response encoding is US.ASCII but it should be ISO-8859-1
> -----------------------------------------------------------------
>
>                 Key: HTTPCLIENT-2244
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2244
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (async)
>    Affects Versions: 5.1.3
>            Reporter: Johan Compagner
>            Priority: Major
>
> Here (and in getBodyBytes() just above it):
> [httpcomponents-client/SimpleBody.java at master · apache/httpcomponents-client (github.com)|https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/main/java/org/apache/hc/client5/http/async/methods/SimpleBody.java#L86]
>  
> You can see that the default charset used to read a body is StandardCharsets.US_ASCII. That is not correct; it should be StandardCharsets.ISO_8859_1.
>  
> This was the behavior in HC 4.x, as also described in the docs: [https://hc.apache.org/httpclient-legacy/charencodings.html]
>  
> I know the server should specify the encoding, because interpretations differ, but we don't always control the server or what it does. According to the spec:
> [https://www.w3.org/International/articles/http-charset/index]
> the default should be ISO_8859_1.
> Clients of ours that use this now suddenly see that German umlauts are not transferred correctly, whereas with HC4 they worked fine.
>  
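To make the reported symptom concrete, here is a minimal sketch using only JDK classes. It does not call HttpClient at all; the class DefaultCharsetDemo and the helper decode() are hypothetical names introduced for illustration, and the sketch assumes the server sent ISO-8859-1 bytes without declaring a charset. It only shows what happens to such bytes when the fallback charset is US-ASCII versus ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {

    // Hypothetical helper (not HttpClient API): decode a body with the
    // declared charset, or with the fallback when none was declared.
    static String decode(byte[] body, Charset declared, Charset fallback) {
        return new String(body, declared != null ? declared : fallback);
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written with escapes so the
        // source file encoding does not matter.
        String original = "Gr\u00fc\u00dfe";

        // Assumption: the server encodes as ISO-8859-1 and sends no charset.
        byte[] body = original.getBytes(StandardCharsets.ISO_8859_1);

        String ascii = decode(body, null, StandardCharsets.US_ASCII);
        String latin1 = decode(body, null, StandardCharsets.ISO_8859_1);

        // Bytes above 0x7F are not valid US-ASCII, so the JDK substitutes
        // the replacement character U+FFFD for each of them.
        System.out.println("US-ASCII round-trips:   " + original.equals(ascii));
        System.out.println("ISO-8859-1 round-trips: " + original.equals(latin1));
    }
}
```

With a US-ASCII fallback, the two non-ASCII bytes come back as replacement characters, which would match the broken umlauts described above. With an ISO-8859-1 fallback, every byte maps to exactly one character and the text survives. A charset declared by the server would take precedence over either fallback in this sketch.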



