Posted to httpclient-users@hc.apache.org by Joan Balaguero <jo...@grupoventus.com> on 2012/01/12 14:15:00 UTC
Encoding issue
Hello Oleg,
I'm having a strange issue with encoding using HttpClient 4.0.1.
I'm sending a string to a servlet using HttpClient 4.0.1. The servlet just
sends the received string back to the client unchanged.
This is the code that sends the string to the servlet and receives
the response:
( . . . )
String ENCODING = "ISO-8859-1";
String str = "ÑÑÑ_ÁÁÁ";
objPost = new HttpPost(url);
StringEntity strEntity = new StringEntity(str);
strEntity.setContentType("text/plain; charset=" + ENCODING);
objPost.setEntity(strEntity);
HttpEntity entity = null;
try
{
    entity = objHttp.execute(objPost).getEntity();
    // Read the response.
    BufferedInputStream bis = new BufferedInputStream(entity.getContent());
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] buffer = new byte[4098];
    int numBytes;
    while ((numBytes = bis.read(buffer)) >= 0) bos.write(buffer, 0, numBytes);
    System.out.println("RESPONSE = " + new String(bos.toString(ENCODING)));
( . . . )
And this is the servlet code that copies the request input stream
directly to the response output stream:
( . . . )
BufferedOutputStream bos = null;
try
{
    response.setContentType(request.getContentType());
    bos = new BufferedOutputStream(response.getOutputStream());
    InputStream in = request.getInputStream();
    byte[] tmp = new byte[4098];
    int numBytesRead = 0;
    while ((numBytesRead = in.read(tmp, 0, 4098)) >= 0) bos.write(tmp, 0, numBytesRead);
( . . . )
If the ENCODING variable is "ISO-8859-1", the response is received correctly:
RESPONSE = ÑÑÑ_ÁÁÁ
But if the ENCODING variable is "UTF-8", the response comes back garbled:
RESPONSE = "???_???"
In this case, if I force ISO-8859-1 decoding in the line:
System.out.println("RESPONSE = " + new String(bos.toString("ISO-8859-1")));
then it works: RESPONSE = ÑÑÑ_ÁÁÁ
Could this be an issue in StringEntity? Or am I missing something?
Thanks in advance,
Joan.
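The client code above decodes the response with the ENCODING constant. Since the servlet echoes the request's Content-Type, the charset could instead be read from that header value. The following is a minimal sketch of that idea using only the JDK; the class and method names (ContentTypeCharset, charsetOf) are hypothetical and are not part of HttpClient, and the parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not an HttpClient class.
class ContentTypeCharset {

    // Returns the charset named in a Content-Type value such as
    // "text/plain; charset=UTF-8", or the fallback when none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional double quotes around the value.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the fallback.
                return fallback;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset cs = charsetOf("text/plain; charset=UTF-8", StandardCharsets.ISO_8859_1);
        System.out.println("Decoding with " + cs.name());
    }
}
```

Reading the label this way only helps when the label matches the bytes that were actually sent.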
RE: Encoding issue
Posted by Joan Balaguero <jo...@grupoventus.com>.
Hi Oleg,
Thanks a lot. I hadn't seen this constructor. Problem solved.
Joan.
-----Original message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org]
Sent: Thursday, 12 January 2012 21:04
To: HttpClient User Discussion
Subject: RE: Encoding issue
---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org
RE: Encoding issue
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2012-01-12 at 20:57 +0100, Joan Balaguero wrote:
> Hi Oleg,
>
> > StringEntity strEntity = new StringEntity(str);
> >
> This constructor assumes default charset encoding for HTTP content which is ISO-8859-1.
>
> > strEntity.setContentType("text/plain; charset=" + ENCODING);
> >
> This basically causes the content to be decoded incorrectly as long as ENCODING is not ISO-8859-1.
>
>
> Then, what is my option? Something like:
>
> ByteArrayEntity bae = new ByteArrayEntity(str.getBytes(ENCODING));
> objPost.setEntity(bae);
>
>
Why don't you simply use the StringEntity#StringEntity(String, String)
constructor?
> Another question: if the StringEntity constructor assumes ISO encoding for the String, what's the utility of 'entity.setContentType'?
To be able to specify a Content-Type with custom attributes besides
'charset'.
Oleg
> I expected to find an empty constructor to do something like:
> StringEntity strEntity = new StringEntity();
> strEntity.setContentType("text/plain; charset=" + ENCODING);
> strEntity.setContent(str);
>
>
> Thanks,
> Joan.
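Oleg's suggestion refers to the two-argument constructor, for example new StringEntity(str, "UTF-8"), which encodes the string with the named charset rather than the default. The following is only a JDK-based sketch of the idea behind it: both the body bytes and the charset label are derived from a single value, so they cannot disagree. TextBody is a hypothetical stand-in for illustration and is not an HttpClient class.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a text entity; not part of HttpClient.
final class TextBody {
    private final byte[] content;
    private final String contentType;

    // The same charset is used to encode the text and to build the label.
    TextBody(String text, Charset charset) {
        this.content = text.getBytes(charset);
        this.contentType = "text/plain; charset=" + charset.name();
    }

    byte[] content() {
        return content.clone();
    }

    String contentType() {
        return contentType;
    }
}

class TextBodyDemo {
    public static void main(String[] args) {
        // "ÑÑÑ_ÁÁÁ" written with escapes so the source file encoding does not matter.
        String str = "\u00D1\u00D1\u00D1_\u00C1\u00C1\u00C1";

        TextBody body = new TextBody(str, StandardCharsets.UTF_8);
        System.out.println(body.contentType());

        // A receiver that trusts the label recovers the original text.
        String decoded = new String(body.content(), StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + decoded.equals(str));
    }
}
```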
RE: Encoding issue
Posted by Joan Balaguero <jo...@grupoventus.com>.
Hi Oleg,
> > StringEntity strEntity = new StringEntity(str);
> >
> This constructor assumes default charset encoding for HTTP content which is ISO-8859-1.
> > strEntity.setContentType("text/plain; charset=" + ENCODING);
> >
> This basically causes the content to be decoded incorrectly as long as ENCODING is not ISO-8859-1.
Then, what is my option? Something like:
ByteArrayEntity bae = new ByteArrayEntity(str.getBytes(ENCODING));
objPost.setEntity(bae);
Another question: if the StringEntity constructor assumes ISO encoding for the String, what's the utility of 'entity.setContentType'?
I expected to find an empty constructor to do something like:
StringEntity strEntity = new StringEntity();
strEntity.setContentType("text/plain; charset=" + ENCODING);
strEntity.setContent(str);
Thanks,
Joan.
-----Original message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org]
Sent: Thursday, 12 January 2012 16:52
To: HttpClient User Discussion
Subject: Re: Encoding issue
Re: Encoding issue
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2012-01-12 at 14:15 +0100, Joan Balaguero wrote:
> Hello Oleg,
>
> I'm having a strange issue with encoding using HttpClient 4.0.1.
>
> I'm sending a string to a servlet using HttpClient 4.0.1. The servlet just
> sends back to the client exactly the same received string:
>
> This is the piece of code that sends the string to the servlet and receives
> the response:
>
> ( . . . )
>
> String ENCODING = "ISO-8859-1";
> String str = "ÑÑÑ_ÁÁÁ";
>
> objPost = new HttpPost(url);
>
> StringEntity strEntity = new StringEntity(str);
>
This constructor assumes the default charset encoding for HTTP content, which
is ISO-8859-1.
> strEntity.setContentType("text/plain; charset=" + ENCODING);
>
This basically causes the content to be decoded incorrectly whenever
ENCODING is not ISO-8859-1.
Oleg
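The mismatch described above can be reproduced without HttpClient. This is a minimal sketch using only the JDK, assuming the body bytes are produced with ISO-8859-1 while the receiver decodes them with whatever charset the header declares; CharsetMismatchDemo and roundTrip are hypothetical names chosen for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it models the encode/decode steps only.
class CharsetMismatchDemo {

    // Encodes text with one charset and decodes the bytes with another,
    // as happens when the body and the declared charset disagree.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] wire = text.getBytes(encodeWith);
        return new String(wire, decodeWith);
    }

    public static void main(String[] args) {
        // "ÑÑÑ_ÁÁÁ" written with escapes so the source file encoding does not matter.
        String original = "\u00D1\u00D1\u00D1_\u00C1\u00C1\u00C1";

        // Bytes and label agree: the text survives.
        String matched = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        // Bytes are ISO-8859-1 but the receiver decodes as UTF-8: the text is damaged.
        String mismatched = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("matched equals original:    " + matched.equals(original));
        System.out.println("mismatched equals original: " + mismatched.equals(original));
    }
}
```

This also suggests why forcing ISO-8859-1 decoding on the client appeared to work: the bytes on the wire were ISO-8859-1 all along, regardless of the declared charset.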