Posted to dev@hc.apache.org by B K <gr...@hotmail.com> on 2006/08/24 16:55:09 UTC

GZIP Response

I am using HTTPClient to connect to a web site which returns the response in 
GZip compressed format. I was previously just using

new GZIPInputStream(method.getResponseBodyAsStream())

to return the inputstream. I now have a requirement to determine how large
the compressed response is (to determine bandwidth usage). As the server is
not returning Content-Length, I thought I would do the following:

create a string buffer
create a buffered reader for getResponseBodyAsStream()
read the response and append to my string buffer

Now at this point I have a stringbuffer with the compressed content, I can 
use this to get the response length.

Then I tried a

return new GZIPInputStream(new 
ByteArrayInputStream(sb.toString().getBytes()));

It falls over with the following

java.io.IOException: Corrupt GZIP trailer
	at java.util.zip.GZIPInputStream.readTrailer(Unknown Source)
	at java.util.zip.GZIPInputStream.read(Unknown Source)
	at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(Unknown Source)
	at sun.nio.cs.StreamDecoder$CharsetSD.implRead(Unknown Source)
	at sun.nio.cs.StreamDecoder.read(Unknown Source)

Does anyone have any idea what the issue might be? Thanks, BK



---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org


Re: GZIP Response

Posted by Roland Weber <ht...@dubioso.net>.
Hi B K,

> to return the inputstream. I now have a requirement to determine how
> large the compressed response is (to determine bandwidth usage). As the
> server is not returning Content-Length, I thought I would do the following:
> 
> create a string buffer
> create a buffered reader for getResponseBodyAsStream()
> read the response and append to my string buffer

As Robert pointed out, you have issues with character vs binary data
and you are clogging memory with this approach. Instead of just wrapping
the GZIPInputStream, you can implement your own wrapper stream that counts
the number of bytes being read, then wrap the GZIPInputStream on top of
that. Use java.io.FilterInputStream as a starting point and override all
read() and skip() methods to count the number of bytes.
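
A minimal sketch of such a counting wrapper, for illustration only. The
class name CountingInputStream and the getCount() accessor are hypothetical
and not part of HttpClient; only java.io.FilterInputStream is a real API
here. The single-argument read(byte[]) is not overridden because
FilterInputStream already routes it through read(byte[], int, int).

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: counts the raw (still compressed) bytes that are
// pulled from the wrapped stream.
class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte consumed
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n; // n bytes consumed; -1 means end of stream
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped; // skipped bytes still came over the wire
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would make the count ambiguous
    }

    long getCount() {
        return count;
    }
}
```

Assumed usage: wrap the response body first, then put the GZIPInputStream
on top, i.e. new GZIPInputStream(new CountingInputStream(
method.getResponseBodyAsStream())). Because GZIPInputStream reads ahead
into its own buffer, the count should only be trusted once the body has
been read to the end. It covers the compressed body only, not headers or
any transfer-encoding overhead.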

hope that helps,
  Roland



Re: GZIP Response

Posted by Robert Olofsson <ro...@khelekore.org>.
B K wrote:
> As the response body is wrapped by GZIPInputStream I don't believe I
> would be able to do this for a true indication of bw, please correct me
> if I'm wrong.

See what Roland wrote about a filtered stream that counts the bytes in
all read and skip methods.

>> Bad idea!
>> You try to mix binary data with character data without any care for 
>> character encodings. What method do you use to append data to the buffer?
> I gave this a go, with the understanding it may require a permanent
> solution later, so I coded the following to see if it works

> ByteArrayOutputStream baos = new ByteArrayOutputStream();
> BufferedReader br = new BufferedReader(new  
> InputStreamReader(method.getResponseBodyAsStream()));

Streams are for binary data, Readers are for character data.
You are still mixing the two.

If you use an InputStreamReader then specify the character encoding
you want for the conversion of bytes to characters.

> String line = br.readLine();

Reading lines means that you are reading characters.

> ...However I get exactly the same result, I thought this is what you meant 
> to try? If not please excuse my ignorance.

You really have to read up on what streams and readers are for.

Using a FilterInputStream as Roland suggests is the right way, but
if you do not want that you can try something like this:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
InputStream is = method.getResponseBodyAsStream();
byte[] buf = new byte[1024];
int read = 0;
while ((read = is.read(buf)) > -1)
     baos.write(buf, 0, read);

/* use baos.toByteArray() as the gzip input */

I do not think that you need to use a BufferedInputStream, since
the code already does read(byte[]), but I am not sure. Add a
BufferedInputStream if you need it.
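
The loop above can be rounded out into a small self-contained sketch. The
class RawBodyBuffer and its method names are made up for illustration. The
idea is that the length of the returned array is the compressed size, and
the same bytes can then be handed to GZIPInputStream unchanged. As noted
elsewhere in this thread, this holds the whole body in memory, so it is
only reasonable when the response size is bounded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: buffers a binary stream without ever going
// through a Reader or a String.
class RawBodyBuffer {

    // Copies every byte of the stream into memory, exactly as received.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int read;
        while ((read = in.read(buf)) != -1) {
            baos.write(buf, 0, read);
        }
        return baos.toByteArray();
    }

    // Wraps the buffered compressed bytes for decompression.
    static InputStream gunzip(byte[] compressed) throws IOException {
        return new GZIPInputStream(new ByteArrayInputStream(compressed));
    }
}
```

Assumed usage: byte[] raw = RawBodyBuffer.readFully(
method.getResponseBodyAsStream()); then raw.length is the compressed body
size and RawBodyBuffer.gunzip(raw) gives the stream to return.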

/robo




Re: GZIP Response

Posted by B K <gr...@hotmail.com>.
>B K wrote:
>>to return the inputstream. I now have a requirement to determine how large
>>the compressed response is (to determine bandwidth usage). As the server is
>>not returning Content-Length, I thought I would do the following:
>
>Can you not just keep track of how much data you have read from the stream?

As the response body is wrapped by GZIPInputStream I don't believe I would
be able to do this for a true indication of bw, please correct me if I'm
wrong.

>
>>create a string buffer
>>create a buffered reader for getResponseBodyAsStream()
>>read the response and append to my string buffer
>
>Bad idea!
>You try to mix binary data with character data without any care for 
>character encodings. What method do you use to append data to the buffer?
>
>If you want to do this, then put the data into a ByteArrayOutputStream 
>instead of a StringBuffer. Then you have binary data.

I gave this a go, with the understanding it may require a permanent solution
later, so I coded the following to see if it works

ByteArrayOutputStream baos = new ByteArrayOutputStream();
BufferedReader br = new BufferedReader(new  
InputStreamReader(method.getResponseBodyAsStream()));
String line = br.readLine();
while (line != null) {
  baos.write(line.getBytes());
  line = br.readLine();
}
br.close();
method.releaseConnection();
return new GZIPInputStream(new ByteArrayInputStream(baos.toByteArray()));

However I get exactly the same result, I thought this is what you meant to 
try? If not please excuse my ignorance.



>
>>Now at this point I have a stringbuffer with the compressed content, I can 
>>use this to get the response length.
>
>No, you can get the number of characters, not the same thing.
>
>Note: since you do not know in advance how much data you will get, you risk
>getting an OOME if the server sends a DVD or something else that large.
>
>/robo
>





Re: GZIP Response

Posted by Robert Olofsson <ro...@khelekore.org>.
B K wrote:
> to return the inputstream. I now have a requirement to determine how
> large the compressed response is (to determine bandwidth usage). As the
> server is not returning Content-Length, I thought I would do the following:

Can you not just keep track of how much data you have read from the stream?

> create a string buffer
> create a buffered reader for getResponseBodyAsStream()
> read the response and append to my string buffer

Bad idea!
You try to mix binary data with character data without any care for 
character encodings. What method do you use to append data to the buffer?

If you want to do this, then put the data into a ByteArrayOutputStream 
instead of a StringBuffer. Then you have binary data.
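
To make the binary vs character point concrete, here is a small sketch that
pushes bytes through a String and back. The class and method names are
invented for illustration, and UTF-8 is an assumption: the code in the
question used the platform default charset, which may behave differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: shows whether binary data survives a trip
// through a String in a given charset.
class StringRoundTrip {

    // Decodes bytes to text, then encodes the text back to bytes.
    static byte[] throughString(byte[] data, Charset charset) {
        String asText = new String(data, charset);
        return asText.getBytes(charset);
    }

    // True only if the round trip returns exactly the original bytes.
    static boolean survives(byte[] data, Charset charset) {
        return Arrays.equals(data, throughString(data, charset));
    }
}
```

A gzip stream starts with the bytes 0x1f 0x8b, and 0x8b on its own is not
valid UTF-8, so StringRoundTrip.survives(new byte[] {0x1f, (byte) 0x8b},
StandardCharsets.UTF_8) is false: the decoder substitutes a replacement
character and the original byte is lost. Even with a charset that maps
every byte to a character, a readLine() loop would still drop the line
terminator bytes, which is enough to corrupt compressed data.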

> Now at this point I have a stringbuffer with the compressed content, I 
> can use this to get the response length.

No, you can get the number of characters, not the same thing.

Note: since you do not know in advance how much data you will get, you
risk getting an OOME if the server sends a DVD or something else that large.

/robo
