Posted to httpclient-users@hc.apache.org by Tony Thompson <To...@stone-ware.com> on 2007/04/11 03:29:53 UTC

Performance issues in ChunkedInputStream

I am having what I consider a fairly significant performance issue with
a ChunkedInputStream in the 3.0.1 client.  I have been packet tracing a
conversation using the HttpClient where the response is chunked and my
application is reading the response stream directly from the method I
executed.  On the wire, frequently during a large chunked response I see
"ZeroWindow" responses from my client to the server which would indicate
that the HttpClient is not getting the data off the wire fast enough.  I
have tried making the BufferedInputStream in HttpConnection large (128K)
and it still fills up (it just takes a little longer).
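For readers following along, the chunked wire format under discussion looks roughly like this: each chunk is preceded by its size in hex (optionally followed by a ";extension") and a CRLF, and the body ends with a zero-size chunk. The response below is purely illustrative:

```
HTTP/1.1 200 OK
Transfer-Encoding: chunked

1a;some-extension
abcdefghijklmnopqrstuvwxyz
10
0123456789abcdef
0

```

Parsing each of those size lines is what ChunkedInputStream.getChunkSizeFromInputStream() does, once per chunk.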
 
So, after doing some profiling of ChunkedInputStream I found that a huge
amount of time is spent in
ChunkedInputStream.getChunkSizeFromInputStream().  In the very short
profiling session I ran, ChunkedInputStream.read( byte[], int, int ) was
invoked 2424 times and the time spent in that method (excluding further
method calls) was 870ms.  getChunkSizeFromInputStream() was invoked 432
times and the time spent in that method was 27762ms (excluding further
method calls).  Does someone who understands that code better than I
have any idea how that can be improved?
 
One other issue I have with that code: if I interrupt the file
transfer and call method.abort(), the ChunkedInputStream appears to
still keep pulling data from the host.  Wouldn't it make more sense
to just close down that connection instead of making it sit there and
pull data that is just dumped into the bit bucket?  It could sit there
and waste bandwidth if the chunked response is large.  I am also
using the idle connection timer with the connection manager and I
frequently see connections like that abandoned: the idle timer kicks in
and just resets the socket.
 
Thanks for any help.
Tony
 
This message (and any associated files) is intended only for the 
use of the individual or entity to which it is addressed and may 
contain information that is confidential, subject to copyright or
constitutes a trade secret. If you are not the intended recipient 
you are hereby notified that any dissemination, copying or 
distribution of this message, or files associated with this message, 
is strictly prohibited. If you have received this message in error, 
please notify us immediately by replying to the message and deleting 
it from your computer. Messages sent to and from Stoneware, Inc.
may be monitored.

RE: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2007-04-11 at 14:42 +0200, Roland Weber wrote:
> Hi Oleg,
> 
> > I do not think this is the case. #exhaustInputStream() is called ONLY if
> > the connection is being released back to the connection manager.
> > HttpMethod#abort() simply calls HttpConnection#close(), which in its
> > turn just closes down the underlying network socket without trying to
> > exhaust the input stream. 
> > 
> > http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1102
> > http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpConnection.html#1214
> 
> these are links to 3.1-RC1 code, while Tony is on 3.0.1.
> Are you sure there were no improvements in that area?
> 
> cheers,
>   Roland
> 

I am _reasonably_ sure. The 3.1 branch has seen very few changes outside
the RFC2965 cookie spec and the multithreaded HTTP connection manager.
Definitely nothing remotely as severe as a fix for a completely broken
HttpMethod#abort().

Oleg


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


RE: Performance issues in ChunkedInputStream

Posted by Roland Weber <RO...@de.ibm.com>.
Hi Oleg,

> I do not think this is the case. #exhaustInputStream() is called ONLY if
> the connection is being released back to the connection manager.
> HttpMethod#abort() simply calls HttpConnection#close(), which in its
> turn just closes down the underlying network socket without trying to
> exhaust the input stream. 
> 
> http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1102
> http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpConnection.html#1214

these are links to 3.1-RC1 code, while Tony is on 3.0.1.
Are you sure there were no improvements in that area?

cheers,
  Roland


RE: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2007-04-11 at 08:02 -0400, Tony Thompson wrote:
> > I am having what I consider a fairly significant performance issue 
> > with a ChunkedInputStream in the 3.0.1 client.  I have been packet 
> > tracing a conversation using the HttpClient where the response is 
> > chunked and my application is reading the response stream directly 
> > from the method I executed.  On the wire, frequently during a large 
> > chunked response I see "ZeroWindow" responses from my client to the 
> > server which would indicate that the HttpClient is not getting the 
> > data off the wire fast enough.  I have tried making the 
> > BufferedInputStream in HttpConnection large (128K) and it still fills
> > up (it just takes a little longer).
> >  
> > So, after doing some profiling of ChunkedInputStream I found that a 
> > huge amount of time is spent in 
> > ChunkedInputStream.getChunkSizeFromInputStream().  In the very short 
> > profiling session I ran, ChunkedInputStream.read( byte[], int, int ) 
> > was invoked 2424 times and the time spent in that method (excluding 
> > further method calls) was 870ms.  getChunkSizeFromInputStream() was 
> > invoked 432 times and the time spent in that method was 27762ms 
> > (excluding further method calls).  Does someone who understands that 
> > code better than I have any idea how that can be improved?
> >  
> >
> > Tony,
> >
> > I have spent a fair amount of time and effort profiling HttpCore, a
> > set of low level HTTP transport components HttpClient 4.0 will be
> > based on. Overall HttpClient 4.0 is expected to be 20 to 40% faster
> > than HttpClient 3.x due to improvements in the core HTTP components.
> > The sad truth is we simply lack the resources to back-port those
> > changes to the HttpClient 3.x code line.
> 
> Unfortunately I need to come up with some kind of fix now.  I am
> currently using this in an environment where this is causing lots of
> issues.  So, I guess that means I have to dig into that myself.  Any
> pointers on what I might be able to do to improve that particular piece
> of the code?
> 

I ended up rewriting it almost completely for HttpClient 4.0. One of the
problems I found is that in lots of places HttpClient 3.x reads one byte
at a time from the input stream in order to be able to detect a CRLF /
LF line delimiter, which may be one of the factors contributing to the
performance issue you have been having. I simply do not see an easy fix
for this problem. 
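To make the pattern concrete, here is a rough, hypothetical sketch (not the actual 3.0.1 source) of what byte-at-a-time chunk-header parsing looks like; every byte of the "size CRLF" line costs a separate call into the stream, which in HttpClient 3.x goes through a synchronized BufferedInputStream.read():

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: roughly the byte-at-a-time pattern described above,
// not the actual ChunkedInputStream source.
public class ChunkHeaderSketch {
    static int readChunkSize(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) >= 0) {   // one stream call per byte
            if (b == '\n') break;        // LF terminates the header line
            if (b != '\r') line.append((char) b);
        }
        // The size is hex, possibly followed by a ";extension"
        String s = line.toString();
        int semi = s.indexOf(';');
        if (semi >= 0) s = s.substring(0, semi);
        return Integer.parseInt(s.trim(), 16);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("1a;ext\r\n".getBytes("US-ASCII"));
        System.out.println(readChunkSize(in)); // 0x1a == 26
    }
}
```

The cost is not the parsing itself but the per-byte method-call and synchronization overhead, which is paid once per chunk header.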

> > One other issue I have with that code is if I interrupt the file 
> > transfer and call method.abort(), that ChunkedInputStream appears to 
> > still keep pulling data from the host.  Wouldn't it just make more 
> > sense to just close down that connection instead of making it sit 
> > there and pull data that is just dumped into the bit bucket?
> >
> > This is precisely what HttpMethod#abort() does. It simply shuts down
> > the underlying connection. I have a hard time believing any data can
> > be received after the connection socket has been closed. It is
> > plausible, though, that some data may still be read from an intermediate
> > content buffer, but I find this scenario unlikely.
> 
> You may want to take a look at that in the new client.  In the 3.0.1
> client, after that stream is closed, exhaustInputStream() is called
> which attempts to finish reading the content so the connection can be
> ready for another request. 

I do not think this is the case. #exhaustInputStream() is called ONLY if
the connection is being released back to the connection manager.
HttpMethod#abort() simply calls HttpConnection#close(), which in its
turn just closes down the underlying network socket without trying to
exhaust the input stream. 

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1102
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpConnection.html#1214

Hope this helps

Oleg


>  In my case, that is a lot of content and so
> it continues on for a bit before the socket is just reset.  I am not
> continuing to read in my application but looking at a packet trace I can
> see the client is doing it for me.
> 
> Thanks
> Tony
>  


RE: Performance issues in ChunkedInputStream

Posted by Tony Thompson <To...@stone-ware.com>.
> I am having what I consider a fairly significant performance issue 
> with a ChunkedInputStream in the 3.0.1 client.  I have been packet 
> tracing a conversation using the HttpClient where the response is 
> chunked and my application is reading the response stream directly 
> from the method I executed.  On the wire, frequently during a large 
> chunked response I see "ZeroWindow" responses from my client to the 
> server which would indicate that the HttpClient is not getting the 
> data off the wire fast enough.  I have tried making the 
> BufferedInputStream in HttpConnection large (128K) and it still fills
> up (it just takes a little longer).
>  
> So, after doing some profiling of ChunkedInputStream I found that a 
> huge amount of time is spent in 
> ChunkedInputStream.getChunkSizeFromInputStream().  In the very short 
> profiling session I ran, ChunkedInputStream.read( byte[], int, int ) 
> was invoked 2424 times and the time spent in that method (excluding 
> further method calls) was 870ms.  getChunkSizeFromInputStream() was 
> invoked 432 times and the time spent in that method was 27762ms 
> (excluding further method calls).  Does someone who understands that 
> code better than I have any idea how that can be improved?
>  
>
> Tony,
>
> I have spent a fair amount of time and effort profiling HttpCore, a
> set of low level HTTP transport components HttpClient 4.0 will be
> based on. Overall HttpClient 4.0 is expected to be 20 to 40% faster
> than HttpClient 3.x due to improvements in the core HTTP components.
> The sad truth is we simply lack the resources to back-port those
> changes to the HttpClient 3.x code line.

Unfortunately I need to come up with some kind of fix now.  I am
currently using this in an environment where this is causing lots of
issues.  So, I guess that means I have to dig into that myself.  Any
pointers on what I might be able to do to improve that particular piece
of the code?

> One other issue I have with that code is if I interrupt the file 
> transfer and call method.abort(), that ChunkedInputStream appears to 
> still keep pulling data from the host.  Wouldn't it just make more 
> sense to just close down that connection instead of making it sit 
> there and pull data that is just dumped into the bit bucket?
>
> This is precisely what HttpMethod#abort() does. It simply shuts down
> the underlying connection. I have a hard time believing any data can
> be received after the connection socket has been closed. It is
> plausible, though, that some data may still be read from an intermediate
> content buffer, but I find this scenario unlikely.

You may want to take a look at that in the new client.  In the 3.0.1
client, after that stream is closed, exhaustInputStream() is called
which attempts to finish reading the content so the connection can be
ready for another request.  In my case, that is a lot of content and so
it continues on for a bit before the socket is just reset.  I am not
continuing to read in my application but looking at a packet trace I can
see the client is doing it for me.

Thanks
Tony
 


Re: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Tue, 2007-04-10 at 21:29 -0400, Tony Thompson wrote:
> I am having what I consider a fairly significant performance issue with
> a ChunkedInputStream in the 3.0.1 client.  I have been packet tracing a
> conversation using the HttpClient where the response is chunked and my
> application is reading the response stream directly from the method I
> executed.  On the wire, frequently during a large chunked response I see
> "ZeroWindow" responses from my client to the server which would indicate
> that the HttpClient is not getting the data off the wire fast enough.  I
> have tried making the BufferedInputStream in HttpConnection large (128K)
> and it still fills up (it just takes a little longer).
>  
> So, after doing some profiling of ChunkedInputStream I found that a huge
> amount of time is spent in
> ChunkedInputStream.getChunkSizeFromInputStream().  In the very short
> profiling session I ran, ChunkedInputStream.read( byte[], int, int ) was
> invoked 2424 times and the time spent in that method (excluding further
> method calls) was 870ms.  getChunkSizeFromInputStream() was invoked 432
> times and the time spent in that method was 27762ms (excluding further
> method calls).  Does someone who understands that code better than I
> have any idea how that can be improved?
>  

Tony,

I have spent a fair amount of time and effort profiling HttpCore, a set
of low level HTTP transport components HttpClient 4.0 will be based on.
Overall HttpClient 4.0 is expected to be 20 to 40% faster than
HttpClient 3.x due to improvements in the core HTTP components. The sad
truth is we simply lack the resources to back-port those changes to the
HttpClient 3.x code line.


> One other issue I have with that code is if I interrupt the file
> transfer and call method.abort(), that ChunkedInputStream appears to
> still keep pulling data from the host.  Wouldn't it just make more sense
> to just close down that connection instead of making it sit there and
> pull data that is just dumped into the bit bucket? 

This is precisely what HttpMethod#abort() does. It simply shuts down the
underlying connection. I have a hard time believing any data can be
received after the connection socket has been closed. It is plausible,
though, that some data may still be read from an intermediate content
buffer, but I find this scenario unlikely.

I hope this helps somewhat

Oleg

>  It could sit there
> and just waste bandwidth if the chunked response is large.  I am also
> using the idle connection timer with the connection manager and I
> frequently see connections like that abandoned, the idle timer kicks in
> and just resets the socket.
>  
> Thanks for any help.
> Tony
>  


RE: Performance issues in ChunkedInputStream

Posted by Igor Lubashev <Ig...@everypoint.com>.
It looks like attachments are filtered out.
Here is the code inline.

/**
 * This is not used, but it is a nice little class that I wrote for
 * Apache, and it might be useful some day.
 * NOTE: This has not been tested
 */
package com.foo.util;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

/**
 * @author Igor Lubashev
 */
public class LineReaderInputStream extends BufferedInputStream {
    private static int defaultBufferSize    = 8192;
    private static int defaultLineSizeLimit = 8192;

    private final int lineSizeLimit;

    public LineReaderInputStream(InputStream in) {
        this(in, defaultBufferSize, defaultLineSizeLimit);
    }

    public LineReaderInputStream(InputStream in, int size) {
        this(in, size, defaultLineSizeLimit);
    }

    public LineReaderInputStream(InputStream in, int size, int lineSizeLimit) {
        super(in, size);
        this.lineSizeLimit = lineSizeLimit;
    }

    public synchronized String readLine(Charset charset) throws IOException {
        int savedMarkPos   = markpos;
        int savedMarkLimit = marklimit;

        // Protect the buffered data from being discarded while we scan.
        if (savedMarkPos >= 0) {
            marklimit += lineSizeLimit;
        } else {
            markpos   = pos;
            marklimit = lineSizeLimit;
        }

        String retLine;
        int lineStart = pos;

      topLoop:
        while (true) {
            while (pos < count) {
                if (buf[pos++] == '\n') {
                    int lineLen = ((pos > 1 && buf[pos - 2] == '\r')
                            ? pos - 2 : pos - 1) - lineStart;
                    retLine = new String(buf, lineStart, lineLen, charset);
                    break topLoop;
                }
            }

            // Fill the buffer with more data
            int prevPos = pos;
            if (read() < 0) {
                retLine = null;
                break topLoop;
            }
            pos--;                      // step back so the scan loop examines the byte read() consumed
            lineStart -= prevPos - pos; // adjust for the moved buffer
        }

        // Cleanup and return
        markpos   = savedMarkPos;
        marklimit = savedMarkLimit;
        return retLine;
    }
}


-----Original Message-----
From: Igor Lubashev [mailto:IgorLubashev@everypoint.com] 
Sent: Thursday, April 12, 2007 1:41 PM
To: HttpClient User Discussion
Subject: RE: Performance issues in ChunkedInputStream

1.  BufferedInputStream is working fine.  I've looked at the source, and
it correctly tries to read data only when its internal buffer is
exhausted.  Most read calls reference only the internal buffer.  When
the data does get read from the underlying stream, it tries to read it
in large chunks.  (Of course, if the underlying stream returns very
little data, it is a different problem.)

2.  It is hard to believe that reading a byte at a time is a bottleneck,
but I've just quickly written a LineReaderInputStream, which is derived
from BufferedInputStream, so all the searching for CRLF/LF happens very
quickly internally.  The source is attached.

Just call the readLine() method, and you'll get Strings out of the stream.
You can interleave all regular stream operations and readLine() calls.
However, if you wish to use readLine() *after* using the stream's read()
methods, make sure that you do not inadvertently pass this stream to
anything that is buffering the stream's data (or your strings may get
consumed via buffering).

- Igor


>>> I looked at the source for BufferedInputStream and it looks like
>>> it tries to fill the empty space in the buffer each time you read from
>>> it (for a socket connection it will read more than one packet of data)
>>> instead of just doing a single read from the underlying stream.
>>
>> Ok, then the byte-by-byte reading in CIS when parsing the chunk header
>> might well be the problem. If you want to fix that, you'll have to hack
>> deeply into CIS. Here is what I would do if I had no other choice:
>>
>> - extend CIS by a local byte array as a buffer (needs two extra int
>>   for cursor and fill size)
>> - change the chunk header parsing to read a bunch of bytes into the
>>   buffer, then parsing from there
>> - change all read methods to return leftover bytes from the buffer
>>   before calling a read on the underlying stream
>>
>> hope that helps,
>>   Roland
>
> Tony and Roland,
>
> I suspect rather strongly it is BufferedInputStream that needs fixing,
> not ChunkedInputStream
>
> Oleg
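Roland's three bullets above might be sketched like this (hypothetical names; an untested illustration, not the HttpClient source): a local byte array with a cursor and fill size, bulk refills, a header parser that scans the buffer, and reads that drain leftover buffered bytes before touching the underlying stream again.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of Roland's outline; class and method names are made up.
public class BufferedChunkScanner {
    private final InputStream in;
    private final byte[] buffer = new byte[4096];
    private int cursor = 0;   // next unread byte in buffer
    private int fill   = 0;   // number of valid bytes in buffer

    public BufferedChunkScanner(InputStream in) { this.in = in; }

    private boolean refill() throws IOException {
        cursor = 0;
        fill = in.read(buffer, 0, buffer.length);  // one bulk read
        return fill > 0;
    }

    // Parse a CRLF/LF-terminated chunk header line from the buffer.
    public String readHeaderLine() throws IOException {
        StringBuilder line = new StringBuilder();
        while (true) {
            if (cursor >= fill && !refill()) {
                return line.length() > 0 ? line.toString() : null;
            }
            while (cursor < fill) {
                byte b = buffer[cursor++];
                if (b == '\n') {
                    int len = line.length();
                    if (len > 0 && line.charAt(len - 1) == '\r') line.setLength(len - 1);
                    return line.toString();
                }
                line.append((char) b);
            }
        }
    }

    // Body reads serve leftover buffered bytes before hitting the stream.
    public int read(byte[] dst, int off, int len) throws IOException {
        if (cursor < fill) {
            int n = Math.min(len, fill - cursor);
            System.arraycopy(buffer, cursor, dst, off, n);
            cursor += n;
            return n;
        }
        return in.read(dst, off, len);
    }

    public static void main(String[] args) throws IOException {
        BufferedChunkScanner s = new BufferedChunkScanner(
                new ByteArrayInputStream("1a\r\nhello".getBytes("US-ASCII")));
        System.out.println(s.readHeaderLine()); // prints "1a"
    }
}
```

The point of the design is that the header scan runs over a plain local array, so no per-byte calls reach the underlying (synchronized) stream.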





RE: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2007-04-13 at 09:13 -0700, Igor Lubashev wrote: 
> Oleg, I've not only done 0 benchmarking, but I've done 0 testing on this
> class.  It's just something I put together to throw into the discussion.
> I make no further claims than this.  In fact, I prefaced this post with
> "It is hard to believe that reading a byte at a time is a bottleneck".
> 
> Now, when you are saying that "most performance gains came 
> chiefly from [...] elimination of unnecessary synchronization", do you
> mean only contested synchronization points or uncontested ones, too?
> 
> If even uncontested synchronization is undesirable, then
> LineReaderInputStream.readLine() eliminates the need to repeatedly call
> BufferedInputStream.read(), which is synchronized.
> 

Igor,

Uncontested synchronization is clearly undesirable if it is completely
pointless. StringBuffer, which is used quite extensively throughout
HttpClient 3.x codebase, is synchronized for some stupid reason.
Instances of StringBuffer are usually short-lived and are accessed by a
single thread only (who in their right mind would want to concatenate
strings from multiple threads anyway?). So, synchronization (even
uncontested) on an instance of StringBuffer is a waste of CPU cycles.
Yet, there are places where StringBuffer gets filled character by
character. The same story goes for BufferedInputStream.
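As a concrete illustration of the StringBuffer point (hypothetical code, not taken from the HttpClient source): on Java 5 and later, the unsynchronized StringBuilder is a drop-in replacement for this kind of short-lived, single-threaded accumulation.

```java
// Hypothetical example of the kind of change being discussed.
public class HeaderJoin {
    // Before: every append() acquires a monitor, even though only one
    // thread ever touches the buffer.
    static String joinSynchronized(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) sb.append(p).append("; ");
        return sb.toString();
    }

    // After: same logic and same API, no per-call locking.
    static String joinUnsynchronized(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p).append("; ");
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b"};
        System.out.println(joinUnsynchronized(parts)); // prints "a; b; "
    }
}
```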

The point I am trying to make is simple. If anyone is prepared to submit
a fully tested patch that applies cleanly against the SVN trunk and
provides sufficient test coverage for the new functionality, I'll
happily check it in. I have already done all this hard work for
HttpClient 4.0, including writing test cases. I have no interest
whatsoever in repeating this work for HttpClient 3.1.

Oleg

> - Igor
> 
> 
> -----Original Message-----
> From: Oleg Kalnichevski [mailto:olegk@apache.org] 
> Sent: Thursday, April 12, 2007 6:01 PM
> To: HttpClient User Discussion
> Subject: Re: Performance issues in ChunkedInputStream
> 
> Igor Lubashev wrote:
> > 1.  BufferedInputStream is working fine.  I've looked at the source, and
> > it correctly tries to read data only when its internal buffer is
> > exhausted.  Most read calls reference only the internal buffer.  When
> > the data does get read from the underlying stream, it tries to read it
> > in large chunks.  (Of course, if the underlying stream returns very
> > little data, it is a different problem.)
> >
> > 2.  It is hard to believe that reading a byte at a time is a bottleneck,
> > but I've just quickly written a LineReaderInputStream, which is derived
> > from BufferedInputStream, so all the searching for CRLF/LF happens very
> > quickly internally.  The source is attached.
> >
> > Just call the readLine() method, and you'll get Strings out of the stream.
> > You can interleave all regular stream operations and readLine() calls.
> > However, if you wish to use readLine() *after* using the stream's read()
> > methods, make sure that you do not inadvertently pass this stream to
> > anything that is buffering the stream's data (or your strings may get
> > consumed via buffering).
> >
> > - Igor
> >
> >
> >   
> 
> Igor,
> 
> With all due respect, given the implementation of the
> BufferedInputStream#read() method in Sun's JRE (see below) I just do not
> see how LineReaderInputStream should be any faster
> 
>     public synchronized int read() throws IOException {
>         if (pos >= count) {
>             fill();
>             if (pos >= count)
>                 return -1;
>         }
>         return getBufIfOpen()[pos++] & 0xff;
>     }
> 
> Have you done any benchmarking comparing performance of HttpClient 3.x 
> with and without the patch?
> 
> I have invested a lot of effort into optimizing the low level HTTP
> components for HttpClient 4.0 [1] and most performance gains came
> chiefly from three factors: elimination of unnecessary synchronization,
> elimination of intermediate buffer copying, and reduced garbage (thus
> reduced GC time). Performance improvements due to the improved HTTP
> header parser and chunk codec were marginal at best.
> 
> Oleg
> 
> [1] http://jakarta.apache.org/httpcomponents/httpcore/index.html
> 
> 
> 
> 
> >>>> I looked at the source for BufferedInputStream and it looks like
> >>>> it tries to fill the empty space in the buffer each time you read from
> >>>> it (for a socket connection it will read more than one packet of data)
> >>>> instead of just doing a single read from the underlying stream.
> >>>
> >>> Ok, then the byte-by-byte reading in CIS when parsing the chunk header
> >>> might well be the problem. If you want to fix that, you'll have to hack
> >>> deeply into CIS. Here is what I would do if I had no other choice:
> >>>
> >>> - extend CIS by a local byte array as a buffer (needs two extra int
> >>>   for cursor and fill size)
> >>> - change the chunk header parsing to read a bunch of bytes into the
> >>>   buffer, then parsing from there
> >>> - change all read methods to return leftover bytes from the buffer
> >>>   before calling a read on the underlying stream
> >>>
> >>> hope that helps,
> >>>   Roland
> >>
> >> Tony and Roland,
> >>
> >> I suspect rather strongly it is BufferedInputStream that needs fixing,
> >> not ChunkedInputStream
> >>
> >> Oleg


RE: Performance issues in ChunkedInputStream

Posted by Igor Lubashev <Ig...@everypoint.com>.
Oleg, I've not only done 0 benchmarking, but I've done 0 testing on this
class.  It's just something I put together to throw into the discussion.
I make no further claims than this.  In fact, I prefaced this post with
"It is hard to believe that reading a byte at a time is a bottleneck".

Now, when you are saying that "most performance gains came 
chiefly from [...] elimination of unnecessary synchronization", do you
mean only contested synchronization points or uncontested ones, too?

If even uncontested synchronization is undesirable, then
LineReaderInputStream.readLine() eliminates the need to repeatedly call
BufferedInputStream.read(), which is synchronized.

- Igor


-----Original Message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org] 
Sent: Thursday, April 12, 2007 6:01 PM
To: HttpClient User Discussion
Subject: Re: Performance issues in ChunkedInputStream

Igor Lubashev wrote:
> 1.  BufferedInputStream is working fine.  I've looked at the source, and
> it correctly tries to read data only when its internal buffer is
> exhausted.  Most read calls reference only the internal buffer.  When
> the data does get read from the underlying stream, it tries to read it
> in large chunks.  (Of course, if the underlying stream returns very
> little data, it is a different problem.)
>
> 2.  It is hard to believe that reading a byte at a time is a bottleneck,
> but I've just quickly written a LineReaderInputStream, which is derived
> from BufferedInputStream, so all the searching for CRLF/LF happens very
> quickly internally.  The source is attached.
>
> Just call the readLine() method, and you'll get Strings out of the stream.
> You can interleave all regular stream operations and readLine() calls.
> However, if you wish to use readLine() *after* using the stream's read()
> methods, make sure that you do not inadvertently pass this stream to
> anything that is buffering the stream's data (or your strings may get
> consumed via buffering).
>
> - Igor
>
>
>   

Igor,

With all due respect, given the implementation of the
BufferedInputStream#read() method in Sun's JRE (see below) I just do not
see how LineReaderInputStream should be any faster

    public synchronized int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return getBufIfOpen()[pos++] & 0xff;
    }

Have you done any benchmarking comparing performance of HttpClient 3.x 
with and without the patch?

I have invested a lot of effort into optimizing the low level HTTP
components for HttpClient 4.0 [1] and most performance gains came
chiefly from three factors: elimination of unnecessary synchronization,
elimination of intermediate buffer copying, and reduced garbage (thus
reduced GC time). Performance improvements due to the improved HTTP
header parser and chunk codec were marginal at best.

Oleg

[1] http://jakarta.apache.org/httpcomponents/httpcore/index.html




>>>> I looked at the source for BufferedInputStream and it looks like
>>>> it tries to fill the empty space in the buffer each time you read
>>>>         
> from
>   
>>> it (for a socket connection it will read more than one packet of
>>>       
> data)
>   
>>>> instead of just doing a single read from the underlying stream.
>>>>     
>>>>         
>>> Ok, then the byte-by-byte reading in CIS when parsing the chunk
>>>       
> header
>   
>>> might well be the problem. If you want to fix that, you'll have to
>>>       
> hack
>   
>>> deeply into CIS. Here is what I would do if I had no other choice:
>>>
>>> - extend CIS by a local byte array as a buffer (needs two extra int
>>>   for cursor and fill size)
>>> - change the chunk header parsing to read a bunch of bytes into the
>>>   buffer, then parsing from there
>>> - change all read methods to return leftover bytes from the buffer
>>>   before calling a read on the underlying stream
>>>
>>> hope that helps,
>>>   Roland
>>>   
>>>       
>> Tony and Roland,
>>
>> I suspect rather strongly it is BufferedInputStream that needs
fixing, 
>> not ChunkedInputStream
>>
>> Oleg
>>     
>
>
>   
>
------------------------------------------------------------------------
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
httpclient-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


Re: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
Igor Lubashev wrote:
> 1.  BufferedInputStream is working fine.  I've looked at the source, and
> it correctly tries to read data only when its internal buffer is
> exhausted.  Most read calls reference only the internal buffer.  When
> the data does get read from the underlying stream, it tries to read it
> in large chunks.  (Of course, if the underlying stream returns very
> little data, it is a different problem.)
>
> 2.  It is hard to believe that reading a byte at a time is a bottleneck,
> but I've just quickly written a LineReaderInputStream, which is derived
> from BufferedInputStream, so all the searching for CRLF/LF happens very
> quickly internally.  The source is attached.
>
> Just call readLine() method, and you'll get Strings out of the stream.
> You can interleave all regular stream operations and readLine() calls.
> However, if you wish to use readLine() *after* using the stream's read()
> methods, make sure that you do not inadvertently pass this stream to
> anything that is buffering the stream's data (or your strings may get
> consumed via buffering).
>
> - Igor
>
>
>   

Igor,

With all due respect, given the implementation of the 
BufferedInputStream#read() method in Sun's JRE (see below), I just do not 
see how LineReaderInputStream could be any faster:

    public synchronized int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return getBufIfOpen()[pos++] & 0xff;
    }

Have you done any benchmarking comparing performance of HttpClient 3.x 
with and without the patch?

I have invested a lot of effort into optimizing the low-level HTTP 
components for HttpClient 4.0 [1], and most performance gains came 
chiefly from three factors: elimination of unnecessary synchronization, 
elimination of intermediate buffer copying, and reduced garbage (thus 
reduced GC time). Performance improvements due to the improved HTTP 
header parser and chunk codec were marginal at best.

Oleg

[1] http://jakarta.apache.org/httpcomponents/httpcore/index.html
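For what it's worth, Oleg's benchmarking question can be probed outside HttpClient by comparing byte-at-a-time reading against a block read with an in-array scan over the same input. This is only an illustrative harness, not a proper benchmark (no warm-up, single run), and all names in it are made up:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical harness: count CRLF-terminated lines two ways over the
// same data. Timings are only indicative.
public class LineReadBenchmark {

    // Byte-at-a-time, the way ChunkedInputStream parses chunk headers.
    static int countLinesByteAtATime(InputStream in) throws IOException {
        int lines = 0, b;
        while ((b = in.read()) >= 0) {
            if (b == '\n') lines++;
        }
        return lines;
    }

    // Block reads with an in-method scan for LF.
    static int countLinesBlockScan(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        int lines = 0, n;
        while ((n = in.read(buf)) >= 0) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') lines++;
            }
        }
        return lines;
    }

    // Synthetic chunk-header-like input lines.
    static byte[] sampleData(int lineCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lineCount; i++) {
            sb.append("1000; chunk header ").append(i).append("\r\n");
        }
        return sb.toString().getBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = sampleData(10000);
        long t0 = System.nanoTime();
        int a = countLinesByteAtATime(
                new BufferedInputStream(new ByteArrayInputStream(data)));
        long t1 = System.nanoTime();
        int b = countLinesBlockScan(new ByteArrayInputStream(data));
        long t2 = System.nanoTime();
        System.out.println("byte-at-a-time: " + a + " lines, " + (t1 - t0) / 1000 + " us");
        System.out.println("block-scan:     " + b + " lines, " + (t2 - t1) / 1000 + " us");
        if (a != b) throw new AssertionError("line counts differ");
    }
}
```

Both variants must agree on the line count; only the wall-clock numbers differ between runs.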




>>>> I looked at the source for BufferedInputStream and it looks like
>>>> it tries to fill the empty space in the buffer each time you read
>>>>         
> from
>   
>>> it (for a socket connection it will read more than one packet of
>>>       
> data)
>   
>>>> instead of just doing a single read from the underlying stream.
>>>>     
>>>>         
>>> Ok, then the byte-by-byte reading in CIS when parsing the chunk
>>>       
> header
>   
>>> might well be the problem. If you want to fix that, you'll have to
>>>       
> hack
>   
>>> deeply into CIS. Here is what I would do if I had no other choice:
>>>
>>> - extend CIS by a local byte array as a buffer (needs two extra int
>>>   for cursor and fill size)
>>> - change the chunk header parsing to read a bunch of bytes into the
>>>   buffer, then parsing from there
>>> - change all read methods to return leftover bytes from the buffer
>>>   before calling a read on the underlying stream
>>>
>>> hope that helps,
>>>   Roland
>>>   
>>>       
>> Tony and Roland,
>>
>> I suspect rather strongly it is BufferedInputStream that needs fixing, 
>> not ChunkedInputStream
>>
>> Oleg
>>     
>
>
>   
> ------------------------------------------------------------------------
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org


RE: Performance issues in ChunkedInputStream

Posted by Igor Lubashev <Ig...@everypoint.com>.
Correction.

One needs to add the following line as the last line of the outer loop
in readLine().  Sorry.

            pos--;  // Roll back the byte that was returned by read()
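The effect of that rollback can be seen in isolation with a tiny subclass (RollbackDemo is a made-up name; it relies on the protected pos field of BufferedInputStream, just as readLine() does):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Minimal demonstration of why the rollback is needed: read() consumes
// one byte (advancing the protected 'pos' cursor), so the readLine()
// loop must step 'pos' back before rescanning from the buffer.
public class RollbackDemo extends BufferedInputStream {

    RollbackDemo(byte[] data) {
        super(new ByteArrayInputStream(data));
    }

    int demo() throws IOException {
        int b = read();   // fills the buffer and consumes one byte
        pos--;            // roll back the byte that was returned by read()
        return b;
    }

    public static void main(String[] args) throws IOException {
        RollbackDemo in = new RollbackDemo(new byte[] {'X', '\n'});
        int first = in.demo();
        // After the rollback, the same byte is still available in the buffer.
        System.out.println(first == in.read()); // true: 'X' is seen twice
    }
}
```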



-----Original Message-----
From: Igor Lubashev 
Sent: Thursday, April 12, 2007 2:00 PM
To: 'HttpClient User Discussion'
Subject: RE: Performance issues in ChunkedInputStream

It looks like attachments are filtered out.
Here is the code inline.

/**
 * This is not used, but it is a nice little class that I wrote for Apache,
 * and it might be useful some day.
 * NOTE: This has not been tested
 */
package com.foo.util;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

/**
 * @author Igor Lubashev
 *
 */
public class LineReaderInputStream extends BufferedInputStream {
    private static int defaultBufferSize     = 8192;
    private static int defaultLineSizeLimit  = 8192;

    private final int lineSizeLimit;

    public LineReaderInputStream(InputStream in) {
        this(in, defaultBufferSize, defaultLineSizeLimit);
    }

    public LineReaderInputStream(InputStream in, int size) {
        this(in, size, defaultLineSizeLimit);
    }

    public LineReaderInputStream(InputStream in, int size, int lineSizeLimit) {
        super(in, size);
        this.lineSizeLimit = lineSizeLimit;
    }

    public synchronized String readLine(Charset charset) throws IOException {
        int savedMarkPos   = markpos;
        int savedMarkLimit = marklimit;

        if( savedMarkPos >= 0 ) {
            marklimit += lineSizeLimit;
        } else {
            markpos   = pos;
            marklimit = lineSizeLimit;
        }

        String retLine;
        int lineStart = pos;

      topLoop:
        while( true ) {
            while( pos < count ) {
                if( buf[pos++] == '\n' ) {
                    int lineLen = ((pos > 1 && buf[pos-2] == '\r') ? pos-2 : pos-1) - lineStart;
                    retLine = new String(buf, lineStart, lineLen, charset);
                    break topLoop;
                }
            }

            // Fill buffer with more data
            int prevPos = pos;
            if( read() < 0 ) {
                retLine = null;
                break topLoop;
            }
            lineStart -= prevPos - pos; // Adjust for the moved buffer
        }

        // Cleanup and return
        markpos   = savedMarkPos;
        marklimit = savedMarkLimit;
        return retLine;
    }
}


-----Original Message-----
From: Igor Lubashev [mailto:IgorLubashev@everypoint.com] 
Sent: Thursday, April 12, 2007 1:41 PM
To: HttpClient User Discussion
Subject: RE: Performance issues in ChunkedInputStream

1.  BufferedInputStream is working fine.  I've looked at the source, and
it correctly tries to read data only when its internal buffer is
exhausted.  Most read calls reference only the internal buffer.  When
the data does get read from the underlying stream, it tries to read it
in large chunks.  (Of course, if the underlying stream returns very
little data, it is a different problem.)

2.  It is hard to believe that reading a byte at a time is a bottleneck,
but I've just quickly written a LineReaderInputStream, which is derived
from BufferedInputStream, so all the searching for CRLF/LF happens very
quickly internally.  The source is attached.

Just call readLine() method, and you'll get Strings out of the stream.
You can interleave all regular stream operations and readLine() calls.
However, if you wish to use readLine() *after* using the stream's read()
methods, make sure that you do not inadvertently pass this stream to
anything that is buffering the stream's data (or your strings may get
consumed via buffering).

- Igor


>>> I looked at the source for BufferedInputStream and it looks like
>>> it tries to fill the empty space in the buffer each time you read from
>>> it (for a socket connection it will read more than one packet of data)
>>> instead of just doing a single read from the underlying stream.
>>
>> Ok, then the byte-by-byte reading in CIS when parsing the chunk header
>> might well be the problem. If you want to fix that, you'll have to hack
>> deeply into CIS. Here is what I would do if I had no other choice:
>>
>> - extend CIS by a local byte array as a buffer (needs two extra int
>>   for cursor and fill size)
>> - change the chunk header parsing to read a bunch of bytes into the
>>   buffer, then parsing from there
>> - change all read methods to return leftover bytes from the buffer
>>   before calling a read on the underlying stream
>>
>> hope that helps,
>>   Roland
>
> Tony and Roland,
>
> I suspect rather strongly it is BufferedInputStream that needs fixing,
> not ChunkedInputStream
>
> Oleg





RE: Performance issues in ChunkedInputStream

Posted by Igor Lubashev <Ig...@everypoint.com>.
1.  BufferedInputStream is working fine.  I've looked at the source, and
it correctly tries to read data only when its internal buffer is
exhausted.  Most read calls reference only the internal buffer.  When
the data does get read from the underlying stream, it tries to read it
in large chunks.  (Of course, if the underlying stream returns very
little data, it is a different problem.)

2.  It is hard to believe that reading a byte at a time is a bottleneck,
but I've just quickly written a LineReaderInputStream, which is derived
from BufferedInputStream, so all the searching for CRLF/LF happens very
quickly internally.  The source is attached.

Just call readLine() method, and you'll get Strings out of the stream.
You can interleave all regular stream operations and readLine() calls.
However, if you wish to use readLine() *after* using the stream's read()
methods, make sure that you do not inadvertently pass this stream to
anything that is buffering the stream's data (or your strings may get
consumed via buffering).

- Igor


>>> I looked at the source for BufferedInputStream and it looks like
>>> it tries to fill the empty space in the buffer each time you read from
>>> it (for a socket connection it will read more than one packet of data)
>>> instead of just doing a single read from the underlying stream.
>>
>> Ok, then the byte-by-byte reading in CIS when parsing the chunk header
>> might well be the problem. If you want to fix that, you'll have to hack
>> deeply into CIS. Here is what I would do if I had no other choice:
>>
>> - extend CIS by a local byte array as a buffer (needs two extra int
>>   for cursor and fill size)
>> - change the chunk header parsing to read a bunch of bytes into the
>>   buffer, then parsing from there
>> - change all read methods to return leftover bytes from the buffer
>>   before calling a read on the underlying stream
>>
>> hope that helps,
>>   Roland
>
> Tony and Roland,
>
> I suspect rather strongly it is BufferedInputStream that needs fixing,
> not ChunkedInputStream
>
> Oleg



Re: Performance issues in ChunkedInputStream

Posted by Oleg Kalnichevski <ol...@apache.org>.
Roland Weber wrote:
> Hi Tony,
>
> thanks for the details.
>
>   
>> I looked at the source for BufferedInputStream and it looks like
>> it tries to fill the empty space in the buffer each time you read from
>> it (for a socket connection it will read more than one packet of data)
>> instead of just doing a single read from the underlying stream.
>>     
>
> Ok, then the byte-by-byte reading in CIS when parsing the chunk header
> might well be the problem. If you want to fix that, you'll have to hack
> deeply into CIS. Here is what I would do if I had no other choice:
>
> - extend CIS by a local byte array as a buffer (needs two extra int
>   for cursor and fill size)
> - change the chunk header parsing to read a bunch of bytes into the
>   buffer, then parsing from there
> - change all read methods to return leftover bytes from the buffer
>   before calling a read on the underlying stream
>
> hope that helps,
>   Roland
>
>
>   
Tony and Roland,

I suspect rather strongly it is BufferedInputStream that needs fixing, 
not ChunkedInputStream

Oleg





Re: Performance issues in ChunkedInputStream

Posted by Roland Weber <os...@dubioso.net>.
Hi Tony,

thanks for the details.

> I looked at the source for BufferedInputStream and it looks like
> it tries to fill the empty space in the buffer each time you read from
> it (for a socket connection it will read more than one packet of data)
> instead of just doing a single read from the underlying stream.

Ok, then the byte-by-byte reading in CIS when parsing the chunk header
might well be the problem. If you want to fix that, you'll have to hack
deeply into CIS. Here is what I would do if I had no other choice:

- extend CIS by a local byte array as a buffer (needs two extra int
  for cursor and fill size)
- change the chunk header parsing to read a bunch of bytes into the
  buffer, then parsing from there
- change all read methods to return leftover bytes from the buffer
  before calling a read on the underlying stream

hope that helps,
  Roland
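A minimal sketch of the three steps above might look like this. BufferedChunkHeaderParser and its method names are invented for illustration, and real chunk decoding (trailers, chunk extensions beyond skipping them, the CRLF after the data) is left out:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the suggested fix: a small local buffer inside the chunk
// decoder, so the chunk header is parsed from an array instead of
// byte-by-byte reads on the underlying stream.
public class BufferedChunkHeaderParser {
    private final InputStream in;
    private final byte[] buffer = new byte[256]; // step 1: local byte array
    private int cursor = 0;                      // next unread position
    private int fill = 0;                        // number of valid bytes

    public BufferedChunkHeaderParser(InputStream in) {
        this.in = in;
    }

    // Step 2: read a bunch of bytes, then parse the chunk-size line
    // from the buffer.
    public int readChunkSize() throws IOException {
        StringBuilder line = new StringBuilder();
        while (true) {
            if (cursor >= fill) {
                fill = in.read(buffer, 0, buffer.length);
                cursor = 0;
                if (fill < 0) throw new IOException("stream ended in chunk header");
            }
            byte b = buffer[cursor++];
            if (b == '\n') break;
            if (b != '\r') line.append((char) b);
        }
        // Ignore any chunk extension after ';'
        int semi = line.indexOf(";");
        String hex = (semi >= 0 ? line.substring(0, semi) : line.toString()).trim();
        return Integer.parseInt(hex, 16);
    }

    // Step 3: hand back leftover buffered bytes before touching the
    // underlying stream again.
    public int read(byte[] dst, int off, int len) throws IOException {
        if (cursor < fill) {
            int n = Math.min(len, fill - cursor);
            System.arraycopy(buffer, cursor, dst, off, n);
            cursor += n;
            return n;
        }
        return in.read(dst, off, len);
    }

    public static void main(String[] args) throws IOException {
        byte[] chunk = "a\r\n0123456789\r\n".getBytes();
        BufferedChunkHeaderParser p =
                new BufferedChunkHeaderParser(new ByteArrayInputStream(chunk));
        int size = p.readChunkSize();     // parses "a" -> 10
        byte[] body = new byte[size];
        int got = p.read(body, 0, size);  // drained from the local buffer
        System.out.println(size + " " + new String(body, 0, got));
    }
}
```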





RE: Performance issues in ChunkedInputStream

Posted by Tatu Saloranta <co...@yahoo.com>.
--- Tony Thompson <To...@stone-ware.com>
wrote:

...
> I can understand that but by watching a packet trace, that doesn't seem
> to be what is happening.  After I call abort() I can still observe
> packets flowing on the wire so someone is still requesting data from the

That's not necessarily the case, though. The server may have buffered
data to be sent over HTTP/TCP, and it is TCP flow control that keeps
pushing that data until it gets the reset. So there may be a bit of data
in flight, which will get transferred and buffered by the OS. This
shouldn't happen for very long, but it depends on the TCP stack.

-+ Tatu +-
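As an aside, the "just reset the socket" behaviour Tony asks for is available from plain java.net: with SO_LINGER set to zero, close() sends a TCP RST and discards anything unsent, rather than doing a normal FIN shutdown. This is a generic sketch (HardCloseDemo is a made-up name), not something HttpClient does for you:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hard-closing a socket over loopback. With linger time 0, close()
// resets the connection instead of lingering on a graceful shutdown.
public class HardCloseDemo {

    // Connects over loopback, hard-closes the client side, and reports
    // whether the socket ended up closed.
    static boolean hardClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            Socket accepted = server.accept();

            client.setSoLinger(true, 0); // close() now sends RST, discarding unsent data
            client.close();
            accepted.close();
            return client.isClosed();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hardClose()); // true
    }
}
```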



       



RE: Performance issues in ChunkedInputStream

Posted by Tony Thompson <To...@stone-ware.com>.
>Hello Tony,
>
>> I
>> have tried making the BufferedInputStream in HttpConnection large 
>> (128K) and it still fills up (it just takes a little longer).
>> [...]
>> One other issue I have with that code is if I interrupt the file 
>> transfer and call method.abort(), that ChunkedInputStream appears to 
>> still keep pulling data from the host.
>
>As Oleg pointed out, the socket and its I/O streams are closed by
>method.abort(). The streams residing on top of that will not be closed.
>The ChunkedInputStream sits on top of the BufferedInputStream that was
>created by the connection. As long as there is data in the buffer,
>the stream will continue to work. Only when the buffer is exhausted, a
>new call to the socket stream is made. That call will then result in
>an IOException indicating that the socket has been closed. In other
>words, CIS is pulling data from the buffer, not from the host.
>
I can understand that but by watching a packet trace, that doesn't seem
to be what is happening.  After I call abort() I can still observe
packets flowing on the wire so someone is still requesting data from the
host.  I didn't try to dig far enough into the HttpClient code to say
for sure what is still going on.  I can only tell you what I am
observing at this point.

>I wonder how the BufferedInputStream can fill up. If it is empty, a
>single attempt will be made to read data from the underlying socket
>stream. Then, only the buffered data is accessed until that is
>exhausted. Does your OS buffer enough data on its own to fill a 128 K
>buffer in a single read operation? If it does, I wonder even more why
>parsing the chunk header should be a performance bottleneck.
>Is there suspicious GC activity?

As a matter of fact, the window size in my particular test is only 16K,
so no, I wouldn't be able to fill up the 128K buffer in a single read.
I added some printlns to the CIS.read() method to see how much data was
available on the InputStream before I read it, and what I observed was
that the amount of data available would grow and shrink (back down to 0
sometimes), but I could get that buffer to the point where it was full.
I looked at the source for BufferedInputStream, and it looks like it
tries to fill the empty space in the buffer each time you read from it
(for a socket connection it will read more than one packet of data)
instead of just doing a single read from the underlying stream. So, if
the host is shoving data back fast enough, that buffer can fill up if
the CIS is not pulling data out of it fast enough, which seems to be
where the bottleneck is. The longer it takes for the CIS to process
data from the buffered stream, the longer everything waits for another
read on the buffered stream (which is what triggers a read on the
socket stream).

Tony
 



Re: Performance issues in ChunkedInputStream

Posted by Roland Weber <os...@dubioso.net>.
Hello Tony,

> I
> have tried making the BufferedInputStream in HttpConnection large (128K)
> and it still fills up (it just takes a little longer).
> [...] 
> One other issue I have with that code is if I interrupt the file
> transfer and call method.abort(), that ChunkedInputStream appears to
> still keep pulling data from the host.

As Oleg pointed out, the socket and its I/O streams are closed by
method.abort(). The streams residing on top of that will not be closed.
The ChunkedInputStream sits on top of the BufferedInputStream that
was created by the connection. As long as there is data in the buffer,
the stream will continue to work. Only when the buffer is exhausted,
a new call to the socket stream is made. That call will then result
in an IOException indicating that the socket has been closed. In
other words, CIS is pulling data from the buffer, not from the host.

I wonder how the BufferedInputStream can fill up. If it is empty,
a single attempt will be made to read data from the underlying
socket stream. Then, only the buffered data is accessed until that
is exhausted. Does your OS buffer enough data on its own to fill
a 128 K buffer in a single read operation? If it does, I wonder
even more why parsing the chunk header should be a performance
bottleneck. Is there suspicious GC activity?

cheers,
  Roland
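The layering described above is easy to demonstrate: once a large read has filled BufferedInputStream's buffer, further reads keep succeeding even after the underlying stream has been shut down. FailAfterClose below is a made-up stand-in for the socket stream:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// After the underlying (socket-like) stream is closed, reads on the
// BufferedInputStream are still served from its internal buffer.
public class BufferAfterAbortDemo {

    // Throws on any read after close(), mimicking a closed socket stream.
    static class FailAfterClose extends ByteArrayInputStream {
        private boolean closed = false;

        FailAfterClose(byte[] data) { super(data); }

        @Override public synchronized int read(byte[] b, int off, int len) {
            if (closed) throw new IllegalStateException("socket closed");
            return super.read(b, off, len);
        }

        @Override public void close() { closed = true; }
    }

    static String readThroughBuffer() throws IOException {
        InputStream raw = new FailAfterClose("buffered bytes".getBytes());
        BufferedInputStream in = new BufferedInputStream(raw, 64);

        int first = in.read();  // fills the 64-byte buffer in one go
        raw.close();            // simulate method.abort() closing the socket

        // Remaining reads come from the buffer, not the dead stream.
        byte[] rest = new byte[32];
        int n = in.read(rest, 0, rest.length);
        return (char) first + new String(rest, 0, n);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readThroughBuffer());
    }
}
```

Only once the buffer is drained does the next read hit the closed stream and fail, matching Roland's description.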

