Posted to httpclient-users@hc.apache.org by Jim Roepcke <ji...@roepcke.com> on 2005/07/03 01:49:04 UTC
Making ContentLengthInputStream stop when I want, is this safe?
Hi,
I'm writing an application where the user must be able to interrupt
and resume downloads at will.
I was seeing noticeable pauses when I called releaseConnection() on
the GetMethod that I was reading the response from. Recently I
realized this is because deep in HttpClient, releaseConnection
eventually calls exhaustInputStream to read and discard the rest of
the response (!!!) before closing the connection. Since the file
downloads the user will interrupt are potentially many gigabytes in
size, having releaseConnection block while it reads and discards many
gigabytes of data is obviously very bad.
I created a workaround. It seems to be working for me, i.e.
releaseConnection is no longer pausing, but I am guessing the
developers of HttpClient had something important in mind when they
decided to FORCE responses to be COMPLETELY read. Could you please
let me know if my hack is going to cause bad side-effects?
Background info: I'm using the MultiThreadedHttpConnectionManager.
At any time I could have up to one (significant) GET download, one
(significant) PUT upload, and possibly a very small number of very
short DELETE, HEAD and GET requests executing.
The hack:
In ContentLengthInputStream, I added a method forceSkipTheRest(). It
does the following:
public void forceSkipTheRest() {
    this.contentLength = this.pos;
}
In AutoCloseInputStream, I added an accessor method to get the
wrapped [ContentLength]InputStream, so that I can call the
forceSkipTheRest method on it.
Here is a snippet of code from my download worker (which runs in
a separate thread managed by Doug Lea's util.concurrent
QueuedExecutor class). Please refer to the section near the bottom,
inside the if (shouldStop()) block, for the use of the gross hack...
// [snip]
client.executeMethod(hc, get);
int status_code_get = get.getStatusCode();

if ((status_code_get / 100) == 2) { // in the 200 series of responses
    responseStream = get.getResponseBodyAsStream();
    int download_stream_buf_size = 4096;
    // TODO: review again whether this BufferedInputStream is needed;
    // it seems redundant, the responseStream is probably good enough
    downloadStream = new BufferedInputStream(responseStream,
            download_stream_buf_size);
    byte[] downloadBuffer = new byte[download_stream_buf_size];

    int bytesRead = 0;
    raf = new RandomAccessFile(downloadFile, "rw");
    while ((bytesRead = downloadStream.read(downloadBuffer, 0,
            download_stream_buf_size)) > 0) {
        raf.write(downloadBuffer, 0, bytesRead);
        totalBytesRead += bytesRead;
        if (shouldStop()) {
            if (responseStream instanceof AutoCloseInputStream) {
                InputStream stream = ((AutoCloseInputStream)
                        responseStream).getWrappedInputStream();
                if (stream instanceof ContentLengthInputStream) {
                    ((ContentLengthInputStream) stream).forceSkipTheRest();
                }
            }
            break;
        }
    }
    get.releaseConnection();
    get = null;
// [snip]
In some quick testing, it seems this is working just fine... however,
I wonder if any bad side effects will result from this.
I'm also wondering if perhaps I should simply set a flag in
ContentLengthInputStream, and if that flag is true, when close() is
called on ContentLengthInputStream it calls forceSkipTheRest before
calling exhaustInputStream (so that this interruption works the way I
need in all cases, not just in my read loop).
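For what it's worth, the trick can be illustrated outside HttpClient. The sketch below is a minimal stand-in, not the real ContentLengthInputStream (the class and method names here are made up for the demo); it shows why clamping contentLength to pos makes a subsequent exhaustInputStream-style drain loop a no-op:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Minimal stand-in for ContentLengthInputStream (names hypothetical).
// Once contentLength is clamped to pos, read() reports EOF immediately,
// so an exhaustInputStream()-style drain loop discards nothing.
class BoundedStream extends InputStream {
    private final InputStream in;
    private long contentLength;
    private long pos = 0;

    BoundedStream(InputStream in, long contentLength) {
        this.in = in;
        this.contentLength = contentLength;
    }

    // The hack: pretend the whole body has already been read.
    public void forceSkipTheRest() {
        this.contentLength = this.pos;
    }

    @Override
    public int read() throws IOException {
        if (pos >= contentLength) {
            return -1; // logical end of the entity body
        }
        int b = in.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }
}

public class SkipDemo {
    public static void main(String[] args) throws IOException {
        byte[] body = new byte[1000];
        BoundedStream s = new BoundedStream(new ByteArrayInputStream(body), body.length);
        s.read();             // consume one byte of the 1000-byte body
        s.forceSkipTheRest(); // clamp: contentLength == pos

        int discarded = 0;    // drain loop, as exhaustInputStream would do
        while (s.read() >= 0) {
            discarded++;
        }
        System.out.println(discarded); // 0: nothing left to "exhaust"
    }
}
```

The catch, of course, is that only the wrapper's bookkeeping is skipped; the undelivered bytes themselves are still on the wire.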
Your advice is *greatly* appreciated!!
Thanks,
Jim
---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Jim Roepcke <ji...@roepcke.com>.
On 2-Jul-05, at 6:47 PM, Jim Roepcke wrote:
> I did try get.abort() before devising the workaround, but after
> calling abort, subsequent HTTP methods weren't being executed. ...
> I just tried it again. It works a few times but then eventually
> HTTP methods just don't execute. My guess is, since the
> connections aren't (as far as I can tell) being released to the
> pool when you call abort(), eventually the
> MultiThreadedHttpConnectionManager's connection pool is starved and
> it has nothing to offer the HttpClient.
I've done some more debugging. I think I was right, the thread that
calls client.execute is blocked, waiting in
MultiThreadedHttpConnectionManager.doGetConnection (which is called
by MultiThreadedHttpConnectionManager.getConnectionWithTimeout,
called by HttpMethodDirector.executeMethod).
The timeout value appears to be the default of 0 (since I didn't
configure it in any way), which seems to mean no timeout; that
explains why the QueuedExecutor handling my transfer commands is
blocked.
From the looks of things (please correct me if I'm wrong), even if I
set a timeout value it would just result in executeMethod throwing a
connection-pool timeout exception instead of the connection pool
re-stocking itself with new connections.
Is this the intended behavior of abort(), or should it be notifying
the connection manager so it can clean up and keep the connection
pool stocked?
> Looking at abort(), it calls close() on the connection, but not
> releaseConnection(), which would in turn call
> httpConnectionManager.releaseConnection(this). I assume this is
> why you say it won't be re-used? On a quick inspection, I don't
> see the MultiThreadedHttpConnectionManager being notified that the
> connection is finished or unusable. Do I have to either tell it
> somehow or create a new connection manager? It doesn't appear abort
> () does anything to tell the connection manager the connection is
> now unusable.
Thanks again for your help,
Jim
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Jim Roepcke <ji...@roepcke.com>.
On 2-Jul-05, at 11:25 PM, Jim Roepcke wrote:
> Ah, okay. I thought it was releaseConnection or abort. I was also
> a little scared to call releaseConnection because it was doing the
> exhaustInputStream, but I suppose if I abort first there won't be
> anything for it to exhaust, right?
> ....snip....
> Yeah, I do have a finally statement that checks if get != null and
> then does releaseConnection, it's outside of the try block
> (obviously) which I didn't put in the [snip]'d code I pasted in here.
>
> I'll give this a try now, thanks again for your help!
Some preliminary testing shows it's working wonderfully!
exhaustInputStream is still being called but there's nothing left to
read. :)
Perfect!
I think I'm fine on the upload side, since my upload worker
implements RequestEntity, and it breaks out of the write loop in
writeRequest when the user signals the upload to stop.
Thanks again,
Jim
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Jim Roepcke <ji...@roepcke.com>.
On 2-Jul-05, at 8:16 PM, Michael Becke wrote:
>> Thanks for the advice. :-) That's interesting that the content would
>> still be on the wire. I guess that's a part of HttpClient or sockets
>> I don't understand well enough -- I assumed that when the socket was
>> closed the remaining bytes simply "could not get through", and that
>> after exhaustInputStream(), the socket would soon after be closed.
>>
>
> Connections/sockets can be reused, depending on the configuration of
> the client/server. Releasing a connection does not necessarily close
> it.
Server is Tomcat 5.5.9, not doing anything special except making sure
Content-Length is sent and handling Range/Content-Range headers
properly. One hack, though: on HEAD requests, when the file I'd
serve is greater than 2GB, I can't set the Content-Length to the
file's size because it'll only take an int, so I set the Content-
Length to 0 and set another header, "X-Long-Content-Length", that my
HEAD request code watches for. I know that's terrible, but I didn't
have the time to fix it. Coyote actually supports long content
requests but I think something higher up in the Servlet API doesn't
and it's next to impossible to get to the Coyote response object to
get it to work with a long without modifying lots of Tomcat/Servlet
2.4 code.
> Yep, sounds like you're leaking connections. Even when using abort()
> you still need to be sure to call releaseConnection(). As a rule
> releaseConnection() needs to be called every time executeMethod() is
> called.
Ah, okay. I thought it was releaseConnection or abort. I was also a
little scared to call releaseConnection because it was doing the
exhaustInputStream, but I suppose if I abort first there won't be
anything for it to exhaust, right?
> You should probably put releaseConnection() in a finally statement, so
> you can be sure it is always called.
Yeah, I do have a finally statement that checks if get != null and
then does releaseConnection, it's outside of the try block
(obviously) which I didn't put in the [snip]'d code I pasted in here.
I'll give this a try now, thanks again for your help!
Jim
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Michael Becke <mb...@gmail.com>.
> Thanks for the advice. :-) That's interesting that the content would
> still be on the wire. I guess that's a part of HttpClient or sockets
> I don't understand well enough -- I assumed that when the socket was
> closed the remaining bytes simply "could not get through", and that
> after exhaustInputStream(), the socket would soon after be closed.
Connections/sockets can be reused, depending on the configuration of
the client/server. Releasing a connection does not necessarily close
it.
> I did try get.abort() before devising the workaround, but after
> calling abort, subsequent HTTP methods weren't being executed. ... I
> just tried it again. It works a few times but then eventually HTTP
> methods just don't execute. My guess is, since the connections
> aren't (as far as I can tell) being released to the pool when you
> call abort(), eventually the MultiThreadedHttpConnectionManager's
> connection pool is starved and it has nothing to offer the HttpClient.
>
> Looking at abort(), it calls close() on the connection, but not
> releaseConnection(), which would in turn call
> httpConnectionManager.releaseConnection(this). I assume this is why
> you say it won't be re-used? On a quick inspection, I don't see the
> MultiThreadedHttpConnectionManager being notified that the connection
> is finished or unusable. Do I have to either tell it somehow or
> create a new connection manager? It doesn't appear abort() does
> anything to tell the connection manager the connection is now unusable.
Yep, sounds like you're leaking connections. Even when using abort()
you still need to be sure to call releaseConnection(). As a rule
releaseConnection() needs to be called every time executeMethod() is
called.
> Here's how I have the read loop when I use abort():
>
> while ((bytesRead = downloadStream.read(downloadBuffer, 0,
>         download_stream_buf_size)) > 0) {
>     raf.write(downloadBuffer, 0, bytesRead);
>     totalBytesRead += bytesRead;
>     if (shouldStop()) {
>         get.abort();
>         get = null;
>         break;
>     }
> }
> if (get != null) {
>     get.releaseConnection();
>     get = null;
> }
> // [snip]
>
> Do you see anything wrong with that? (Assuming abort() is working
> perfectly.)
You should probably put releaseConnection() in a finally statement, so
you can be sure it is always called.
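As a sketch of that rule, the shape is: abort() inside the loop when interrupted, releaseConnection() in the finally block so it runs on every exit path. The stub class below only records the calls; it mirrors the shape of GetMethod's abort()/releaseConnection() but is not the real commons-httpclient class.

```java
// Stub that records calls; it mirrors the shape of GetMethod's
// abort()/releaseConnection() but is not the real commons-httpclient class.
class StubMethod {
    boolean aborted = false;
    boolean released = false;
    void abort() { aborted = true; }
    void releaseConnection() { released = true; }
}

public class ReleaseDemo {
    // releaseConnection() in finally runs whether the loop completes,
    // breaks out early after abort(), or an exception is thrown.
    static void download(StubMethod get, boolean interrupt) {
        try {
            for (int chunk = 0; chunk < 10; chunk++) {
                // ... read a buffer, write it to the file ...
                if (interrupt) {
                    get.abort(); // closes the connection; it won't be reused
                    break;
                }
            }
        } finally {
            get.releaseConnection(); // always give the pool its slot back
        }
    }

    public static void main(String[] args) {
        StubMethod get = new StubMethod();
        download(get, true);
        System.out.println(get.aborted + " " + get.released); // true true
    }
}
```

With this shape there is no need for the `if (get != null)` dance after the loop: the release happens exactly once, on every path.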
Mike
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Jim Roepcke <ji...@roepcke.com>.
On 2-Jul-05, at 4:56 PM, Michael Becke wrote:
> Hi Jim,
>
> HttpMethod.abort() is probably what you're looking for.
>
> Your workaround solution could potentially cause some issues. In
> particular it will be a problem if you end up reusing a connection
> that has been skipped. The remaining content will still be on the
> wire and it will be read the next time the connection is used.
> Reading to the end of the response is specifically done so that
> connections can be reused. In your case it sounds like it would be
> better to abort the method, which causes the connection to be closed
> and not reused.
Hi Michael,
Thanks for the advice. :-) That's interesting that the content would
still be on the wire. I guess that's a part of HttpClient or sockets
I don't understand well enough -- I assumed that when the socket was
closed the remaining bytes simply "could not get through", and that
after exhaustInputStream(), the socket would soon after be closed.
I did try get.abort() before devising the workaround, but after
calling abort, subsequent HTTP methods weren't being executed. ... I
just tried it again. It works a few times but then eventually HTTP
methods just don't execute. My guess is, since the connections
aren't (as far as I can tell) being released to the pool when you
call abort(), eventually the MultiThreadedHttpConnectionManager's
connection pool is starved and it has nothing to offer the HttpClient.
Looking at abort(), it calls close() on the connection, but not
releaseConnection(), which would in turn call
httpConnectionManager.releaseConnection(this). I assume this is why
you say it won't be re-used? On a quick inspection, I don't see the
MultiThreadedHttpConnectionManager being notified that the connection
is finished or unusable. Do I have to either tell it somehow or
create a new connection manager? It doesn't appear abort() does
anything to tell the connection manager the connection is now unusable.
Here's how I have the read loop when I use abort():

while ((bytesRead = downloadStream.read(downloadBuffer, 0,
        download_stream_buf_size)) > 0) {
    raf.write(downloadBuffer, 0, bytesRead);
    totalBytesRead += bytesRead;
    if (shouldStop()) {
        get.abort();
        get = null;
        break;
    }
}
if (get != null) {
    get.releaseConnection();
    get = null;
}
// [snip]
Do you see anything wrong with that? (Assuming abort() is working
perfectly.)
I appreciate your advice; hopefully I can get this working properly
using abort(), because it sounds like that is the intended mechanism
for my needs.
If you have any suggestions for improvements to abort, I'm very happy
to try them here.
Thanks again,
Jim
> On 7/2/05, Jim Roepcke <ji...@roepcke.com> wrote:
>> [...original message snipped...]
Re: Making ContentLengthInputStream stop when I want, is this safe?
Posted by Michael Becke <mb...@gmail.com>.
Hi Jim,
HttpMethod.abort() is probably what you're looking for.
Your workaround solution could potentially cause some issues. In
particular it will be a problem if you end up reusing a connection
that has been skipped. The remaining content will still be on the
wire and it will be read the next time the connection is used.
Reading to the end of the response is specifically done so that
connections can be reused. In your case it sounds like it would be
better to abort the method, which causes the connection to be closed
and not reused.
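A quick pure-JDK way to see why the leftover bytes matter (this simulates a keep-alive connection with a plain string; it is not HttpClient code): two responses sit back-to-back on one stream, and stopping partway through the first body leaves the next read pointed at body bytes instead of the second status line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Two back-to-back "responses" on one stream, standing in for a
// keep-alive socket. Stopping mid-body leaves the reader pointed at
// leftover body bytes instead of the next response's status line.
public class LeftoverDemo {
    public static void main(String[] args) throws IOException {
        String wire = "HTTP/1.1 200 OK\n"
                    + "AAAAAAAAAA\n"                // 10-byte body
                    + "HTTP/1.1 204 No Content\n";  // next response on the same connection
        BufferedReader in = new BufferedReader(new StringReader(wire));

        System.out.println(in.readLine()); // HTTP/1.1 200 OK
        in.skip(4);                        // read only part of the body, then "stop"
        System.out.println(in.readLine()); // AAAAAA -- leftover body, not a status line
    }
}
```

Draining to the end of the body (or closing the socket, as abort() does) is what keeps the next request on that connection from parsing garbage.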
Mike
On 7/2/05, Jim Roepcke <ji...@roepcke.com> wrote:
> [...original message snipped...]