Posted to dev@commons.apache.org by Stefan Bodewig <bo...@apache.org> on 2011/08/15 10:56:05 UTC

[compress] close() and archives/streams in an invalid state

Hi,

while working on the Zip64 stuff I deliberately created some invalid
archives to test that I get the expected exceptions.  While doing so I
realized I couldn't close the underlying stream because
ZipArchiveOutputStream's close would throw an exception as finish()
failed before ever closing the wrapped stream.  The calling code may not
even have a handle to the underlying stream if it used the File-arg
constructor and so in the end a broken archive leaks the file
descriptor.
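
To make that concrete, this is roughly the calling pattern I mean (the
file name is of course made up):

    // org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream -
    // the File-arg constructor opens the file itself, so the caller never
    // gets a handle to the underlying stream
    ZipArchiveOutputStream zos =
        new ZipArchiveOutputStream(new File("invalid.zip"));
    try {
        // ... write entries that leave the archive in an invalid state ...
    } finally {
        zos.close();   // finish() throws in here, so the stream the class
                       // opened - and its file descriptor - is never closed
    }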

Then I went looking through the other implementations and found we are
consistent here as far as archivers and bzip2 go.  All of them will not
close the underlying stream if the archive/stream is invalid.  I'm not
sure what gzip does as it just delegates to close in the Java classlib's
GZIPOutputStream.

I wonder whether we should rather change the behavior to always close
the underlying stream or whether we should add a new method - I called
it destroy in ZipArchiveOutputStream - that does so.

Or maybe we just document it for 1.3, add destroy as a non-common method
where needed (like in ZIP, in all other cases the user code should have
access to the underlying stream) - and revisit this in the 2.x
timeframe, even though I can't see what could be broken by always
closing the stream in a finally block.
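
For reference, the finally-based variant would just be a sketch along
these lines - "out" and "finished" stand in for whatever the real class
keeps internally:

    // sketch only, not the actual ZipArchiveOutputStream code
    public void close() throws IOException {
        try {
            if (!finished) {
                finish();    // may throw if the archive is invalid
            }
        } finally {
            out.close();     // close the wrapped stream no matter what
        }
    }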

Does anybody see a good reason why the close behavior should stay the
way it is right now?

Stefan



Re: [compress] close() and archives/streams in an invalid state

Posted by sebb <se...@gmail.com>.
On 17 August 2011 06:18, Stefan Bodewig <bo...@apache.org> wrote:
> On 2011-08-17, sebb wrote:
>
>> On 17 August 2011 03:42, Stefan Bodewig <bo...@apache.org> wrote:
>>> On 2011-08-15, sebb wrote:
>
>>>> For input, there might be a use case for leaving the stream open, in
>>>> case some kind of recovery is possible.
>
>>>> It would be useful to have a way of determining the input file pointer
>>>> at the point of failure.
>
>>> stream.getBytesRead() or are you thinking about anything else?
>
>> That would be fine.
>
> I'm not sure whether we need to do anything for that.  The streams
> already count the bytes and you can access the count when an exception
> occurs.  We might want to revisit the exception messages to see whether
> they can include some sort of byte count.
>
> For "normal users" of compress I don't expect that information to be too
> useful, though.  Most users won't be familiar with the details of the
> format.  It may be useful for us during debugging or when bugs get
> reported.

Yes, that was my main concern.

> Stefan
>


Re: [compress] close() and archives/streams in an invalid state

Posted by Stefan Bodewig <bo...@apache.org>.
On 2011-08-17, sebb wrote:

> On 17 August 2011 03:42, Stefan Bodewig <bo...@apache.org> wrote:
>> On 2011-08-15, sebb wrote:

>>> For input, there might be a use case for leaving the stream open, in
>>> case some kind of recovery is possible.

>>> It would be useful to have a way of determining the input file pointer
>>> at the point of failure.

>> stream.getBytesRead() or are you thinking about anything else?

> That would be fine.

I'm not sure whether we need to do anything for that.  The streams
already count the bytes and you can access the count when an exception
occurs.  We might want to revisit the exception messages to see whether
they can include some sort of byte count.
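
I.e. calling code can already do something like this today (archive name
made up, the entry-reading loop elided):

    ZipArchiveInputStream in =
        new ZipArchiveInputStream(new FileInputStream("broken.zip"));
    try {
        // ... iterate over the entries and read them ...
    } catch (IOException e) {
        // getBytesRead() reports how far into the input we got
        System.err.println("failed after " + in.getBytesRead()
                           + " bytes: " + e.getMessage());
    } finally {
        in.close();
    }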

For "normal users" of compress I don't expect that information to be too
useful, though.  Most users won't be familiar with the details of the
format.  It may be useful for us during debugging or when bugs get
reported.

Stefan



Re: [compress] close() and archives/streams in an invalid state

Posted by Stefan Bodewig <bo...@apache.org>.
On 2011-08-17, Honton, Charles wrote:

> Why guard against closing System.in?  If the client application needs that
> sort of functionality, it should wrap the incoming stream in a proxy that
> doesn't delegate the close method to the underlying stream.

You and I agree here - there has even been a thread about this not too
long ago.  Changing it would break backwards compatibility, though,
something we are willing to do in a 2.x release not too far away in the
future (I hope).  For 1.x we prefer to keep it as it is.

> (A new class for commons-io?)

I'm pretty sure that already exists.  CloseShieldInputStream IIRC.
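
So on the user side it boils down to something like this (tar over stdin
picked just as an example):

    // org.apache.commons.io.input.CloseShieldInputStream swallows close(),
    // so closing the archive stream cannot close System.in underneath it
    TarArchiveInputStream tin =
        new TarArchiveInputStream(new CloseShieldInputStream(System.in));
    try {
        // ... read entries from stdin ...
    } finally {
        tin.close();   // closes the shield, System.in itself stays open
    }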

Stefan



Re: [compress] close() and archives/streams in an invalid state

Posted by "Honton, Charles" <Ch...@intuit.com>.
Why guard against closing System.in?  If the client application needs that
sort of functionality, it should wrap the incoming stream in a proxy that
doesn't delegate the close method to the underlying stream.  (A new class
for commons-io?)

Chas Honton


On 8/16/11 11:16 PM, "sebb" <se...@gmail.com> wrote:

>On 17 August 2011 03:42, Stefan Bodewig <bo...@apache.org> wrote:
>> On 2011-08-15, sebb wrote:
>>
>>> On 15 August 2011 09:56, Stefan Bodewig <bo...@apache.org> wrote:
>>>> Hi,
>>
>>>> while working on the Zip64 stuff I deliberately created some invalid
>>>> archives to test that I get the expected exceptions.  While doing so I
>>>> realized I couldn't close the underlying stream because
>>>> ZipArchiveOutputStream's close would throw an exception as finish()
>>>> failed before ever closing the wrapped stream.  The calling code may not
>>>> even have a handle to the underlying stream if it used the File-arg
>>>> constructor and so in the end a broken archive leaks the file
>>>> descriptor.
>>
>>>> Then I went looking through the other implementations and found we are
>>>> consistent here as far as archivers and bzip2 go.  All of them will not
>>>> close the underlying stream if the archive/stream is invalid.  I'm not
>>>> sure what gzip does as it just delegates to close in the Java classlib's
>>>> GZIPOutputStream.
>>
>>>> I wonder whether we should rather change the behavior to always close
>>>> the underlying stream or whether we should add a new method - I called
>>>> it destroy in ZipArchiveOutputStream - that does so.
>>
>>>> Or maybe we just document it for 1.3, add destroy as a non-common method
>>>> where needed (like in ZIP, in all other cases the user code should have
>>>> access to the underlying stream) - and revisit this in the 2.x
>>>> timeframe, even though I can't see what could be broken by always
>>>> closing the stream in a finally block.
>>
>>>> Does anybody see a good reason why the close behavior should stay the
>>>> way it is right now?
>>
>>> For output, it seems sensible to close the underlying stream on
>>> failure, as the content is likely to be garbage.
>>
>> Yes, this is the case I was talking about.
>>
>>> For input, there might be a use case for leaving the stream open, in
>>> case some kind of recovery is possible.
>>
>> All our implementations - except for the new dump code, that doesn't
>> properly handle close ATM - try to close the input stream regardless of
>> whether there has been a problem with the archive or not (they don't
>> even check).  This is what I'd expect it to do as well.
>>
>> Some of them guard against closing System.in, but not all.
>>
>>> It would be useful to have a way of determining the input file pointer
>>> at the point of failure.
>>
>> stream.getBytesRead() or are you thinking about anything else?
>
>That would be fine.
>
>> Stefan
>>


Re: [compress] close() and archives/streams in an invalid state

Posted by sebb <se...@gmail.com>.
On 17 August 2011 03:42, Stefan Bodewig <bo...@apache.org> wrote:
> On 2011-08-15, sebb wrote:
>
>> On 15 August 2011 09:56, Stefan Bodewig <bo...@apache.org> wrote:
>>> Hi,
>
>>> while working on the Zip64 stuff I deliberately created some invalid
>>> archives to test that I get the expected exceptions.  While doing so I
>>> realized I couldn't close the underlying stream because
>>> ZipArchiveOutputStream's close would throw an exception as finish()
>>> failed before ever closing the wrapped stream.  The calling code may not
>>> even have a handle to the underlying stream if it used the File-arg
>>> constructor and so in the end a broken archive leaks the file
>>> descriptor.
>
>>> Then I went looking through the other implementations and found we are
>>> consistent here as far as archivers and bzip2 go.  All of them will not
>>> close the underlying stream if the archive/stream is invalid.  I'm not
>>> sure what gzip does as it just delegates to close in the Java classlib's
>>> GZIPOutputStream.
>
>>> I wonder whether we should rather change the behavior to always close
>>> the underlying stream or whether we should add a new method - I called
>>> it destroy in ZipArchiveOutputStream - that does so.
>
>>> Or maybe we just document it for 1.3, add destroy as a non-common method
>>> where needed (like in ZIP, in all other cases the user code should have
>>> access to the underlying stream) - and revisit this in the 2.x
>>> timeframe, even though I can't see what could be broken by always
>>> closing the stream in a finally block.
>
>>> Does anybody see a good reason why the close behavior should stay the
>>> way it is right now?
>
>> For output, it seems sensible to close the underlying stream on
>> failure, as the content is likely to be garbage.
>
> Yes, this is the case I was talking about.
>
>> For input, there might be a use case for leaving the stream open, in
>> case some kind of recovery is possible.
>
> All our implementations - except for the new dump code, that doesn't
> properly handle close ATM - try to close the input stream regardless of
> whether there has been a problem with the archive or not (they don't
> even check).  This is what I'd expect it to do as well.
>
> Some of them guard against closing System.in, but not all.
>
>> It would be useful to have a way of determining the input file pointer
>> at the point of failure.
>
> stream.getBytesRead() or are you thinking about anything else?

That would be fine.

> Stefan
>


Re: [compress] close() and archives/streams in an invalid state

Posted by Stefan Bodewig <bo...@apache.org>.
On 2011-08-15, sebb wrote:

> On 15 August 2011 09:56, Stefan Bodewig <bo...@apache.org> wrote:
>> Hi,

>> while working on the Zip64 stuff I deliberately created some invalid
>> archives to test that I get the expected exceptions.  While doing so I
>> realized I couldn't close the underlying stream because
>> ZipArchiveOutputStream's close would throw an exception as finish()
>> failed before ever closing the wrapped stream.  The calling code may not
>> even have a handle to the underlying stream if it used the File-arg
>> constructor and so in the end a broken archive leaks the file
>> descriptor.

>> Then I went looking through the other implementations and found we are
>> consistent here as far as archivers and bzip2 go.  All of them will not
>> close the underlying stream if the archive/stream is invalid.  I'm not
>> sure what gzip does as it just delegates to close in the Java classlib's
>> GZIPOutputStream.

>> I wonder whether we should rather change the behavior to always close
>> the underlying stream or whether we should add a new method - I called
>> it destroy in ZipArchiveOutputStream - that does so.

>> Or maybe we just document it for 1.3, add destroy as a non-common method
>> where needed (like in ZIP, in all other cases the user code should have
>> access to the underlying stream) - and revisit this in the 2.x
>> timeframe, even though I can't see what could be broken by always
>> closing the stream in a finally block.

>> Does anybody see a good reason why the close behavior should stay the
>> way it is right now?

> For output, it seems sensible to close the underlying stream on
> failure, as the content is likely to be garbage.

Yes, this is the case I was talking about.

> For input, there might be a use case for leaving the stream open, in
> case some kind of recovery is possible.

All our implementations - except for the new dump code, that doesn't
properly handle close ATM - try to close the input stream regardless of
whether there has been a problem with the archive or not (they don't
even check).  This is what I'd expect it to do as well.

Some of them guard against closing System.in, but not all.

> It would be useful to have a way of determining the input file pointer
> at the point of failure.

stream.getBytesRead() or are you thinking about anything else?

Stefan



Re: [compress] close() and archives/streams in an invalid state

Posted by sebb <se...@gmail.com>.
On 15 August 2011 09:56, Stefan Bodewig <bo...@apache.org> wrote:
> Hi,
>
> while working on the Zip64 stuff I deliberately created some invalid
> archives to test that I get the expected exceptions.  While doing so I
> realized I couldn't close the underlying stream because
> ZipArchiveOutputStream's close would throw an exception as finish()
> failed before ever closing the wrapped stream.  The calling code may not
> even have a handle to the underlying stream if it used the File-arg
> constructor and so in the end a broken archive leaks the file
> descriptor.
>
> Then I went looking through the other implementations and found we are
> consistent here as far as archivers and bzip2 go.  All of them will not
> close the underlying stream if the archive/stream is invalid.  I'm not
> sure what gzip does as it just delegates to close in the Java classlib's
> GZIPOutputStream.
>
> I wonder whether we should rather change the behavior to always close
> the underlying stream or whether we should add a new method - I called
> it destroy in ZipArchiveOutputStream - that does so.
>
> Or maybe we just document it for 1.3, add destroy as a non-common method
> where needed (like in ZIP, in all other cases the user code should have
> access to the underlying stream) - and revisit this in the 2.x
> timeframe, even though I can't see what could be broken by always
> closing the stream in a finally block.
>
> Does anybody see a good reason why the close behavior should stay the
> way it is right now?

For output, it seems sensible to close the underlying stream on
failure, as the content is likely to be garbage.

For input, there might be a use case for leaving the stream open, in
case some kind of recovery is possible.
It would be useful to have a way of determining the input file pointer
at the point of failure.

> Stefan
>