You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Stefan Bodewig <bo...@apache.org> on 2021/06/27 11:03:25 UTC

[compress] Dealing with uncaught RuntimeExceptions (again)

Hi

I'd like to get closure on which approach we want to take.

When we read a broken archive it may trigger arbitrary RuntimeExceptions
because we are not explicitly checking for each and every sizuation
where a bounds check could fail, a negative size is sent to a classlib
method that then throws an IllegalArgumentException or whatnot (even a
NullPointerException may escape us every now and then).

Uncaught RuntimeExceptions are considered security issued by some tools
because of a potential DoS attack. Historically we have never agreed
with this point of view and I'm not suggesting to change that.

Even though we may not know what is wrong, when the RuntimeException
occurs, we do know the archive is broken and this is the reason for the
exception.

AFAICS there are two ways we can deal with it:

(1) every method that reads from the archive declares it can throw
    arbitrary RuntimeExceptions as well. And we document that broken
    archives may cause RuntimeExceptions and that we never consider such
    a case a security issue.

(2) we catch RuntimeExceptions at every method that reads from the
    archive and wrap them in a custom IOException, making sure such a
    case can never escape us.

Personally I prefer (2) but can live with (1) - I've suggested something
along the lines of (2) in [1] and it seemed Gilles was opposed to this
idea (and Matt was torn).

In [2] Bernd seemed to support (2).

Are there any other opinions?

Stefan

[1] https://lists.apache.org/thread.html/r5d2427566dff4c7d293e8d48f9ac62b7958d19047f730836ce5b3c60%40%3Cdev.commons.apache.org%3E

[2] https://lists.apache.org/thread.html/r3ce77eb9ab9429097ca57c48cb99b8be497ee5b69d419b52a6722616%40%3Cdev.commons.apache.org%3E

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-06-29, Stefan Bodewig wrote:

> Options raised during the thread:

> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException

+1

> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

+0

> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

-0

> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

+0

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
Le mer. 30 juin 2021 à 02:32, Bernd Eckenfels <ec...@zusammenkunft.net> a écrit :
>
> I vote for the subclass of IOException - if you care about handling broken archives you would catch it,

Yes, but you can catch a runtime exception all the same.
An application that doesn't want to terminate abruptly must catch
all exceptions at some point.
The distinction between checked and unchecked is related to the
question of "recoverable or not", not about "handling or not".

> if you don’t care you have to handle IO problems anyway.

?

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Bernd Eckenfels <ec...@zusammenkunft.net>.
I vote for the subclass of IOException - if you care about handling broken archives you would catch it, if you don’t care you have to handle IO problems anyway.

Gruß
Bernd

--
https://Bernd.eckenfels.net
________________________________
From: Gilles Sadowski <gi...@gmail.com>
Sent: Wednesday, June 30, 2021 12:33:16 AM
To: Commons Developers List <de...@commons.apache.org>
Subject: Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Le mar. 29 juin 2021 à 22:24, Stefan Bodewig <bo...@apache.org> a écrit :
>
> Hi
>
> I'm sorry, but I'm unable to see what would or would not work for the
> people who chimed in. Short of calling for a vote, lets try with a poll
> that could show whether there is some sort of solution that is
> acceptable to everybody.
>
> Please use +1 to mean "I like this option", +0 to mean "the option is
> OK, but I'd prefer a different one", -0 for "I don't like the option but
> I can live with it" and -1 for "this option is not acceptable to me.
>
> Options raised during the thread:
>
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException

-1
[As a matter of principle, we should not force users to handle checked
exceptions; but, as said previously, it can be mandated by design, .e.g.
the (strange) promise that no other exception can arise from calling the
library.]

>
> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

-1
[Same reason (it's a variant of the above).]
But how will you choose which should be caught or allowed to fly?

>
> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

+0
 A sane solution that can satisfy two categories of users:
  (A) those who want to easily catch everything that comes out, and
  (B) those who want to make sure that no error can go unnoticed.

>
> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

+1

>
> "Just harden all parsers" is a variation of (4) in my view as I don't
> believe we would manage to cover all cases no matter how hard and long
> we try.
>
> I hope I didn't overlook any.
>
> Thanks
>
>         Stefan
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
Le mar. 29 juin 2021 à 22:24, Stefan Bodewig <bo...@apache.org> a écrit :
>
> Hi
>
> I'm sorry, but I'm unable to see what would or would not work for the
> people who chimed in. Short of calling for a vote, lets try with a poll
> that could show whether there is some sort of solution that is
> acceptable to everybody.
>
> Please use +1 to mean "I like this option", +0 to mean "the option is
> OK, but I'd prefer a different one", -0 for "I don't like the option but
> I can live with it" and -1 for "this option is not acceptable to me.
>
> Options raised during the thread:
>
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException

-1
[As a matter of principle, we should not force users to handle checked
exceptions; but, as said previously, it can be mandated by design, .e.g.
the (strange) promise that no other exception can arise from calling the
library.]

>
> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

-1
[Same reason (it's a variant of the above).]
But how will you choose which should be caught or allowed to fly?

>
> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

+0
 A sane solution that can satisfy two categories of users:
  (A) those who want to easily catch everything that comes out, and
  (B) those who want to make sure that no error can go unnoticed.

>
> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

+1

>
> "Just harden all parsers" is a variation of (4) in my view as I don't
> believe we would manage to cover all cases no matter how hard and long
> we try.
>
> I hope I didn't overlook any.
>
> Thanks
>
>         Stefan
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Oliver Heger <ol...@oliver-heger.de>.

Am 30.06.21 um 14:41 schrieb Gary Gregory:
> On Tue, Jun 29, 2021 at 4:24 PM Stefan Bodewig <bo...@apache.org> wrote:
>>
>> Hi
>>
>> I'm sorry, but I'm unable to see what would or would not work for the
>> people who chimed in. Short of calling for a vote, lets try with a poll
>> that could show whether there is some sort of solution that is
>> acceptable to everybody.
>>
>> Please use +1 to mean "I like this option", +0 to mean "the option is
>> OK, but I'd prefer a different one", -0 for "I don't like the option but
>> I can live with it" and -1 for "this option is not acceptable to me.
>>
>> Options raised during the thread:
>>
>> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>>      subclass) and throw the IOException
> 
> -1. There will no end to this and our code will be riddled and
> cluttered with try-catch clauses.
> 
>>
>> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>>      IOException (possibly a subclass) and throw the IOException - allow
>>      the remaining RuntimeExceptions to fly through
> 
> -0. Selective implementation might be OK but I prefer throwing
> unchecked exceptions for unrecoverable errors like a negative index.
> 
>>
>> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>>      exception (which one could be discussed later) and throw this one
> 
> -1. That one does not make sense to me.
> 
>>
>> (4) don't catch RuntimeExceptions at all, just document broken archives
>>      can cause arbitrary RuntimeExceptions and code that tries to read
>>      archives from untrusted sources is expected to deal with them
>>      itself.
> 
> +0.
> 
> Here what I would prefer to see on our APIs:
> 
> - Throw IOException where we surface JRE IOException.
> - Throw NPE where we use methods like Objects.requireNonNull().
> - Throw IllegalArgumentException on other garbage API input arguments.
> - Throw IllegalStateException when the archive is broken.

IIUC, the problem is that [compress] states that it throws an 
IOException, but in practice can throw arbitrary runtime exceptions as 
well. So IMHO, it would be a great improvement for the users to come to 
a definition under which circumstances what outcome is expected.

This will probably require some kind of translation from exceptions 
thrown during parsing to the exceptions declared by [compress]. How such 
a translation then looks like, is certainly a matter of taste. As the 
API currently uses IOExceptions anyway, using sub classes of this type 
would be fine with me.

In any case, I think we should implement some translation and not let 
all kinds of unexpected exceptions fall on our users.

Oliver

> 
> Gary
> 
>> "Just harden all parsers" is a variation of (4) in my view as I don't
>> believe we would manage to cover all cases no matter how hard and long
>> we try.
>>
>> I hope I didn't overlook any.
>>
>> Thanks
>>
>>          Stefan
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gary Gregory <ga...@gmail.com>.
On Tue, Jun 29, 2021 at 4:24 PM Stefan Bodewig <bo...@apache.org> wrote:
>
> Hi
>
> I'm sorry, but I'm unable to see what would or would not work for the
> people who chimed in. Short of calling for a vote, lets try with a poll
> that could show whether there is some sort of solution that is
> acceptable to everybody.
>
> Please use +1 to mean "I like this option", +0 to mean "the option is
> OK, but I'd prefer a different one", -0 for "I don't like the option but
> I can live with it" and -1 for "this option is not acceptable to me.
>
> Options raised during the thread:
>
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException

-1. There will no end to this and our code will be riddled and
cluttered with try-catch clauses.

>
> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

-0. Selective implementation might be OK but I prefer throwing
unchecked exceptions for unrecoverable errors like a negative index.

>
> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

-1. That one does not make sense to me.

>
> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

+0.

Here what I would prefer to see on our APIs:

- Throw IOException where we surface JRE IOException.
- Throw NPE where we use methods like Objects.requireNonNull().
- Throw IllegalArgumentException on other garbage API input arguments.
- Throw IllegalStateException when the archive is broken.

Gary

> "Just harden all parsers" is a variation of (4) in my view as I don't
> believe we would manage to cover all cases no matter how hard and long
> we try.
>
> I hope I didn't overlook any.
>
> Thanks
>
>         Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
Le mer. 30 juin 2021 à 11:55, PeterLee <pe...@apache.org> a écrit :
>
> > (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
> >    subclass) and throw the IOException
>
> -1
>
> I agree with sebb about this option - this may accidentally convert a bug
> into
> something else.
>
> > (2) catch only a subset of all RuntimeExceptions, wrap them in an
> >     IOException (possibly a subclass) and throw the IOException - allow
> >     the remaining RuntimeExceptions to fly through
>
> +1
>
> It may take a lot of work to get the subset, but I think the work is worthy.

IIUC what has been said, the main "problem" is that (some) callers expect
a single exception (whatever the encountered issue with the input) out of
[Compress].  This option does not do them any good.
The task is ill-defined as long as the "subset" remains undefined.  And
furthermore, the same low-level exception could arise from either a bug or
a corrupt input (so, depending which case is uncovered first, bugs could
still go unnoticed).

> > (3) catch all RuntimeExceptions, wrap them in an specific unchecked
> >     exception (which one could be discussed later) and throw this one
>
> -1
>
> > (4) don't catch RuntimeExceptions at all, just document broken archives
> >     can cause arbitrary RuntimeExceptions and code that tries to read
> >     archives from untrusted sources is expected to deal with them
> >     itself.
>
> -0
>
>
> > [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by PeterLee <pe...@apache.org>.
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>    subclass) and throw the IOException

-1

I agree with sebb about this option - this may accidentally convert a bug
into
something else.

> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

+1

It may take a lot of work to get the subset, but I think the work is worthy.

> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

-1

> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

-0


On Wed, Jun 30, 2021 at 4:24 AM Stefan Bodewig <bo...@apache.org> wrote:

> Hi
>
> I'm sorry, but I'm unable to see what would or would not work for the
> people who chimed in. Short of calling for a vote, lets try with a poll
> that could show whether there is some sort of solution that is
> acceptable to everybody.
>
> Please use +1 to mean "I like this option", +0 to mean "the option is
> OK, but I'd prefer a different one", -0 for "I don't like the option but
> I can live with it" and -1 for "this option is not acceptable to me.
>
> Options raised during the thread:
>
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException
>
> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through
>
> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one
>
> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.
>
> "Just harden all parsers" is a variation of (4) in my view as I don't
> believe we would manage to cover all cases no matter how hard and long
> we try.
>
> I hope I didn't overlook any.
>
> Thanks
>
>         Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by sebb <se...@gmail.com>.
On Tue, 29 Jun 2021 at 21:24, Stefan Bodewig <bo...@apache.org> wrote:
>
> Hi
>
> I'm sorry, but I'm unable to see what would or would not work for the
> people who chimed in. Short of calling for a vote, lets try with a poll
> that could show whether there is some sort of solution that is
> acceptable to everybody.
>
> Please use +1 to mean "I like this option", +0 to mean "the option is
> OK, but I'd prefer a different one", -0 for "I don't like the option but
> I can live with it" and -1 for "this option is not acceptable to me.
>
> Options raised during the thread:
>
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException

-0

Too easy to accidentally convert a bug into something that looks like
an input error.

> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through

+1

I think we should catch known RTEs and wrap with an IEO giving the
probable cause.

This may take a few iterations to catch them all, but the fuzzer
should help here.

> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one

-1

> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

-0

> "Just harden all parsers" is a variation of (4) in my view as I don't
> believe we would manage to cover all cases no matter how hard and long
> we try.
>
> I hope I didn't overlook any.
>
> Thanks
>
>         Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Matt Juntunen <ma...@gmail.com>.
Adding my two cents here...

It's very important to me as a user and developer to have informative
errors. If I use compress in my application and a raw NPE or out of bounds
exception flies out of the code with little to no context and makes it out
to the end user, then I am going to be getting a call as to why my code is
broken. It is also going to take me quite a while to track down the source
of the error and will most likely involve me digging into the compress
source code at some point. On the other hand, if the user receives an
exception like Stefan's UnhandledInputException [1] or a
CorruptedArchiveException, then they may try to check the archive itself
first and not even call me, and even if they do call, I'll be able to
troubleshoot much more quickly. That is why I think Stefan's PR (or
something similar) is the best way to go: it adds useful information to the
error that would be very difficult to obtain otherwise.

Regards,
Matt J

[1]
https://github.com/apache/commons-compress/compare/catch-RuntimeExceptions

On Thu, Jul 1, 2021 at 6:04 PM Gilles Sadowski <gi...@gmail.com> wrote:

> > [...]
> > > One is "sorry, this is an invalid archive" while the other is
> "something
> > > > went wrong reading the archive - printStackTrace". If that
> distinction is
> > > > not important to you - so be it. To me that's a big difference.
> > >
> > > Never said such a thing.
> > >
> >
> > Not sure what "thing" you mean exactly but In the very same email you
> wrote:
> >
> > "I don't see why the library should be expected to explain in details
> > why an archive is corrupt; it is good enough that it can detect it..."
>
> Cases in point (example provided by Stefan) is that exceptions are
> raised by bad input, but the exact reason is not always known because
> the processing could continue for some time before the optimistic
> assumption turned out to be wrong.
> For sure, it is better to provide the actual root cause of the failure but
> Stefan indicated that reviewing all the codes would be a lot of work.
> And IIUC, we are talking about the current situation, not the ideal one
> (where the caller could act based on accurate information).
>
> IOW still, what I meant by the above is that this pseudo-code (I have
> no idea how the [Compress] API looks like):
> ---CUT---
> String foo = "This is probably invalid ZIP data";
> String bar = compress.unzip(foo);
> ---CUT---
> could throw a plain IAE, without being required to give an explanation
> of why/how/where the library figured out that something was wrong
> with the input.  Providing wrong input is a programming error on the
> caller's part.  It's his responsibility to protect against his mistakes.
>
> > > The "sorry, this is an invalid archive" is what makes the situation
> > > > for me "recoverable".
> > >
> > > This is plain wrong:
> >
> >
> > I am not drawing that straight line to preconditions.
> > Flows or processes can also have the attribute of "recoverable".
>
> Not [Compress] responsibility.
>
> > [...]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
> [...]
> > One is "sorry, this is an invalid archive" while the other is "something
> > > went wrong reading the archive - printStackTrace". If that distinction is
> > > not important to you - so be it. To me that's a big difference.
> >
> > Never said such a thing.
> >
>
> Not sure what "thing" you mean exactly but In the very same email you wrote:
>
> "I don't see why the library should be expected to explain in details
> why an archive is corrupt; it is good enough that it can detect it..."

Cases in point (example provided by Stefan) is that exceptions are
raised by bad input, but the exact reason is not always known because
the processing could continue for some time before the optimistic
assumption turned out to be wrong.
For sure, it is better to provide the actual root cause of the failure but
Stefan indicated that reviewing all the codes would be a lot of work.
And IIUC, we are talking about the current situation, not the ideal one
(where the caller could act based on accurate information).

IOW still, what I meant by the above is that this pseudo-code (I have
no idea how the [Compress] API looks like):
---CUT---
String foo = "This is probably invalid ZIP data";
String bar = compress.unzip(foo);
---CUT---
could throw a plain IAE, without being required to give an explanation
of why/how/where the library figured out that something was wrong
with the input.  Providing wrong input is a programming error on the
caller's part.  It's his responsibility to protect against his mistakes.

> > The "sorry, this is an invalid archive" is what makes the situation
> > > for me "recoverable".
> >
> > This is plain wrong:
>
>
> I am not drawing that straight line to preconditions.
> Flows or processes can also have the attribute of "recoverable".

Not [Compress] responsibility.

> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
> > So IllegalArgumentException is not from the early Java days?
>
> It is, so what?
> It's the exception I use 99% of the time (to signal failed precondition)?
> It's a runtime exception.
>

When I quote standard exceptions you said:

"if you quote design decisions that date back from the early Java days"


> There should be a difference of action when the code knows the archive is
> > broken in contrast there being a problem with the code reading it (for
> > whatever reason).
>
> The difference does not entail the checked vs unchecked selection.
>

While I'd prefer "Optional", given the context of backward compatibility
and expectations I believe it does.


> One is "sorry, this is an invalid archive" while the other is "something
> > went wrong reading the archive - printStackTrace". If that distinction is
> > not important to you - so be it. To me that's a big difference.
>
> Never said such a thing.
>

Not sure what "thing" you mean exactly but In the very same email you wrote:

"I don't see why the library should be expected to explain in details
why an archive is corrupt; it is good enough that it can detect it..."


> The "sorry, this is an invalid archive" is what makes the situation
> > for me "recoverable".
>
> This is plain wrong:


I am not drawing that straight line to preconditions.
Flows or processes can also have the attribute of "recoverable".


I don't see why the library should be expected to explain in details
> why an archive is corrupt; it is good enough that it can detect it, and
> very good if it can protect the caller from himself


If you don't see the quality difference between "this archive is invalid"
and "could not read the archive" there is really no point to make here.

I think this has again gotten to the point of no longer being useful for
the wider audience.
So never mind - I'll go back to lurking.

cheers,
Torsten

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
Le jeu. 1 juil. 2021 à 11:31, Torsten Curdt <tc...@vafer.org> a écrit :
>
> > I'm not sure what you refer to exactly,
>
>
> The various links from
>
>   https://docs.oracle.com/javase/7/docs/api/java/io/IOException.html
>
> or even
>
>   https://docs.oracle.com/javase/7/docs/api/java/lang/Exception.html
>
>
> > but I noted already that not all
> > references are equal, especially if you quote design decisions that date
> > back from the early Java days (when every textbook touted checked
> > exceptions as _the_ solution to robust programming).
> >
>
> So IllegalArgumentException is not from the early Java days?

It is, so what?
It's the exception I use 99% of the time (to signal failed precondition)?
It's a runtime exception.

My point, all along, is that I have a hard time finding a single proper
use of checked exceptions (in libraries such as we deal with here),
not counting the cases where one was right, at the time, to mimic
JDK usage of checked exceptions.
I said from the outset of this discussion that if the purpose is to stay
true to the initial design decisions, it may be OK to continue with
"IOException".
As you indicate below, the currently advertised way to handle
recovery is with "Optional".

>
> > ...
> > Ask yourself how the programmer will handle the exception ? If the
> > programmer
> > can’t do better than e.printStackTrace() or throw new AssertionError(),
> > then, an
> > unchecked exception is called for.
> >
>
> And I guess that's the point - the programmer can do better.
>
> There should be a difference of action when the code knows the archive is
> broken in contrast there being a problem with the code reading it (for
> whatever reason).

The difference does not entail the checked vs unchecked selection.

>
> One is "sorry, this is an invalid archive" while the other is "something
> went wrong reading the archive - printStackTrace". If that distinction is
> not important to you - so be it. To me that's a big difference.

Never said such a thing.

> The "sorry, this is an invalid archive" is what makes the situation
> for me "recoverable".

This is plain wrong: From the POV of (any) library, failure to satisfy
a precondition is not recoverable, by definition of "precondition".
[Here there is often no simple check, but that does not change the
principle.]

I don't see why the library should be expected to explain in details
why an archive is corrupt; it is good enough that it can detect it, and
very good if it can protect the caller from himself (by throwing *any*
exception rather than crashing the JVM or giving root access. ;-)

> For me "recoverable" does not mean running the same code with the same
> input twice and expecting a different result.

In effect this is exactly what one does in the only case I know of,
where a checked exception may be appropriate (as per the two
conditions mentioned earlier), that is when resources are requested
that may be temporarily unavailable.

Gilles

> If you continue to read your link to "Effective Java", the use of
> "Optional" is suggested as the proper alternative to a checked exception.
> And to be honest I'd be all for that - on a new code base. But you'd still
> need figure out the "canCompute" to make the distinction between the two
> states.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
> I'm not sure what you refer to exactly,


The various links from

  https://docs.oracle.com/javase/7/docs/api/java/io/IOException.html

or even

  https://docs.oracle.com/javase/7/docs/api/java/lang/Exception.html


> but I noted already that not all
> references are equal, especially if you quote design decisions that date
> back from the early Java days (when every textbook touted checked
> exceptions as _the_ solution to robust programming).
>

So IllegalArgumentException is not from the early Java days?



> ...
> Ask yourself how the programmer will handle the exception ? If the
> programmer
> can’t do better than e.printStackTrace() or throw new AssertionError(),
> then, an
> unchecked exception is called for.
>

And I guess that's the point - the programmer can do better.

There should be a difference of action when the code knows the archive is
broken in contrast there being a problem with the code reading it (for
whatever reason).

One is "sorry, this is an invalid archive" while the other is "something
went wrong reading the archive - printStackTrace". If that distinction is
not important to you - so be it. To me that's a big difference.

The "sorry, this is an invalid archive" is what makes the situation
for me "recoverable".
For me "recoverable" does not mean running the same code with the same
input twice and expecting a different result.

If you continue to read your link to "Effective Java", the use of
"Optional" is suggested as the proper alternative to a checked exception.
And to be honest I'd be all for that - on a new code base. But you'd still
need figure out the "canCompute" to make the distinction between the two
states.

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gilles Sadowski <gi...@gmail.com>.
Le jeu. 1 juil. 2021 à 02:27, Torsten Curdt <tc...@vafer.org> a écrit :
>
> >
> > "If there is runtime exception there is a bug in the code"
> >
> > I don't think that's correct because IllegalArgumentException is a
> > RuntimeException.
>
>
> I have a hard time following that causality. The way I've seen this
> exception used mainly is in case of program errors, not in case of input
> validations. Similar to
>
>
> https://docs.oracle.com/javase/7/docs/api/java/lang/IllegalStateException.html
>
> It's interesting that the IllegalArgumentException from the java API is
> meant to prove a point while the ones I referenced didn't seem to count ;)

I'm not sure what you refer to exactly, but I noted already that not all
references are equal, especially if you quote design decisions that date
back from the early Java days (when every textbook touted checked
exceptions as _the_ solution to robust programming).

From[1] J. Bloch's "Effective Java" (3rd ed.):
---CUT---
Item 71 : Avoid unnecessary use of checked exceptions

Methods throwing checked exceptions can’t be used directly in streams.
An unchecked exception use is appropriate unless both of these conditions
are met :

 * The exceptional condition cannot be prevented by proper use of the API
 * The programmer using the API can take some useful action once confronted
with the exception

Ask yourself how the programmer will handle the exception ? If the programmer
can’t do better than e.printStackTrace() or throw new AssertionError(), then, an
unchecked exception is called for.
---CUT---

Case discussed here obviously fails the first condition: Passing a non-corrupt
archive would not trigger an exception.[2]
In my view of what is "recoverable", the second condition also fails (but YMMV):
If the library throws an exception because it cannot make sense of the archive,
the caller still has no clue how to recover (except trying again by
assuming that
the error will automagically fix itself).

Gilles

[1] Copied from
    https://ahdak.github.io/blog/effective-java-part-9/
(contains more items about exception handling).
[2] Unless there is a bug in [Compress], which also mandates a runtime exception
(to be reported here in order to be fixed).

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
>
> "If there is runtime exception there is a bug in the code"
>
> I don't think that's correct because IllegalArgumentException is a
> RuntimeException.


I have a hard time following that causality. The way I've seen this
exception used mainly is in case of program errors, not in case of input
validations. Similar to


https://docs.oracle.com/javase/7/docs/api/java/lang/IllegalStateException.html

It's interesting that the IllegalArgumentException from the java API is
meant to prove a point while the ones I referenced didn't seem to count ;)

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Gary Gregory <ga...@gmail.com>.
"If there is runtime exception there is a bug in the code"

I don't think that's correct because IllegalArgumentException is a
RuntimeException. See
https://docs.oracle.com/javase/8/docs/api/java/lang/IllegalArgumentException.html
Gary

On Wed, Jun 30, 2021, 16:50 Torsten Curdt <tc...@vafer.org> wrote:

> >
> > It is not the job of commons-compress to stop you from using a
> > corrupt archive. If you choose to do so, then ...
> >
>
> I don't think that's what it is.
> That would be more like getting an exception and continuing walking through
> the archive.
>
> If there is a parse exception there is a bug in the input.
> If there is runtime exception there is a bug in the code.
>
> If the user does a try/catch for a runtime exception there is no indication
> which of the two it is.
>
> I personally find it a shame if we were to blur those lines.
>

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
>
> It is not the job of commons-compress to stop you from using a
> corrupt archive. If you choose to do so, then ...
>

I don't think that's what it is.
That would be more like getting an exception and continuing walking through
the archive.

If there is a parse exception there is a bug in the input.
If there is runtime exception there is a bug in the code.

If the user does a try/catch for a runtime exception there is no indication
which of the two it is.

I personally find it a shame if we were to blur those lines.

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Tue, Jun 29, 2021 at 10:24 PM Stefan Bodewig <bo...@apache.org> wrote:

> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.

Strongly in favour of this one. It is the obvious implementation
strategy. If a user gets an Exception, and uses the archive anyways,
think of the old Unix saying:

    It is not UNIX’s job to stop you from shooting your foot. If you
so choose to do so, then it is UNIX’s job to deliver Mr. Bullet to Mr
Foot in the most efficient way it knows.

Translated:

   It is not the job of commons-compress to stop you from using a
corrupt archive. If you choose to do so, then ...

Jochen

-- 

Look, that's why there's rules, understand? So that you think before
you break 'em.

    -- (Terry Pratchett, Thief of Time)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
>     subclass) and throw the IOException
>

+0



> (2) catch only a subset of all RuntimeExceptions, wrap them in an
>     IOException (possibly a subclass) and throw the IOException - allow
>     the remaining RuntimeExceptions to fly through
>

+1


>
> (3) catch all RuntimeExceptions, wrap them in an specific unchecked
>     exception (which one could be discussed later) and throw this one
>

-1


>
> (4) don't catch RuntimeExceptions at all, just document broken archives
>     can cause arbitrary RuntimeExceptions and code that tries to read
>     archives from untrusted sources is expected to deal with them
>     itself.
>

-1

Re: [compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
> >> That certainly doesn't prevent anybody else from trying to find a
> >> compromise :-)
>
> > It feels like Optionals could be a compromise.
>
> I must admit I've lost track of the later discussion threads. If you
> mean that we'd return Optional<> results, this would become an entirely
> different API.
>

It certainly does. And I personally don't think it's the best course of
action at this stage - unless someone is eager to work on a new major
version. But if you look for a compromise it seem to be the only one
acceptable to both "camps".


I'd very much like us to get to a compromise that helps our users and
> doesn't force us to randomly fix RuntimeExceptions that slipped through
> overly optimistic parser code.
>

Well, I am fine with the solution you proposed :)

cheers,
Torsten

Re: [compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-07-01, Torsten Curdt wrote:

>> That certainly doesn't prevent anybody else from trying to find a
>> compromise :-)

> It feels like Optionals could be a compromise.

I must admit I've lost track of the later discussion threads. If you
mean that we'd return Optional<> results, this would become an entirely
different API.

I'd very much like us to get to a compromise that helps our users and
doesn't force us to randomly fix RuntimeExceptions that slipped through
overly optimistic parser code.

So if you want to explore the Optionals idea further this would be
great, but I doubt enough people have seen it.

Thanks

        Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

Posted by Torsten Curdt <tc...@vafer.org>.
> That certainly doesn't prevent anybody else from trying to find a
> compromise :-)
>

It feels like Optionals could be a compromise.

[compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

Posted by Stefan Bodewig <bo...@apache.org>.
Hi all

there isn't a single option that hasn't at least received two -1s with
eight people indicating their preference. So neither option seems to be
an option that could lead to a compromise.

With this I run out of ideas and will rest my case and not try to find a
generic solution - but rather try to get 1.21 out with no changes in
this area.

That certainly doesn't prevent anybody else from trying to find a
compromise :-)

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


[compress] [Poll] Dealing with uncaught RuntimeExceptions

Posted by Stefan Bodewig <bo...@apache.org>.
Hi

I'm sorry, but I'm unable to see what would or would not work for the
people who chimed in. Short of calling for a vote, lets try with a poll
that could show whether there is some sort of solution that is
acceptable to everybody.

Please use +1 to mean "I like this option", +0 to mean "the option is
OK, but I'd prefer a different one", -0 for "I don't like the option but
I can live with it" and -1 for "this option is not acceptable to me.

Options raised during the thread:

(1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
    subclass) and throw the IOException

(2) catch only a subset of all RuntimeExceptions, wrap them in an
    IOException (possibly a subclass) and throw the IOException - allow
    the remaining RuntimeExceptions to fly through

(3) catch all RuntimeExceptions, wrap them in an specific unchecked
    exception (which one could be discussed later) and throw this one

(4) don't catch RuntimeExceptions at all, just document broken archives
    can cause arbitrary RuntimeExceptions and code that tries to read
    archives from untrusted sources is expected to deal with them
    itself.

"Just harden all parsers" is a variation of (4) in my view as I don't
believe we would manage to cover all cases no matter how hard and long
we try.

I hope I didn't overlook any. 

Thanks

        Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Torsten Curdt <tc...@vafer.org>.
I personally think that we should look at it from the point of view of the
API consumer.
What would they want to happen in case of an error reading an archive?

IMO an IOException.
How we get to that exception on the implementation side is another matter.

If we could check all conditions and throw "legitimate" IO exceptions - it
would be perfect. Since that's unfortunately not feasible for the time
being, wrapping a runtime exceptions sounds like the most pragmatic and
future proof approach to me.

So I am in favor of 2)

cheers,
Torsten

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by "Bruno P. Kinoshita" <ki...@apache.org>.
 
+1 for 2)

In [Imaging] we do something similar, but it was easier because there were many methods that were already doing it (throwing ImageReadException or ImageWriteException).


Bruno


    On Sunday, 27 June 2021, 11:03:31 pm NZST, Stefan Bodewig <bo...@apache.org> wrote:  
 
 Hi

I'd like to get closure on which approach we want to take.

When we read a broken archive it may trigger arbitrary RuntimeExceptions
because we are not explicitly checking for each and every sizuation
where a bounds check could fail, a negative size is sent to a classlib
method that then throws an IllegalArgumentException or whatnot (even a
NullPointerException may escape us every now and then).

Uncaught RuntimeExceptions are considered security issued by some tools
because of a potential DoS attack. Historically we have never agreed
with this point of view and I'm not suggesting to change that.

Even though we may not know what is wrong, when the RuntimeException
occurs, we do know the archive is broken and this is the reason for the
exception.

AFAICS there are two ways we can deal with it:

(1) every method that reads from the archive declares it can throw
    arbitrary RuntimeExceptions as well. And we document that broken
    archives may cause RuntimeExceptions and that we never consider such
    a case a security issue.

(2) we catch RuntimeExceptions at every method that reads from the
    archive and wrap them in a custom IOException, making sure such a
    case can never escape us.

Personally I prefer (2) but can live with (1) - I've suggested something
along the lines of (2) in [1] and it seemed Gilles was opposed to this
idea (and Matt was torn).

In [2] Bernd seemed to support (2).

Are there any other opinions?

Stefan

[1] https://lists.apache.org/thread.html/r5d2427566dff4c7d293e8d48f9ac62b7958d19047f730836ce5b3c60%40%3Cdev.commons.apache.org%3E

[2] https://lists.apache.org/thread.html/r3ce77eb9ab9429097ca57c48cb99b8be497ee5b69d419b52a6722616%40%3Cdev.commons.apache.org%3E

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

  

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Matt Sicker <bo...@gmail.com>.
Checked exceptions are also used when the error isn’t a programmer error.
From an aesthetic perspective, I prefer the unchecked exceptions unless an
API already established them. Subclassing IOException is fairly common for
example.

On Sun, Jun 27, 2021 at 10:37 Torsten Curdt <tc...@vafer.org> wrote:

> > Can you expand on that - I don't understand the horror or the futility.
>
> >
> > The futility is because it is wrong for the caller to expect that no
> > runtime exception can be thrown.
> >
>
> Based on that premise we could also just forget about checked exceptions
> altogether.
>
> I think the distinction is that runtime exceptions are more for unexpected
> errors.
> Parsing an archive I personally don't see in that realm.
>
>
> > > It certainly is a bit of code smell - but only because there is no
> quick
> > > way to harden the implementations.
> >
> > Maybe I do not follow what you mean, but there is no way to "harden":
> > if the input is garbage, it cannot be processed and _some_ exception
> > will result.
> >
>
> If there are checks in place, the implementation could throw a checked
> exception.
> Harden as in adding those checks.
>
> > But I rather have a clear and stable API and fix the the implementation
> > > along the way than to leak the problems to the API consumer and just
> > > document it.
> >
> > Signalling that a problem with the input was detected is not leaking
> > since the problem was the API caller/consumer in the first place.
> > [Fixing library bugs is not what is being discussed AFAICT.]
> >
>
> I'd argue that signaling this problem should be a checked exception.
> IMO this provides a clearer contract to the user.
>
> I guess it's all about managing expectations.
> If I throw garbage at a parser I'd not expect a runtime exception but a
> ParseException.
> If it does throw a runtime exception instead it is leaking the problem as
> invalid input is not something the caller can even check beforehand.
>

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gilles Sadowski <gi...@gmail.com>.
Le dim. 27 juin 2021 à 19:05, Torsten Curdt <tc...@vafer.org> a écrit :
>> [...]
> > > I'd argue that signaling this problem should be a checked exception.
> > > IMO this provides a clearer contract to the user.
> >
> > It doesn't.  The user would have a false sense of security believing so.
> >
>
> I guess I disagree there - and so seem the authors of the JDK.

At the time of Java 1 surely so.
Today, you shouldn't bet on it.

> But that's fine - I just wanted to give my 2 cents :)
> I don't have enough stake anymore to keep this discussion going.

This discussion happened some 15 (?) years ago for Commons Math.
All were removed because they were just PITA (i.e. no believer in
checked exceptions could demonstrate the slightest usefulness).
I've read somewhere that Java was the only language having this
duality.  Right against everyone else?

>
> cheers,
> Torsten

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Matt Sicker <bo...@gmail.com>.
Perhaps the key point here is throwing a more specific exception than
RuntimeException? Even if it's a subclass of it. Adding the javadocs
for which exceptions are allowed to be thrown might be sufficient to
cover the DoS attacks.

On Sun, Jun 27, 2021 at 12:05 PM Torsten Curdt <tc...@vafer.org> wrote:
>
> >
> > > Based on that premise we could also just forget about checked exceptions
> > > altogether.
> >
> > As Gary said (as Joshua Bloch said, as many people said), checked
> > exceptions are for recoverable errors.
> >
>
> Maybe it boils down to the definition of "recoverable".
>
>
> > Parsing an archive I personally don't see in that realm.
> >
> > Even if the archive is just garbage (or any non-supported file format)?
> >
>
> Especially then. Similar to
>
>
> https://docs.oracle.com/javase/7/docs/api/java/net/MalformedURLException.html
>
>
> Comparing all the subclasses of
>
>   https://docs.oracle.com/javase/7/docs/api/java/lang/RuntimeException.html
>
> vs
>
>   https://docs.oracle.com/javase/7/docs/api/java/io/IOException.html
>
> I do see this case fit naturally as a subclass of IOException.
>
>
>
> > > I'd argue that signaling this problem should be a checked exception.
> > > IMO this provides a clearer contract to the user.
> >
> > It doesn't.  The user would have a false sense of security believing so.
> >
>
> I guess I disagree there - and so seem the authors of the JDK.
> But that's fine - I just wanted to give my 2 cents :)
> I don't have enough stake anymore to keep this discussion going.
>
> cheers,
> Torsten

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Torsten Curdt <tc...@vafer.org>.
>
> > Based on that premise we could also just forget about checked exceptions
> > altogether.
>
> As Gary said (as Joshua Bloch said, as many people said), checked
> exceptions are for recoverable errors.
>

Maybe it boils down to the definition of "recoverable".


> Parsing an archive I personally don't see in that realm.
>
> Even if the archive is just garbage (or any non-supported file format)?
>

Especially then. Similar to


https://docs.oracle.com/javase/7/docs/api/java/net/MalformedURLException.html


Comparing all the subclasses of

  https://docs.oracle.com/javase/7/docs/api/java/lang/RuntimeException.html

vs

  https://docs.oracle.com/javase/7/docs/api/java/io/IOException.html

I do see this case fit naturally as a subclass of IOException.



> > I'd argue that signaling this problem should be a checked exception.
> > IMO this provides a clearer contract to the user.
>
> It doesn't.  The user would have a false sense of security believing so.
>

I guess I disagree there - and so seem the authors of the JDK.
But that's fine - I just wanted to give my 2 cents :)
I don't have enough stake anymore to keep this discussion going.

cheers,
Torsten

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gilles Sadowski <gi...@gmail.com>.
Hi.

Le dim. 27 juin 2021 à 17:37, Torsten Curdt <tc...@vafer.org> a écrit :
>
> > Can you expand on that - I don't understand the horror or the futility.
>
> >
> > The futility is because it is wrong for the caller to expect that no
> > runtime exception can be thrown.
> >
>
> Based on that premise we could also just forget about checked exceptions
> altogether.

As Gary said (as Joshua Bloch said, as many people said), checked
exceptions are for recoverable errors.

> I think the distinction is that runtime exceptions are more for unexpected
> errors.

Agreed.

> Parsing an archive I personally don't see in that realm.

Even if the archive is just garbage (or any non-supported file format)?

> > > It certainly is a bit of code smell - but only because there is no quick
> > > way to harden the implementations.
> >
> > Maybe I do not follow what you mean, but there is no way to "harden":
> > if the input is garbage, it cannot be processed and _some_ exception
> > will result.
> >
>
> If there are checks in place, the implementation could throw a checked
> exception.
> Harden as in adding those checks.

The question is the rationale for choosing checked vs unchecked
exceptions.

> > But I rather have a clear and stable API and fix the the implementation
> > > along the way than to leak the problems to the API consumer and just
> > > document it.
> >
> > Signalling that a problem with the input was detected is not leaking
> > since the problem was the API caller/consumer in the first place.
> > [Fixing library bugs is not what is being discussed AFAICT.]
> >
>
> I'd argue that signaling this problem should be a checked exception.
> IMO this provides a clearer contract to the user.

It doesn't.  The user would have a false sense of security believing so.

> I guess it's all about managing expectations.
> If I throw garbage at a parser I'd not expect a runtime exception but a
> ParseException.

That's acceptable if the rationale is to behave like the JDK does
in circumstances deemed similar.
[But I still think that the one-rule of "recoverable or not" is
easier to follow consistently.]

> If it does throw a runtime exception instead it is leaking the problem as
> invalid input is not something the caller can even check beforehand.

In this particular case, it's true that we are on the fence for that very
reason.  It is however no less true that the error is not recoverable
from the POV of the Commons library.  For example, if it _could_
be checked beforehand that the input is wrong, we should (IMO)
throw an IAE; hence it is indeed inconsistent that depending on
how/when the error is discovered, the exception would be different.

But as said previously, it could boil down to a choice of API.

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Torsten Curdt <tc...@vafer.org>.
> Can you expand on that - I don't understand the horror or the futility.

>
> The futility is because it is wrong for the caller to expect that no
> runtime exception can be thrown.
>

Based on that premise we could also just forget about checked exceptions
altogether.

I think the distinction is that runtime exceptions are more for unexpected
errors.
Parsing an archive I personally don't see in that realm.


> > It certainly is a bit of code smell - but only because there is no quick
> > way to harden the implementations.
>
> Maybe I do not follow what you mean, but there is no way to "harden":
> if the input is garbage, it cannot be processed and _some_ exception
> will result.
>

If there are checks in place, the implementation could throw a checked
exception.
Harden as in adding those checks.

> But I rather have a clear and stable API and fix the the implementation
> > along the way than to leak the problems to the API consumer and just
> > document it.
>
> Signalling that a problem with the input was detected is not leaking
> since the problem was the API caller/consumer in the first place.
> [Fixing library bugs is not what is being discussed AFAICT.]
>

I'd argue that signaling this problem should be a checked exception.
IMO this provides a clearer contract to the user.

I guess it's all about managing expectations.
If I throw garbage at a parser I'd not expect a runtime exception but a
ParseException.
If it does throw a runtime exception instead it is leaking the problem as
invalid input is not something the caller can even check beforehand.

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Torsten Curdt <tc...@vafer.org>.
> Recovery may not be possible for a single archive, but the caller may
> want to process multiple archives.
> (Or potentially try again, if the error was caused by a transient error
> [1])
>
> So ideally Compress will return a CE for parsing errors, and that CE
> should give some context to the error.
> Care must be taken however not to catch every RTE, lest this hide a coding
> bug.
>
> It may take a few iterations to catch and wrap all the possible RTEs,
> but the end result would be a more robust and easier to use component.
>

+1

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by PeterLee <pe...@apache.org>.
> So ideally Compress will return a CE for parsing errors, and that CE
> should give some context to the error.
> Care must be taken however not to catch every RTE, lest this hide a
coding bug.

+1 for this.

Lee

On Tue, Jun 29, 2021 at 7:57 PM sebb <se...@gmail.com> wrote:

> On Tue, 29 Jun 2021 at 12:16, Gary Gregory <ga...@gmail.com> wrote:
> >
> > The best reason to create a new IO exception is if you are going to catch
> > it instead of the original and recover differently. I don't think that's
> > the case here, there is no "recovery" possible on a corrupted archive.
> For
> > my money, the exception can remain unchecked as it is unrealistic that
> > recovery is not possible. We can throw a better unchecked exception with
> a
> > better message at best. A plain IOException is possible and some folks
> > favor that but again, recovery is not possible on broken archives.
>
> Recovery may not be possible for a single archive, but the caller may
> want to process multiple archives.
> (Or potentially try again, if the error was caused by a transient error
> [1])
>
> So ideally Compress will return a CE for parsing errors, and that CE
> should give some context to the error.
> Care must be taken however not to catch every RTE, lest this hide a coding
> bug.
>
> It may take a few iterations to catch and wrap all the possible RTEs,
> but the end result would be a more robust and easier to use component.
>
> Sebb
> [1] It's not unknown for files to be temporarily locked, or for
> network errors to occur, etc.
>
> > Gary
> >
> > On Tue, Jun 29, 2021, 03:54 Miguel Munoz <swingguy1024@yahoo.com
> .invalid>
> > wrote:
> >
> > > Catching all RuntimeExceptions and wrapping them in an IOException
> looks
> > > like the cleanest solution. RuntimeExceptions usually mean bugs, so if
> the
> > > archive code is throwing them due to a corrupted archive, it makes
> sense to
> > > wrap it in a checked exception. I would like to suggest creating a new
> > > class looking something like this:
> > >
> > > public class CorruptedArchiveException extends IOException { }
> > > In fact, since we will only be generating them in response to a
> > > RuntimeException, we may want to make sure all constructors require an
> > > RuntimeException as a parameter, to guarantee that we wrap the original
> > > unchecked exception.
> > > — Miguel Muñoz
> > > ———
> > > 4210 Via Arbolada #226Los Angeles, CA 90042
> > > 323-225-7285
> > > –
> > >
> > > The Sun, with all those planets going around it and dependent on it,
> can
> > > still ripen a vine of grapes like it has nothing else to do in the
> world.
> > >   — Galileo
> > >
> > >     On Monday, June 28, 2021, 06:52:02 AM PDT, Gilles Sadowski <
> > > gilleseran@gmail.com> wrote:
> > >
> > >  Le lun. 28 juin 2021 à 08:52, Stefan Bodewig <bo...@apache.org> a
> > > écrit :
> > > >
> > > > On 2021-06-27, Gilles Sadowski wrote:
> > > >
> > > > > Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org>
> a
> > > écrit :
> > > >
> > > > >> As I said, we can as well document that each method could throw
> > > > >> arbitrary RuntimeExceptions, but I don't believe we can list the
> kinds
> > > > >> of RuntimeExceptions exhaustively
> > > >
> > > > > Why not?
> > > > > Listing all runtime exceptions is considered part of good
> > > > > documentation.
> > > >
> > > > Because we do not know which RuntimeExceptions may be caused by an
> > > > invalid archive.
> > > >
> > > > Most of the RuntimeException happen because our parsers believe in a
> > > > world where every archive is valid. For example we may read a few
> bytes
> > > > that are the size of an array and then create the array without
> checking
> > > > whether the size is negative and a NegativeArraySizeException occurs.
> > > >
> > > > As we haven't considered the case of a negative size in code, we
> also do
> > > > not know this exception could be thrown. If we had considered the
> case
> > > > of negative sizes, the parser could have avoided the exception
> > > > alltogether. This is what I meant with
> > > >
> > > > >> - if we knew which exceptions can be thrown, then we could as well
> > > > >> check the error conditions ourselves beforehand.
> > > >
> > > > Our parsers could of course be hardened, but this is pretty
> difficult to
> > > > do years after they've been written and would certainly miss a few
> > > > corner cases anyway.
> > >
> > > Thank you for the example.  I now understand that the aim is not
> > > to increase the number of precondition checks but only to ensure
> > > that a single exception can escape from calls with "undefined"
> > > behaviour whenever some precondition is not fulfilled.
> > >
> > > IIUC, the situation where the library assumes some precondition
> > > but does not check it, and continues its operations until something
> > > unexpected occurs, would be described by "IllegalStateException".
> > > [A very much "non-recoverable" state, as the library doesn't know
> > > the root cause (which may be a long way from where the symptom
> > > appeared).]
> > >
> > > In principle, it looks interesting information for users to be able to
> > > distinguish between a known failure (i.e. the library knows the root
> > > cause) and an unknown failure (?).
> > >
> > > > And then there is a certain category of exceptions thrown by Java
> > > > classlib methods we invoke. We've just added a catch block around
> > > > java.util.zip.ZipEntry#setExtra because certain invalid archives
> cause a
> > > > RuntimeException there - and if I remember correctly a
> RuntimeException
> > > > the method's javadoc doesn't list.
> > >
> > > It shows that similar JDK functionality can indeed throw a runtime
> > > exception when it gets unexpected garbage.  What is their rationale
> > > for choosing this or "IOException" (I have no idea)?
> > >
> > > Regards,
> > > Gilles
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> > >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by sebb <se...@gmail.com>.
On Tue, 29 Jun 2021 at 12:16, Gary Gregory <ga...@gmail.com> wrote:
>
> The best reason to create a new IO exception is if you are going to catch
> it instead of the original and recover differently. I don't think that's
> the case here, there is no "recovery" possible on a corrupted archive. For
> my money, the exception can remain unchecked as it is unrealistic that
> recovery is not possible. We can throw a better unchecked exception with a
> better message at best. A plain IOException is possible and some folks
> favor that but again, recovery is not possible on broken archives.

Recovery may not be possible for a single archive, but the caller may
want to process multiple archives.
(Or potentially try again, if the error was caused by a transient error [1])

So ideally Compress will return a CE for parsing errors, and that CE
should give some context to the error.
Care must be taken however not to catch every RTE, lest this hide a coding bug.

It may take a few iterations to catch and wrap all the possible RTEs,
but the end result would be a more robust and easier to use component.

Sebb
[1] It's not unknown for files to be temporarily locked, or for
network errors to occur, etc.

> Gary
>
> On Tue, Jun 29, 2021, 03:54 Miguel Munoz <sw...@yahoo.com.invalid>
> wrote:
>
> > Catching all RuntimeExceptions and wrapping them in an IOException looks
> > like the cleanest solution. RuntimeExceptions usually mean bugs, so if the
> > archive code is throwing them due to a corrupted archive, it makes sense to
> > wrap it in a checked exception. I would like to suggest creating a new
> > class looking something like this:
> >
> > public class CorruptedArchiveException extends IOException { }
> > In fact, since we will only be generating them in response to a
> > RuntimeException, we may want to make sure all constructors require an
> > RuntimeException as a parameter, to guarantee that we wrap the original
> > unchecked exception.
> > — Miguel Muñoz
> > ———
> > 4210 Via Arbolada #226Los Angeles, CA 90042
> > 323-225-7285
> > –
> >
> > The Sun, with all those planets going around it and dependent on it, can
> > still ripen a vine of grapes like it has nothing else to do in the world.
> >   — Galileo
> >
> >     On Monday, June 28, 2021, 06:52:02 AM PDT, Gilles Sadowski <
> > gilleseran@gmail.com> wrote:
> >
> >  Le lun. 28 juin 2021 à 08:52, Stefan Bodewig <bo...@apache.org> a
> > écrit :
> > >
> > > On 2021-06-27, Gilles Sadowski wrote:
> > >
> > > > Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a
> > écrit :
> > >
> > > >> As I said, we can as well document that each method could throw
> > > >> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
> > > >> of RuntimeExceptions exhaustively
> > >
> > > > Why not?
> > > > Listing all runtime exceptions is considered part of good
> > > > documentation.
> > >
> > > Because we do not know which RuntimeExceptions may be caused by an
> > > invalid archive.
> > >
> > > Most of the RuntimeException happen because our parsers believe in a
> > > world where every archive is valid. For example we may read a few bytes
> > > that are the size of an array and then create the array without checking
> > > whether the size is negative and a NegativeArraySizeException occurs.
> > >
> > > As we haven't considered the case of a negative size in code, we also do
> > > not know this exception could be thrown. If we had considered the case
> > > of negative sizes, the parser could have avoided the exception
> > > alltogether. This is what I meant with
> > >
> > > >> - if we knew which exceptions can be thrown, then we could as well
> > > >> check the error conditions ourselves beforehand.
> > >
> > > Our parsers could of course be hardened, but this is pretty difficult to
> > > do years after they've been written and would certainly miss a few
> > > corner cases anyway.
> >
> > Thank you for the example.  I now understand that the aim is not
> > to increase the number of precondition checks but only to ensure
> > that a single exception can escape from calls with "undefined"
> > behaviour whenever some precondition is not fulfilled.
> >
> > IIUC, the situation where the library assumes some precondition
> > but does not check it, and continues its operations until something
> > unexpected occurs, would be described by "IllegalStateException".
> > [A very much "non-recoverable" state, as the library doesn't know
> > the root cause (which may be a long way from where the symptom
> > appeared).]
> >
> > In principle, it looks interesting information for users to be able to
> > distinguish between a known failure (i.e. the library knows the root
> > cause) and an unknown failure (?).
> >
> > > And then there is a certain category of exceptions thrown by Java
> > > classlib methods we invoke. We've just added a catch block around
> > > java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
> > > RuntimeException there - and if I remember correctly a RuntimeException
> > > the method's javadoc doesn't list.
> >
> > It shows that similar JDK functionality can indeed throw a runtime
> > exception when it gets unexpected garbage.  What is their rationale
> > for choosing this or "IOException" (I have no idea)?
> >
> > Regards,
> > Gilles
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gary Gregory <ga...@gmail.com>.
The best reason to create a new IO exception is if you are going to catch
it instead of the original and recover differently. I don't think that's
the case here, there is no "recovery" possible on a corrupted archive. For
my money, the exception can remain unchecked as it is unrealistic that
recovery is not possible. We can throw a better unchecked exception with a
better message at best. A plain IOException is possible and some folks
favor that but again, recovery is not possible on broken archives.

Gary

On Tue, Jun 29, 2021, 03:54 Miguel Munoz <sw...@yahoo.com.invalid>
wrote:

> Catching all RuntimeExceptions and wrapping them in an IOException looks
> like the cleanest solution. RuntimeExceptions usually mean bugs, so if the
> archive code is throwing them due to a corrupted archive, it makes sense to
> wrap it in a checked exception. I would like to suggest creating a new
> class looking something like this:
>
> public class CorruptedArchiveException extends IOException { }
> In fact, since we will only be generating them in response to a
> RuntimeException, we may want to make sure all constructors require an
> RuntimeException as a parameter, to guarantee that we wrap the original
> unchecked exception.
> — Miguel Muñoz
> ———
> 4210 Via Arbolada #226Los Angeles, CA 90042
> 323-225-7285
> –
>
> The Sun, with all those planets going around it and dependent on it, can
> still ripen a vine of grapes like it has nothing else to do in the world.
>   — Galileo
>
>     On Monday, June 28, 2021, 06:52:02 AM PDT, Gilles Sadowski <
> gilleseran@gmail.com> wrote:
>
>  Le lun. 28 juin 2021 à 08:52, Stefan Bodewig <bo...@apache.org> a
> écrit :
> >
> > On 2021-06-27, Gilles Sadowski wrote:
> >
> > > Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a
> écrit :
> >
> > >> As I said, we can as well document that each method could throw
> > >> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
> > >> of RuntimeExceptions exhaustively
> >
> > > Why not?
> > > Listing all runtime exceptions is considered part of good
> > > documentation.
> >
> > Because we do not know which RuntimeExceptions may be caused by an
> > invalid archive.
> >
> > Most of the RuntimeException happen because our parsers believe in a
> > world where every archive is valid. For example we may read a few bytes
> > that are the size of an array and then create the array without checking
> > whether the size is negative and a NegativeArraySizeException occurs.
> >
> > As we haven't considered the case of a negative size in code, we also do
> > not know this exception could be thrown. If we had considered the case
> > of negative sizes, the parser could have avoided the exception
> > alltogether. This is what I meant with
> >
> > >> - if we knew which exceptions can be thrown, then we could as well
> > >> check the error conditions ourselves beforehand.
> >
> > Our parsers could of course be hardened, but this is pretty difficult to
> > do years after they've been written and would certainly miss a few
> > corner cases anyway.
>
> Thank you for the example.  I now understand that the aim is not
> to increase the number of precondition checks but only to ensure
> that a single exception can escape from calls with "undefined"
> behaviour whenever some precondition is not fulfilled.
>
> IIUC, the situation where the library assumes some precondition
> but does not check it, and continues its operations until something
> unexpected occurs, would be described by "IllegalStateException".
> [A very much "non-recoverable" state, as the library doesn't know
> the root cause (which may be a long way from where the symptom
> appeared).]
>
> In principle, it looks interesting information for users to be able to
> distinguish between a known failure (i.e. the library knows the root
> cause) and an unknown failure (?).
>
> > And then there is a certain category of exceptions thrown by Java
> > classlib methods we invoke. We've just added a catch block around
> > java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
> > RuntimeException there - and if I remember correctly a RuntimeException
> > the method's javadoc doesn't list.
>
> It shows that similar JDK functionality can indeed throw a runtime
> exception when it gets unexpected garbage.  What is their rationale
> for choosing this or "IOException" (I have no idea)?
>
> Regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Phil Steitz <ph...@gmail.com>.

On 6/29/21 8:08 AM, Stefan Bodewig wrote:
> On 2021-06-29, Miguel Munoz wrote:
>
>> Catching all RuntimeExceptions and wrapping them in an IOException
>> looks like the cleanest solution. RuntimeExceptions usually mean bugs,
>> so if the archive code is throwing them due to a corrupted archive, it
>> makes sense to wrap it in a checked exception. I would like to suggest
>> creating a new class looking something like this:
>> public class CorruptedArchiveException extends IOException { }
> https://github.com/apache/commons-compress/compare/catch-RuntimeExceptions
+1
As a user, this is what I would like to see.   I especially like the 
class javadoc :)

Phil
>
> :-)
>
> Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-06-29, Miguel Munoz wrote:

> Catching all RuntimeExceptions and wrapping them in an IOException
> looks like the cleanest solution. RuntimeExceptions usually mean bugs,
> so if the archive code is throwing them due to a corrupted archive, it
> makes sense to wrap it in a checked exception. I would like to suggest
> creating a new class looking something like this:

> public class CorruptedArchiveException extends IOException { }

https://github.com/apache/commons-compress/compare/catch-RuntimeExceptions

:-)

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Miguel Munoz <sw...@yahoo.com.INVALID>.
Catching all RuntimeExceptions and wrapping them in an IOException looks like the cleanest solution. RuntimeExceptions usually mean bugs, so if the archive code is throwing them due to a corrupted archive, it makes sense to wrap it in a checked exception. I would like to suggest creating a new class looking something like this: 

public class CorruptedArchiveException extends IOException { }
In fact, since we will only be generating them in response to a RuntimeException, we may want to make sure all constructors require an RuntimeException as a parameter, to guarantee that we wrap the original unchecked exception.
— Miguel Muñoz
———
4210 Via Arbolada #226Los Angeles, CA 90042
323-225-7285
–

The Sun, with all those planets going around it and dependent on it, can still ripen a vine of grapes like it has nothing else to do in the world.
  — Galileo 

    On Monday, June 28, 2021, 06:52:02 AM PDT, Gilles Sadowski <gi...@gmail.com> wrote:  
 
 Le lun. 28 juin 2021 à 08:52, Stefan Bodewig <bo...@apache.org> a écrit :
>
> On 2021-06-27, Gilles Sadowski wrote:
>
> > Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a écrit :
>
> >> As I said, we can as well document that each method could throw
> >> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
> >> of RuntimeExceptions exhaustively
>
> > Why not?
> > Listing all runtime exceptions is considered part of good
> > documentation.
>
> Because we do not know which RuntimeExceptions may be caused by an
> invalid archive.
>
> Most of the RuntimeException happen because our parsers believe in a
> world where every archive is valid. For example we may read a few bytes
> that are the size of an array and then create the array without checking
> whether the size is negative and a NegativeArraySizeException occurs.
>
> As we haven't considered the case of a negative size in code, we also do
> not know this exception could be thrown. If we had considered the case
> of negative sizes, the parser could have avoided the exception
> alltogether. This is what I meant with
>
> >> - if we knew which exceptions can be thrown, then we could as well
> >> check the error conditions ourselves beforehand.
>
> Our parsers could of course be hardened, but this is pretty difficult to
> do years after they've been written and would certainly miss a few
> corner cases anyway.

Thank you for the example.  I now understand that the aim is not
to increase the number of precondition checks but only to ensure
that a single exception can escape from calls with "undefined"
behaviour whenever some precondition is not fulfilled.

IIUC, the situation where the library assumes some precondition
but does not check it, and continues its operations until something
unexpected occurs, would be described by "IllegalStateException".
[A very much "non-recoverable" state, as the library doesn't know
the root cause (which may be a long way from where the symptom
appeared).]

In principle, it looks interesting information for users to be able to
distinguish between a known failure (i.e. the library knows the root
cause) and an unknown failure (?).

> And then there is a certain category of exceptions thrown by Java
> classlib methods we invoke. We've just added a catch block around
> java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
> RuntimeException there - and if I remember correctly a RuntimeException
> the method's javadoc doesn't list.

It shows that similar JDK functionality can indeed throw a runtime
exception when it gets unexpected garbage.  What is their rationale
for choosing this or "IOException" (I have no idea)?

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

  

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gilles Sadowski <gi...@gmail.com>.
Le lun. 28 juin 2021 à 08:52, Stefan Bodewig <bo...@apache.org> a écrit :
>
> On 2021-06-27, Gilles Sadowski wrote:
>
> > Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a écrit :
>
> >> As I said, we can as well document that each method could throw
> >> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
> >> of RuntimeExceptions exhaustively
>
> > Why not?
> > Listing all runtime exceptions is considered part of good
> > documentation.
>
> Because we do not know which RuntimeExceptions may be caused by an
> invalid archive.
>
> Most of the RuntimeException happen because our parsers believe in a
> world where every archive is valid. For example we may read a few bytes
> that are the size of an array and then create the array without checking
> whether the size is negative and a NegativeArraySizeException occurs.
>
> As we haven't considered the case of a negative size in code, we also do
> not know this exception could be thrown. If we had considered the case
> of negative sizes, the parser could have avoided the exception
> alltogether. This is what I meant with
>
> >> - if we knew which exceptions can be thrown, then we could as well
> >> check the error conditions ourselves beforehand.
>
> Our parsers could of course be hardened, but this is pretty difficult to
> do years after they've been written and would certainly miss a few
> corner cases anyway.

Thank you for the example.  I now understand that the aim is not
to increase the number of precondition checks but only to ensure
that a single exception can escape from calls with "undefined"
behaviour whenever some precondition is not fulfilled.

IIUC, the situation where the library assumes some precondition
but does not check it, and continues its operations until something
unexpected occurs, would be described by "IllegalStateException".
[A very much "non-recoverable" state, as the library doesn't know
the root cause (which may be a long way from where the symptom
appeared).]

In principle, it looks interesting information for users to be able to
distinguish between a known failure (i.e. the library knows the root
cause) and an unknown failure (?).

> And then there is a certain category of exceptions thrown by Java
> classlib methods we invoke. We've just added a catch block around
> java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
> RuntimeException there - and if I remember correctly a RuntimeException
> the method's javadoc doesn't list.

It shows that similar JDK functionality can indeed throw a runtime
exception when it gets unexpected garbage.  What is their rationale
for choosing this or "IOException" (I have no idea)?

Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-06-27, Gilles Sadowski wrote:

> Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a écrit :

>> As I said, we can as well document that each method could throw
>> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
>> of RuntimeExceptions exhaustively

> Why not?
> Listing all runtime exceptions is considered part of good
> documentation.

Because we do not know which RuntimeExceptions may be caused by an
invalid archive.

Most of the RuntimeException happen because our parsers believe in a
world where every archive is valid. For example we may read a few bytes
that are the size of an array and then create the array without checking
whether the size is negative and a NegativeArraySizeException occurs.

As we haven't considered the case of a negative size in code, we also do
not know this exception could be thrown. If we had considered the case
of negative sizes, the parser could have avoided the exception
alltogether. This is what I meant with

>> - if we knew which exceptions can be thrown, then we could as well
>> check the error conditions ourselves beforehand.

Our parsers could of course be hardened, but this is pretty difficult to
do years after they've been written and would certainly miss a few
corner cases anyway.

And then there is a certain category of exceptions thrown by Java
classlib methods we invoke. We've just added a catch block around
java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
RuntimeException there - and if I remember correctly a RuntimeException
the method's javadoc doesn't list.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gilles Sadowski <gi...@gmail.com>.
Le dim. 27 juin 2021 à 21:15, Stefan Bodewig <bo...@apache.org> a écrit :
>
> On 2021-06-27, Gilles Sadowski wrote:
>
> > Hi.
>
> >> [...]
>
> >> it seemed Gilles was opposed to this idea
>
> > Rather (IIRC) my last comment was that it was your choice as to
> > what the API should look like.
>
> Sorry, I didn't mean to misrepresent your POV.

You didn't misrepresent my opinion; just that it doesn't count if
the requirement is to not depart from what was done in the past
in [Compress].

>
> > My opinion on the matter was along Gary's lines (which is J. Bloch's
> > rationale provided in "Effective Java").
> > Indeed I personally would indeed *not* pick option 1 because it puts
> > the onus on the Commons library whereas input that does not comply
> > with preconditions (i.e. a supported format) should unsurprisingly
> > throw an IAE.
>
> In which case we need to catch all the other RuntimeExceptions and turn
> them into IAEs, right? :-)

Well, yes, if the code has detected illegal input!
But, again, if prefer to throw an "IOException"...
There is no arguing about taste and/or backward compatibility/

> Some if we want to throws any other specific RuntimeException following
> Matt's suggestion.
>
> We are already throwing checked IOExceptions for invalid archives in
> many many cases today. Our users expect us to do so for all invalid
> archives - well some of them.

Then fine (so to speak). :-}

> As I said, we can as well document that each method could throw
> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
> of RuntimeExceptions exhaustively

Why not?
Listing all runtime exceptions is considered part of good
documentation.

> - if we knew which exceptions can be
> thrown, then we could as well check the error conditions ourselves
> beforehand.

I don't follow...

Regards,
Gilles

>
> Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-06-27, Gilles Sadowski wrote:

> Hi.

>> [...]

>> it seemed Gilles was opposed to this idea

> Rather (IIRC) my last comment was that it was your choice as to
> what the API should look like.

Sorry, I didn't mean to misrepresent your POV.

> My opinion on the matter was along Gary's lines (which is J. Bloch's
> rationale provided in "Effective Java").
> Indeed I personally would indeed *not* pick option 1 because it puts
> the onus on the Commons library whereas input that does not comply
> with preconditions (i.e. a supported format) should unsurprisingly
> throw an IAE.

In which case we need to catch all the other RuntimeExceptions and turn
them into IAEs, right? :-)

Some if we want to throws any other specific RuntimeException following
Matt's suggestion.

We are already throwing checked IOExceptions for invalid archives in
many many cases today. Our users expect us to do so for all invalid
archives - well some of them.

As I said, we can as well document that each method could throw
arbitrary RuntimeExceptions, but I don't believe we can list the kinds
of RuntimeExceptions exhaustively - if we knew which exceptions can be
thrown, then we could as well check the error conditions ourselves
beforehand.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gilles Sadowski <gi...@gmail.com>.
Hi.

> [...]

> it seemed Gilles was opposed to this idea

Rather (IIRC) my last comment was that it was your choice as to
what the API should look like.
My opinion on the matter was along Gary's lines (which is J. Bloch's
rationale provided in "Effective Java").
Indeed I personally would indeed *not* pick option 1 because it puts
the onus on the Commons library whereas input that does not comply
with preconditions (i.e. a supported format) should unsurprisingly
throw an IAE.

Of course, this is also a matter of taste e.g. if one wishes to behave
similarly to the JDK for things like "FileNotFoundException"...

> [...]

Le dim. 27 juin 2021 à 15:06, Torsten Curdt <tc...@vafer.org> a écrit :
>
> > feels like both a horror show and an exercise in futility
>
>
> Can you expand on that - I don't understand the horror or the futility.

The futility is because it is wrong for the caller to expect that no
runtime exception can be thrown.

> It certainly is a bit of code smell - but only because there is no quick
> way to harden the implementations.

Maybe I do not follow what you mean, but there is no way to "harden":
if the input is garbage, it cannot be processed and _some_ exception
will result.

> But I rather have a clear and stable API and fix the the implementation
> along the way than to leak the problems to the API consumer and just
> document it.

Signalling that a problem with the input was detected is not leaking
since the problem was the API caller/consumer in the first place.
[Fixing library bugs is not what is being discussed AFAICT.]

Regards,
Gilles

>
> For me it's not about "appease some tool" it's just that the tool has
> revealed a problem and we have to decide how to deal with this in the
> future.
>
> cheers,
> Torsten

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by sebb <se...@gmail.com>.
On Sun, 27 Jun 2021 at 15:01, sebb <se...@gmail.com> wrote:
>
> On Sun, 27 Jun 2021 at 14:06, Torsten Curdt <tc...@vafer.org> wrote:
> >
> > > feels like both a horror show and an exercise in futility
> >
> >
> > Can you expand on that - I don't understand the horror or the futility.
> >
> > It certainly is a bit of code smell - but only because there is no quick
> > way to harden the implementations.
> >
> > But I rather have a clear and stable API and fix the the implementation
> > along the way than to leak the problems to the API consumer and just
> > document it.
> >
> > For me it's not about "appease some tool" it's just that the tool has
> > revealed a problem and we have to decide how to deal with this in the
> > future.
>
> Agreed.
> At least some of the exceptions could presumably be avoided by
> increased validation during processing.

Sorry, pushed send by mistake.

I meant to add:

Wrapping the UE in an IOError or similar allows the code to provide
more context to the user as to what was being done when the error
occurred, and thus help debug the issue.

> > cheers,
> > Torsten

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by sebb <se...@gmail.com>.
On Sun, 27 Jun 2021 at 14:06, Torsten Curdt <tc...@vafer.org> wrote:
>
> > feels like both a horror show and an exercise in futility
>
>
> Can you expand on that - I don't understand the horror or the futility.
>
> It certainly is a bit of code smell - but only because there is no quick
> way to harden the implementations.
>
> But I rather have a clear and stable API and fix the the implementation
> along the way than to leak the problems to the API consumer and just
> document it.
>
> For me it's not about "appease some tool" it's just that the tool has
> revealed a problem and we have to decide how to deal with this in the
> future.

Agreed.
At least some of the exceptions could presumably be avoided by
increased validation during processing.
> cheers,
> Torsten

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Torsten Curdt <tc...@vafer.org>.
> feels like both a horror show and an exercise in futility


Can you expand on that - I don't understand the horror or the futility.

It certainly is a bit of code smell - but only because there is no quick
way to harden the implementations.

But I rather have a clear and stable API and fix the the implementation
along the way than to leak the problems to the API consumer and just
document it.

For me it's not about "appease some tool" it's just that the tool has
revealed a problem and we have to decide how to deal with this in the
future.

cheers,
Torsten

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Stefan Bodewig <bo...@apache.org>.
On 2021-06-27, Gary Gregory wrote:

> Catching all unchecked exceptions (UE) and rethrowing as checked exceptions
> (CE) feels like both a horror show and an exercise in futility, especially
> in order to appease some tool that complains today of one thing which may
> complain differently tomorrow, I really don't like that idea on paper.

Independent of the nonsense JFrog does our users have repeatedly tols us
they expect Compress to throw an IOException rather than a
RuntimeException for broken archives.

https://issues.apache.org/jira/browse/COMPRESS-169 fixed in Compress 1.4
ten years ago probably is the first one and it is easy to find many more
of this type (COMPRESS-424, COMPRESS-490, COMPRESS-131, COMPRESS-219
... here I stopped grepping through our changelog).


> Let's keep in mind that a CE means one can catch it and try to do something
> about it which is definitely not the case for a corrupted archive. I mean,
> you can download it again, sure, and end up where you started.

This is not quite true. You can not recover from an EOFException
either. There are lots of cases where we already throw IOExceptions for
irrecoverably corrupt archives today.

> IOW, let's do what we think is best, not what some tool some company wants
> to sell ("our tool is hard core and found 5 gazillion bugs in the open
> source echo system")

+1 can't argue with selecting the option we consider best for our users.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [compress] Dealing with uncaught RuntimeExceptions (again)

Posted by Gary Gregory <ga...@gmail.com>.
Catching all unchecked exceptions (UE) and rethrowing as checked exceptions
(CE) feels like both a horror show and an exercise in futility, especially
in order to appease some tool that complains today of one thing which may
complain differently tomorrow, I really don't like that idea on paper.

Let's keep in mind that a CE means one can catch it and try to do something
about it which is definitely not the case for a corrupted archive. I mean,
you can download it again, sure, and end up where you started.

The big picture for me is avoiding infinite loops and decompression bombs.

An defensive app will catch Exception anyway to catch IOException, NPE,
IAE, and ISE. We can document these where we know they can take place. We
can throw IOE, IAE or ISE to better document problems like bad array
indices.

IOW, let's do what we think is best, not what some tool some company wants
to sell ("our tool is hard core and found 5 gazillion bugs in the open
source echo system")

Gary

On Sun, Jun 27, 2021, 07:03 Stefan Bodewig <bo...@apache.org> wrote:

> Hi
>
> I'd like to get closure on which approach we want to take.
>
> When we read a broken archive it may trigger arbitrary RuntimeExceptions
> because we are not explicitly checking for each and every sizuation
> where a bounds check could fail, a negative size is sent to a classlib
> method that then throws an IllegalArgumentException or whatnot (even a
> NullPointerException may escape us every now and then).
>
> Uncaught RuntimeExceptions are considered security issued by some tools
> because of a potential DoS attack. Historically we have never agreed
> with this point of view and I'm not suggesting to change that.
>
> Even though we may not know what is wrong, when the RuntimeException
> occurs, we do know the archive is broken and this is the reason for the
> exception.
>
> AFAICS there are two ways we can deal with it:
>
> (1) every method that reads from the archive declares it can throw
>     arbitrary RuntimeExceptions as well. And we document that broken
>     archives may cause RuntimeExceptions and that we never consider such
>     a case a security issue.
>
> (2) we catch RuntimeExceptions at every method that reads from the
>     archive and wrap them in a custom IOException, making sure such a
>     case can never escape us.
>
> Personally I prefer (2) but can live with (1) - I've suggested something
> along the lines of (2) in [1] and it seemed Gilles was opposed to this
> idea (and Matt was torn).
>
> In [2] Bernd seemed to support (2).
>
> Are there any other opinions?
>
> Stefan
>
> [1]
> https://lists.apache.org/thread.html/r5d2427566dff4c7d293e8d48f9ac62b7958d19047f730836ce5b3c60%40%3Cdev.commons.apache.org%3E
>
> [2]
> https://lists.apache.org/thread.html/r3ce77eb9ab9429097ca57c48cb99b8be497ee5b69d419b52a6722616%40%3Cdev.commons.apache.org%3E
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>