You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@arrow.apache.org by Micah Kornfield <em...@gmail.com> on 2016/05/31 05:33:09 UTC

[C++] How careful do we want to be about exceptions?

We are following the google style guide which prohibits exceptions [1].
Several places in our code we use std::vector and std::make_shared
which can potentially throw std::bad_alloc exceptions that we don't
currently wrap in try/catch blocks.  These classes/objects are used
for constructing/manipulating metadata (all errors for allocations of
data are checked and returned via Status objects).

In general, it seems unlikely that these code paths would get
triggered, but there is still a possibility.

Does anybody have any thoughts on how thorough we want to be about
avoiding these types of edge cases?

My thought is to leave the code as is and accept the small chance that
these get hit.  The main rationale is that if these types of calls do
raise std::bad_alloc, it is highly likely that the program using the
Arrow library is likely in an unstable state already (most of the
allocations should be less then 100 bytes or so).  A second point is
that a cursory search of Kudu [2] and Impala [3] code bases  (both of
which also intend to follow the style guide) turn up some similar
calls that could throw std::bad_alloc (it is possible there is some
other magic happening here to avoid these).

This isn't completely satisfying though, so I'm happy to be convinced
otherwise.

Thanks,
-Micah


[1] https://google.github.io/styleguide/cppguide.html#Exceptions
[2] https://github.com/cloudera/kudu/blob/7f3691a826b9d27199319409f8d721ec1d08a8ba/src/kudu/consensus/log_reader.cc#L74
[3] https://github.com/cloudera/Impala/blob/a36dcfc0322e213c06d6cf8d3f330c4b06739523/be/src/common/object-pool.h

Re: [C++] How careful do we want to be about exceptions?

Posted by Leif Walsh <le...@gmail.com>.
I agree, as a C++ library it is acceptable to let these exceptions bubble
up to the client, and for the C bindings, all exceptions should be caught
and translated to appropriate error codes.  Most other languages that
interface with it will probably use the C wrapper and will gain visibility
into these exceptional conditions and handle it how they see fit.

On Tue, Jun 7, 2016 at 8:04 AM Todd Lipcon <to...@cloudera.com> wrote:

> Hi Micah,
>
> You're right - in Kudu at least we don't make attempts to recover from
> out-of-memory errors like std::bad_alloc. So, if 'new' or any of the STL
> libraries throw such an error, we'll just let it propagate up the stack and
> crash the server. In my experience, trying to recover from such errors is
> fraught with difficulty -- often you'd want to allocate in the recovery
> path, and short of implementing "emergency allocators" and such, things
> just get too hard to reason about and test.
>
> In the context of an embeddable *C* library I do think there's some value
> in eventually catching std::bad_alloc and returning error codes like
> ENOMEM. However, I'd personally vote that to be given very low priority in
> the general queue of features to be implemented, since I imagine that most
> of the big data applications that would embed Arrow are likely to follow
> the same "crash on alloc failure" anyway.
>
> Another thing to note is that in any cases where Arrow does allocate large
> amounts of memory (eg when reading an on-disk file or user data) I do think
> some form of memory limiting is advisable. For example, even though Kudu
> crashes on OOM, we closely track our memory usage and start to reject user
> write requests after crossing certain thresholds. Providing enough hooks
> for embedders of Arrow to do similar is a good idea.
>
> -Todd
>
> On Tue, May 31, 2016 at 7:33 AM, Micah Kornfield <em...@gmail.com>
> wrote:
>
> > We are following the google style guide which prohibits exceptions [1].
> > Several places in our code we use std::vector and std::make_shared
> > which can potentially throw std::bad_alloc exceptions that we don't
> > currently wrap in try/catch blocks.  These classes/objects are used
> > for constructing/manipulating metadata (all errors for allocations of
> > data are checked and returned via Status objects).
> >
> > In general, it seems unlikely that these code paths would get
> > triggered, but there is still a possibility.
> >
> > Does anybody have any thoughts on how thorough we want to be about
> > avoiding these types of edge cases?
> >
> > My thought is to leave the code as is and accept the small chance that
> > these get hit.  The main rationale is that if these types of calls do
> > raise std::bad_alloc, it is highly likely that the program using the
> > Arrow library is likely in an unstable state already (most of the
> > allocations should be less then 100 bytes or so).  A second point is
> > that a cursory search of Kudu [2] and Impala [3] code bases  (both of
> > which also intend to follow the style guide) turn up some similar
> > calls that could throw std::bad_alloc (it is possible there is some
> > other magic happening here to avoid these).
> >
> > This isn't completely satisfying though, so I'm happy to be convinced
> > otherwise.
> >
> > Thanks,
> > -Micah
> >
> >
> > [1] https://google.github.io/styleguide/cppguide.html#Exceptions
> > [2]
> >
> https://github.com/cloudera/kudu/blob/7f3691a826b9d27199319409f8d721ec1d08a8ba/src/kudu/consensus/log_reader.cc#L74
> > [3]
> >
> https://github.com/cloudera/Impala/blob/a36dcfc0322e213c06d6cf8d3f330c4b06739523/be/src/common/object-pool.h
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
-- 
-- 
Cheers,
Leif

Re: [C++] How careful do we want to be about exceptions?

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Micah,

You're right - in Kudu at least we don't make attempts to recover from
out-of-memory errors like std::bad_alloc. So, if 'new' or any of the STL
libraries throw such an error, we'll just let it propagate up the stack and
crash the server. In my experience, trying to recover from such errors is
fraught with difficulty -- often you'd want to allocate in the recovery
path, and short of implementing "emergency allocators" and such, things
just get too hard to reason about and test.

In the context of an embeddable *C* library I do think there's some value
in eventually catching std::bad_alloc and returning error codes like
ENOMEM. However, I'd personally vote that to be given very low priority in
the general queue of features to be implemented, since I imagine that most
of the big data applications that would embed Arrow are likely to follow
the same "crash on alloc failure" anyway.

Another thing to note is that in any cases where Arrow does allocate large
amounts of memory (eg when reading an on-disk file or user data) I do think
some form of memory limiting is advisable. For example, even though Kudu
crashes on OOM, we closely track our memory usage and start to reject user
write requests after crossing certain thresholds. Providing enough hooks
for embedders of Arrow to do similar is a good idea.

-Todd

On Tue, May 31, 2016 at 7:33 AM, Micah Kornfield <em...@gmail.com>
wrote:

> We are following the google style guide which prohibits exceptions [1].
> Several places in our code we use std::vector and std::make_shared
> which can potentially throw std::bad_alloc exceptions that we don't
> currently wrap in try/catch blocks.  These classes/objects are used
> for constructing/manipulating metadata (all errors for allocations of
> data are checked and returned via Status objects).
>
> In general, it seems unlikely that these code paths would get
> triggered, but there is still a possibility.
>
> Does anybody have any thoughts on how thorough we want to be about
> avoiding these types of edge cases?
>
> My thought is to leave the code as is and accept the small chance that
> these get hit.  The main rationale is that if these types of calls do
> raise std::bad_alloc, it is highly likely that the program using the
> Arrow library is likely in an unstable state already (most of the
> allocations should be less then 100 bytes or so).  A second point is
> that a cursory search of Kudu [2] and Impala [3] code bases  (both of
> which also intend to follow the style guide) turn up some similar
> calls that could throw std::bad_alloc (it is possible there is some
> other magic happening here to avoid these).
>
> This isn't completely satisfying though, so I'm happy to be convinced
> otherwise.
>
> Thanks,
> -Micah
>
>
> [1] https://google.github.io/styleguide/cppguide.html#Exceptions
> [2]
> https://github.com/cloudera/kudu/blob/7f3691a826b9d27199319409f8d721ec1d08a8ba/src/kudu/consensus/log_reader.cc#L74
> [3]
> https://github.com/cloudera/Impala/blob/a36dcfc0322e213c06d6cf8d3f330c4b06739523/be/src/common/object-pool.h
>



-- 
Todd Lipcon
Software Engineer, Cloudera