You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Abdel Hakim Deneche <ad...@maprtech.com> on 2015/06/30 21:06:31 UTC

[DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

our current implementations of BufferAllocator.buffer(int, int) returns
null when it cannot allocate the buffer.

But looking through the code, there are many places that don't check if the
allocated buffer is null before trying to access it which will throw a
NullPointerException.

ValueVectors' allocateNewSafe() seem to be the only place that handle the
null in a specific manner.

Should we update the allocators' implementation to throw an
OutOfMemoryRuntimeException instead of returning null ? this has the added
benefit of displaying a proper out of memory error message to the user.

Thanks!

-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Chris Westin <ch...@gmail.com>.
I would resist using a checked exception. We would end up having to add it
to almost everything everywhere. Or constantly wrapping it with
RuntimeException to get around that; in which case why make it checked to
begin with?

On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:

> I would propose throwing a checked exception encouraging explicit and
> consistent handling of this event. Each sub-system has liberty to decide if
> an OOM failure is fatal or non-fatal depending on its capabilities. Also if
> at some point a sub-system needs to communicate with its callers via a
> different mechanism such like using flags (boolean, enum etc) or raising an
> unchecked exception that's still fine, just handle the exception. If there
> is a need to suppress the checked exception that's fine too, just use a
> helper method.
>
> Either way, returning *null* sounds problematic in many ways i) it is
> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> semantically unclean to make null mean something - even worse something
> context specific.
>
>
> -Hanifi
>
> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
>
> > I guess that would fix the issue too. But we may still run into
> situations
> > where the caller will pass a flag to "mute" the exception and not handle
> > the case anyway.
> >
> > If .buffer() unconditionally throws an exception, can't the caller, who
> > wants to, just catch that and handle it properly ?
> >
> > On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <ch...@gmail.com>
> > wrote:
> >
> > > No, but we should do something close to that.
> > >
> > > There are cases where the caller can handle the inability to get more
> > > memory, and may be able to go to disk. However, you are correct that
> > there
> > > are many that can't handle an OOM, and that fail to check.
> > >
> > > Instead of unconditionally throwing OutOfMemoryRuntimeException, I
> would
> > > suggest that the buffer() call take a flag that indicates whether or
> not
> > it
> > > should throw if it is unable to fulfill the request. This way, the call
> > > sites that can handle an OOM can pass in the flag to return null, and
> the
> > > rest can pass in the flag value to throw, and not have to have any
> > checking
> > > code.
> > >
> > >
> > > On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > adeneche@maprtech.com
> > > > wrote:
> > >
> > > > our current implementations of BufferAllocator.buffer(int, int)
> returns
> > > > null when it cannot allocate the buffer.
> > > >
> > > > But looking through the code, there are many places that don't check
> if
> > > the
> > > > allocated buffer is null before trying to access it which will throw
> a
> > > > NullPointerException.
> > > >
> > > > ValueVectors' allocateNewSafe() seem to be the only place that handle
> > the
> > > > null in a specific manner.
> > > >
> > > > Should we update the allocators' implementation to throw an
> > > > OutOfMemoryRuntimeException instead of returning null ? this has the
> > > added
> > > > benefit of displaying a proper out of memory error message to the
> user.
> > > >
> > > > Thanks!
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Jacques Nadeau <ja...@apache.org>.
Do you think we should be reaching that deep in the other places?
On Jun 30, 2015 3:55 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
wrote:

> buffer(int) is mostly called from value vectors, but there are an
> additional 10+ places where it's being used and most of them don't bother
> checking if the returned buffer is null.
>
> On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org>
> wrote:
>
> > What is your sense of the use of this interface? I would hope new uses
> > would be scarce. Do you think this will be a frequent occurrence that
> most
> > devs will experience?
> > On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
> > wrote:
> >
> > > I started this discussion while fixing DRILL-3312
> > > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite
> > some
> > > time to debug and find the root of an error I was seeing. I first
> thought
> > > about fixing those few places that call buffer() without checking if
> the
> > > buffer is null or not, but this won't prevent new code from repeating
> the
> > > same mistakes.
> > >
> > >
> > > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org>
> > > wrote:
> > >
> > > > Lots of different comments on this thread.  Here are few of my
> > thoughts:
> > > >
> > > > I think you need to differentiate the different tiers of interfaces a
> > > > little bit more.
> > > >
> > > > BufferAllocator.buffer() was meant to be a low level advanced
> > interface.
> > > > It should be used rarely and when done so, only with extreme care.
> > Most
> > > > devs shouldn't use it.  Having an advanced interface seems fine here.
> > > This
> > > > was how the interface was envisioned.   It looks like there are
> > currently
> > > > 16 classes that reference this code not including vectors.  (There
> > > probably
> > > > should be less, it looks like some code duplication contributes to
> > this.)
> > > >
> > > > All the vector classes also access this code.  Vectors have two
> > methods:
> > > > allocateNew and allocateNewSafe.  allocateNew has an exception
> behavior
> > > and
> > > > allocateNewSafe has a return value behavior.  These are main APIs
> that
> > > are
> > > > used extensively throughout the code and the ones we should be most
> > > focused
> > > > on.  I like giving the developer the choice of behavior (thus the two
> > > > options).  It sounds like others are saying to remove this choice
> > > (because
> > > > code is not managing it correctly).
> > > >
> > > > If there are bugs in the 16 classes on how they use the advanced
> > > interface,
> > > > this doesn't seem like a big challenge to correct rather than bulk
> > > > modifying the API.  If people are using the wrong method on the
> vector
> > > > classes (or the vector classes aren't correctly behaving), that
> doesn't
> > > > seem like something we should address at the allocator API level.
> > > >
> > > > Hakim, can you be more specific about your concern?  It doesn't seem
> > like
> > > > we're talking about thousands of places here.
> > > >
> > > > Regarding checked versus unchecked...  I'm strongly in support of
> using
> > > > checked exceptions for most things.  However, I'm against it for an
> out
> > > of
> > > > memory condition.  (Note how the JVM also treats this as an unchecked
> > > > exception.) It causes early catching when, in most cases, we really
> > want
> > > to
> > > > bubble way up the stack and fail the query.  I know of very few
> places
> > > > where we can actually locally manage an out of memory condition so
> why
> > > > write huge amounts of code the bubble the exception all the way to
> > where
> > > we
> > > > can actually handle it.
> > > >
> > > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > > Two thoughts here.  I'm not sure this is a meaningful message.  Where
> > we
> > > > run out of memory may have nothing to do with where we are using most
> > > > memory.  And if we did to decide this was useful, this can be done
> just
> > > as
> > > > easily with unchecked exceptions as checked exception.  What I'm want
> > to
> > > > avoid is adding throws OutOfMemoryException in 1000 places.  The
> > reality
> > > is
> > > > everybody should assume that an OutOfMemory can occur pretty much
> > > anytime.
> > > >
> > > > TL;DR
> > > >
> > > > In general, I think this discussion is too abstract and thus has the
> > > danger
> > > > of trending towards a religious discussion (I already see a bit of
> that
> > > > fervor in some of the responses.).  The number of places we are
> talking
> > > > about are not that large and are easy to find.  Let's solve this by
> > > looking
> > > > at specifics.  For example, if people think checked would be better,
> > > > someone should go through and make an example changeset for everyone
> to
> > > > review.  My guess is, the only that it works will be if very quickly,
> > the
> > > > handling code converts the checked exception into an unchecked
> > > > UserException.  But I'm more than happy to be proven wrong.
> > > >
> > > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <
> dbarclay@maprtech.com
> > >
> > > > wrote:
> > > >
> > > > > Hanifi GUNES wrote:
> > > > >
> > > > >> - We would end up having to add it to almost everything everywhere
> > > > >> Why would one propagate the checked exception for no reason? And
> why
> > > > would
> > > > >> one not propagate the exception for a good reason like
> robustness? I
> > > > agree
> > > > >> that one has to avoid propagating the checked exception for no
> > reason
> > > > >> however I support propagating it for a good reason.
> > > > >>
> > > > >> The added benefit of raising a checked exception is reminding as
> > well
> > > as
> > > > >> enforcing devs to handle it and be more cautious about this
> > particular
> > > > >> event. I find this compile-time safety check invaluable for
> > > robustness.
> > > > >>
> > > > >
> > > > > +(1 times <some large number>)
> > > > >
> > > > > Daniel
> > > > >
> > > > >
> > > > >
> > > > >>
> > > > >> - Or constantly wrapping it with RuntimeException to get around
> that
> > > > >> If it has to be done, I would recommend relying on a helper to do
> > so.
> > > > >> There
> > > > >> is not much of man-work involved here.
> > > > >>
> > > > >>
> > > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> > adeneche@maprtech.com
> > > >:
> > > > >>
> > > > >>  +1 to Hanifi's
> > > > >>>
> > > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > > >>> altekrusejason@gmail.com
> > > > >>>
> > > > >>>>
> > > > >>>>  wrote:
> > > > >>>
> > > > >>>  +1 to Hanifi's comments, I think it makes much more sense to
> have
> > a
> > > > >>>>
> > > > >>> number
> > > > >>>
> > > > >>>> of sites where the operators are explicitly catching a checked
> OOM
> > > > >>>> exception and either decide to handle it or produce a message
> like
> > > > "Hash
> > > > >>>> Aggregate does not support our of memory conditions". This would
> > be
> > > > >>>> particularly useful for debugging queries, as the user exception
> > can
> > > > >>>> provide context information about the current operation. This
> way
> > > > users
> > > > >>>>
> > > > >>> can
> > > > >>>
> > > > >>>> have some idea about the part of their query that might be
> causing
> > > an
> > > > >>>> excessive strain on system resources. I understand that there
> are
> > > also
> > > > >>>> cases where operators competing for memory can make it a toss up
> > to
> > > > >>>> which
> > > > >>>> will actually fail, but this would at least be a step to give
> more
> > > > >>>>
> > > > >>> detailed
> > > > >>>
> > > > >>>> information to users.
> > > > >>>>
> > > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org>
> > > wrote:
> > > > >>>>
> > > > >>>>  I would propose throwing a checked exception encouraging
> explicit
> > > and
> > > > >>>>> consistent handling of this event. Each sub-system has liberty
> to
> > > > >>>>>
> > > > >>>> decide
> > > > >>>
> > > > >>>> if
> > > > >>>>
> > > > >>>>> an OOM failure is fatal or non-fatal depending on its
> > capabilities.
> > > > >>>>>
> > > > >>>> Also
> > > > >>>
> > > > >>>> if
> > > > >>>>
> > > > >>>>> at some point a sub-system needs to communicate with its
> callers
> > > via
> > > > a
> > > > >>>>> different mechanism such like using flags (boolean, enum etc)
> or
> > > > >>>>>
> > > > >>>> raising
> > > > >>>
> > > > >>>> an
> > > > >>>>
> > > > >>>>> unchecked exception that's still fine, just handle the
> exception.
> > > If
> > > > >>>>>
> > > > >>>> there
> > > > >>>>
> > > > >>>>> is a need to suppress the checked exception that's fine too,
> just
> > > > use a
> > > > >>>>> helper method.
> > > > >>>>>
> > > > >>>>> Either way, returning *null* sounds problematic in many ways i)
> > it
> > > is
> > > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv)
> it
> > is
> > > > >>>>> semantically unclean to make null mean something - even worse
> > > > something
> > > > >>>>> context specific.
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> -Hanifi
> > > > >>>>>
> > > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > > adeneche@maprtech.com
> > > > >>>>>
> > > > >>>> :
> > > > >>>>
> > > > >>>>>
> > > > >>>>>  I guess that would fix the issue too. But we may still run
> into
> > > > >>>>>>
> > > > >>>>> situations
> > > > >>>>>
> > > > >>>>>> where the caller will pass a flag to "mute" the exception and
> > not
> > > > >>>>>>
> > > > >>>>> handle
> > > > >>>>
> > > > >>>>> the case anyway.
> > > > >>>>>>
> > > > >>>>>> If .buffer() unconditionally throws an exception, can't the
> > > caller,
> > > > >>>>>>
> > > > >>>>> who
> > > > >>>
> > > > >>>> wants to, just catch that and handle it properly ?
> > > > >>>>>>
> > > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > > >>>>>>
> > > > >>>>> chriswestin42@gmail.com>
> > > > >>>>
> > > > >>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>  No, but we should do something close to that.
> > > > >>>>>>>
> > > > >>>>>>> There are cases where the caller can handle the inability to
> > get
> > > > >>>>>>>
> > > > >>>>>> more
> > > > >>>
> > > > >>>> memory, and may be able to go to disk. However, you are correct
> > > > >>>>>>>
> > > > >>>>>> that
> > > > >>>
> > > > >>>> there
> > > > >>>>>>
> > > > >>>>>>> are many that can't handle an OOM, and that fail to check.
> > > > >>>>>>>
> > > > >>>>>>> Instead of unconditionally throwing
> > OutOfMemoryRuntimeException,
> > > I
> > > > >>>>>>>
> > > > >>>>>> would
> > > > >>>>>
> > > > >>>>>> suggest that the buffer() call take a flag that indicates
> > whether
> > > > >>>>>>>
> > > > >>>>>> or
> > > > >>>
> > > > >>>> not
> > > > >>>>>
> > > > >>>>>> it
> > > > >>>>>>
> > > > >>>>>>> should throw if it is unable to fulfill the request. This
> way,
> > > the
> > > > >>>>>>>
> > > > >>>>>> call
> > > > >>>>
> > > > >>>>> sites that can handle an OOM can pass in the flag to return
> null,
> > > > >>>>>>>
> > > > >>>>>> and
> > > > >>>
> > > > >>>> the
> > > > >>>>>
> > > > >>>>>> rest can pass in the flag value to throw, and not have to have
> > any
> > > > >>>>>>>
> > > > >>>>>> checking
> > > > >>>>>>
> > > > >>>>>>> code.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > >>>>>>> adeneche@maprtech.com
> > > > >>>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>  our current implementations of BufferAllocator.buffer(int,
> > int)
> > > > >>>>>>>>
> > > > >>>>>>> returns
> > > > >>>>>
> > > > >>>>>> null when it cannot allocate the buffer.
> > > > >>>>>>>>
> > > > >>>>>>>> But looking through the code, there are many places that
> don't
> > > > >>>>>>>>
> > > > >>>>>>> check
> > > > >>>>
> > > > >>>>> if
> > > > >>>>>
> > > > >>>>>> the
> > > > >>>>>>>
> > > > >>>>>>>> allocated buffer is null before trying to access it which
> will
> > > > >>>>>>>>
> > > > >>>>>>> throw
> > > > >>>>
> > > > >>>>> a
> > > > >>>>>
> > > > >>>>>> NullPointerException.
> > > > >>>>>>>>
> > > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place
> that
> > > > >>>>>>>>
> > > > >>>>>>> handle
> > > > >>>>
> > > > >>>>> the
> > > > >>>>>>
> > > > >>>>>>> null in a specific manner.
> > > > >>>>>>>>
> > > > >>>>>>>> Should we update the allocators' implementation to throw an
> > > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this
> > has
> > > > >>>>>>>>
> > > > >>>>>>> the
> > > > >>>>
> > > > >>>>> added
> > > > >>>>>>>
> > > > >>>>>>>> benefit of displaying a proper out of memory error message
> to
> > > the
> > > > >>>>>>>>
> > > > >>>>>>> user.
> > > > >>>>>
> > > > >>>>>>
> > > > >>>>>>>> Thanks!
> > > > >>>>>>>>
> > > > >>>>>>>> --
> > > > >>>>>>>>
> > > > >>>>>>>> Abdelhakim Deneche
> > > > >>>>>>>>
> > > > >>>>>>>> Software Engineer
> > > > >>>>>>>>
> > > > >>>>>>>>    <http://www.mapr.com/>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > > >>>>>>>> <
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>>
> > > > >>>>>> Abdelhakim Deneche
> > > > >>>>>>
> > > > >>>>>> Software Engineer
> > > > >>>>>>
> > > > >>>>>>    <http://www.mapr.com/>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > > >>>>>> <
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>>
> > > > >>> --
> > > > >>>
> > > > >>> Abdelhakim Deneche
> > > > >>>
> > > > >>> Software Engineer
> > > > >>>
> > > > >>>    <http://www.mapr.com/>
> > > > >>>
> > > > >>>
> > > > >>> Now Available - Free Hadoop On-Demand Training
> > > > >>> <
> > > > >>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > > > --
> > > > > Daniel Barclay
> > > > > MapR Technologies
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
Ok then, I will create a JIRA to track this change.

On Tue, Jun 30, 2015 at 4:29 PM, Jacques Nadeau <ja...@apache.org> wrote:

> You're right, that should be fine.
> On Jun 30, 2015 4:24 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
> wrote:
>
> > What would be the impact of making buffer() throw an
> > OutOfMemoryRuntimeException ? we'll need to change a few places where it
> > really needs to be handled (like in allocateNewSafe()) but other than
> that,
> > just letting the unchecked exception propagate should be fine, and Drill
> > will still display the proper error message.
> >
> > Of course if this happens in the RPC layer, we still need to handle it
> > properly (DRILL-3317 <https://issues.apache.org/jira/browse/DRILL-3317>)
> >
> > On Tue, Jun 30, 2015 at 4:12 PM, Hanifi Gunes <hg...@maprtech.com>
> wrote:
> >
> > > -I like giving the developer the choice of behavior (thus the two
> > options).
> > > It sounds like others are saying to remove this choice (because code is
> > not
> > > managing it correctly).
> > > I might be misunderstanding the context above but my understanding is
> > that
> > > the discussion is not particularly concerned about VVs nor adding,
> > removing
> > > methods to/from them. It is the behavior of Allocator#buffer when we
> run
> > > OOM. Current behavior is to return null to communicate this. Given
> > #buffer
> > > is used in very few classes, I think we can easily change this for good
> > > avoiding possible problems to come.
> > >
> > >
> > > - What I'm want to avoid is adding throws OutOfMemoryException in 1000
> > > places.
> > > Agreed.
> > >
> > >
> > >
> > > On Tue, Jun 30, 2015 at 3:54 PM, Abdel Hakim Deneche <
> > > adeneche@maprtech.com>
> > > wrote:
> > >
> > > > buffer(int) is mostly called from value vectors, but there are an
> > > > additional 10+ places where it's being used and most of them don't
> > bother
> > > > checking if the returned buffer is null.
> > > >
> > > > On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org>
> > > > wrote:
> > > >
> > > > > What is your sense of the use of this interface? I would hope new
> > uses
> > > > > would be scarce. Do you think this will be a frequent occurrence
> that
> > > > most
> > > > > devs will experience?
> > > > > On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <
> > adeneche@maprtech.com>
> > > > > wrote:
> > > > >
> > > > > > I started this discussion while fixing DRILL-3312
> > > > > > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me
> > quite
> > > > > some
> > > > > > time to debug and find the root of an error I was seeing. I first
> > > > thought
> > > > > > about fixing those few places that call buffer() without checking
> > if
> > > > the
> > > > > > buffer is null or not, but this won't prevent new code from
> > repeating
> > > > the
> > > > > > same mistakes.
> > > > > >
> > > > > >
> > > > > > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <
> > jacques@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Lots of different comments on this thread.  Here are few of my
> > > > > thoughts:
> > > > > > >
> > > > > > > I think you need to differentiate the different tiers of
> > > interfaces a
> > > > > > > little bit more.
> > > > > > >
> > > > > > > BufferAllocator.buffer() was meant to be a low level advanced
> > > > > interface.
> > > > > > > It should be used rarely and when done so, only with extreme
> > care.
> > > > > Most
> > > > > > > devs shouldn't use it.  Having an advanced interface seems fine
> > > here.
> > > > > > This
> > > > > > > was how the interface was envisioned.   It looks like there are
> > > > > currently
> > > > > > > 16 classes that reference this code not including vectors.
> > (There
> > > > > > probably
> > > > > > > should be less, it looks like some code duplication contributes
> > to
> > > > > this.)
> > > > > > >
> > > > > > > All the vector classes also access this code.  Vectors have two
> > > > > methods:
> > > > > > > allocateNew and allocateNewSafe.  allocateNew has an exception
> > > > behavior
> > > > > > and
> > > > > > > allocateNewSafe has a return value behavior.  These are main
> APIs
> > > > that
> > > > > > are
> > > > > > > used extensively throughout the code and the ones we should be
> > most
> > > > > > focused
> > > > > > > on.  I like giving the developer the choice of behavior (thus
> the
> > > two
> > > > > > > options).  It sounds like others are saying to remove this
> choice
> > > > > > (because
> > > > > > > code is not managing it correctly).
> > > > > > >
> > > > > > > If there are bugs in the 16 classes on how they use the
> advanced
> > > > > > interface,
> > > > > > > this doesn't seem like a big challenge to correct rather than
> > bulk
> > > > > > > modifying the API.  If people are using the wrong method on the
> > > > vector
> > > > > > > classes (or the vector classes aren't correctly behaving), that
> > > > doesn't
> > > > > > > seem like something we should address at the allocator API
> level.
> > > > > > >
> > > > > > > Hakim, can you be more specific about your concern?  It doesn't
> > > seem
> > > > > like
> > > > > > > we're talking about thousands of places here.
> > > > > > >
> > > > > > > Regarding checked versus unchecked...  I'm strongly in support
> of
> > > > using
> > > > > > > checked exceptions for most things.  However, I'm against it
> for
> > an
> > > > out
> > > > > > of
> > > > > > > memory condition.  (Note how the JVM also treats this as an
> > > unchecked
> > > > > > > exception.) It causes early catching when, in most cases, we
> > really
> > > > > want
> > > > > > to
> > > > > > > bubble way up the stack and fail the query.  I know of very few
> > > > places
> > > > > > > where we can actually locally manage an out of memory condition
> > so
> > > > why
> > > > > > > write huge amounts of code the bubble the exception all the way
> > to
> > > > > where
> > > > > > we
> > > > > > > can actually handle it.
> > > > > > >
> > > > > > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > > > > > Two thoughts here.  I'm not sure this is a meaningful message.
> > > Where
> > > > > we
> > > > > > > run out of memory may have nothing to do with where we are
> using
> > > most
> > > > > > > memory.  And if we did to decide this was useful, this can be
> > done
> > > > just
> > > > > > as
> > > > > > > easily with unchecked exceptions as checked exception.  What
> I'm
> > > want
> > > > > to
> > > > > > > avoid is adding throws OutOfMemoryException in 1000 places.
> The
> > > > > reality
> > > > > > is
> > > > > > > everybody should assume that an OutOfMemory can occur pretty
> much
> > > > > > anytime.
> > > > > > >
> > > > > > > TL;DR
> > > > > > >
> > > > > > > In general, I think this discussion is too abstract and thus
> has
> > > the
> > > > > > danger
> > > > > > > of trending towards a religious discussion (I already see a bit
> > of
> > > > that
> > > > > > > fervor in some of the responses.).  The number of places we are
> > > > talking
> > > > > > > about are not that large and are easy to find.  Let's solve
> this
> > by
> > > > > > looking
> > > > > > > at specifics.  For example, if people think checked would be
> > > better,
> > > > > > > someone should go through and make an example changeset for
> > > everyone
> > > > to
> > > > > > > review.  My guess is, the only that it works will be if very
> > > quickly,
> > > > > the
> > > > > > > handling code converts the checked exception into an unchecked
> > > > > > > UserException.  But I'm more than happy to be proven wrong.
> > > > > > >
> > > > > > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <
> > > > dbarclay@maprtech.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hanifi GUNES wrote:
> > > > > > > >
> > > > > > > >> - We would end up having to add it to almost everything
> > > everywhere
> > > > > > > >> Why would one propagate the checked exception for no reason?
> > And
> > > > why
> > > > > > > would
> > > > > > > >> one not propagate the exception for a good reason like
> > > > robustness? I
> > > > > > > agree
> > > > > > > >> that one has to avoid propagating the checked exception for
> no
> > > > > reason
> > > > > > > >> however I support propagating it for a good reason.
> > > > > > > >>
> > > > > > > >> The added benefit of raising a checked exception is
> reminding
> > as
> > > > > well
> > > > > > as
> > > > > > > >> enforcing devs to handle it and be more cautious about this
> > > > > particular
> > > > > > > >> event. I find this compile-time safety check invaluable for
> > > > > > robustness.
> > > > > > > >>
> > > > > > > >
> > > > > > > > +(1 times <some large number>)
> > > > > > > >
> > > > > > > > Daniel
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >>
> > > > > > > >> - Or constantly wrapping it with RuntimeException to get
> > around
> > > > that
> > > > > > > >> If it has to be done, I would recommend relying on a helper
> to
> > > do
> > > > > so.
> > > > > > > >> There
> > > > > > > >> is not much of man-work involved here.
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> > > > > adeneche@maprtech.com
> > > > > > >:
> > > > > > > >>
> > > > > > > >>  +1 to Hanifi's
> > > > > > > >>>
> > > > > > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > > > > > >>> altekrusejason@gmail.com
> > > > > > > >>>
> > > > > > > >>>>
> > > > > > > >>>>  wrote:
> > > > > > > >>>
> > > > > > > >>>  +1 to Hanifi's comments, I think it makes much more sense
> to
> > > > have
> > > > > a
> > > > > > > >>>>
> > > > > > > >>> number
> > > > > > > >>>
> > > > > > > >>>> of sites where the operators are explicitly catching a
> > checked
> > > > OOM
> > > > > > > >>>> exception and either decide to handle it or produce a
> > message
> > > > like
> > > > > > > "Hash
> > > > > > > >>>> Aggregate does not support our of memory conditions". This
> > > would
> > > > > be
> > > > > > > >>>> particularly useful for debugging queries, as the user
> > > exception
> > > > > can
> > > > > > > >>>> provide context information about the current operation.
> > This
> > > > way
> > > > > > > users
> > > > > > > >>>>
> > > > > > > >>> can
> > > > > > > >>>
> > > > > > > >>>> have some idea about the part of their query that might be
> > > > causing
> > > > > > an
> > > > > > > >>>> excessive strain on system resources. I understand that
> > there
> > > > are
> > > > > > also
> > > > > > > >>>> cases where operators competing for memory can make it a
> > toss
> > > up
> > > > > to
> > > > > > > >>>> which
> > > > > > > >>>> will actually fail, but this would at least be a step to
> > give
> > > > more
> > > > > > > >>>>
> > > > > > > >>> detailed
> > > > > > > >>>
> > > > > > > >>>> information to users.
> > > > > > > >>>>
> > > > > > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <
> > hg@apache.org>
> > > > > > wrote:
> > > > > > > >>>>
> > > > > > > >>>>  I would propose throwing a checked exception encouraging
> > > > explicit
> > > > > > and
> > > > > > > >>>>> consistent handling of this event. Each sub-system has
> > > liberty
> > > > to
> > > > > > > >>>>>
> > > > > > > >>>> decide
> > > > > > > >>>
> > > > > > > >>>> if
> > > > > > > >>>>
> > > > > > > >>>>> an OOM failure is fatal or non-fatal depending on its
> > > > > capabilities.
> > > > > > > >>>>>
> > > > > > > >>>> Also
> > > > > > > >>>
> > > > > > > >>>> if
> > > > > > > >>>>
> > > > > > > >>>>> at some point a sub-system needs to communicate with its
> > > > callers
> > > > > > via
> > > > > > > a
> > > > > > > >>>>> different mechanism such like using flags (boolean, enum
> > etc)
> > > > or
> > > > > > > >>>>>
> > > > > > > >>>> raising
> > > > > > > >>>
> > > > > > > >>>> an
> > > > > > > >>>>
> > > > > > > >>>>> unchecked exception that's still fine, just handle the
> > > > exception.
> > > > > > If
> > > > > > > >>>>>
> > > > > > > >>>> there
> > > > > > > >>>>
> > > > > > > >>>>> is a need to suppress the checked exception that's fine
> > too,
> > > > just
> > > > > > > use a
> > > > > > > >>>>> helper method.
> > > > > > > >>>>>
> > > > > > > >>>>> Either way, returning *null* sounds problematic in many
> > ways
> > > i)
> > > > > it
> > > > > > is
> > > > > > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive
> > iv)
> > > > it
> > > > > is
> > > > > > > >>>>> semantically unclean to make null mean something - even
> > worse
> > > > > > > something
> > > > > > > >>>>> context specific.
> > > > > > > >>>>>
> > > > > > > >>>>>
> > > > > > > >>>>> -Hanifi
> > > > > > > >>>>>
> > > > > > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > > > > > adeneche@maprtech.com
> > > > > > > >>>>>
> > > > > > > >>>> :
> > > > > > > >>>>
> > > > > > > >>>>>
> > > > > > > >>>>>  I guess that would fix the issue too. But we may still
> run
> > > > into
> > > > > > > >>>>>>
> > > > > > > >>>>> situations
> > > > > > > >>>>>
> > > > > > > >>>>>> where the caller will pass a flag to "mute" the
> exception
> > > and
> > > > > not
> > > > > > > >>>>>>
> > > > > > > >>>>> handle
> > > > > > > >>>>
> > > > > > > >>>>> the case anyway.
> > > > > > > >>>>>>
> > > > > > > >>>>>> If .buffer() unconditionally throws an exception, can't
> > the
> > > > > > caller,
> > > > > > > >>>>>>
> > > > > > > >>>>> who
> > > > > > > >>>
> > > > > > > >>>> wants to, just catch that and handle it properly ?
> > > > > > > >>>>>>
> > > > > > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > > > > > >>>>>>
> > > > > > > >>>>> chriswestin42@gmail.com>
> > > > > > > >>>>
> > > > > > > >>>>> wrote:
> > > > > > > >>>>>>
> > > > > > > >>>>>>  No, but we should do something close to that.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> There are cases where the caller can handle the
> inability
> > > to
> > > > > get
> > > > > > > >>>>>>>
> > > > > > > >>>>>> more
> > > > > > > >>>
> > > > > > > >>>> memory, and may be able to go to disk. However, you are
> > > correct
> > > > > > > >>>>>>>
> > > > > > > >>>>>> that
> > > > > > > >>>
> > > > > > > >>>> there
> > > > > > > >>>>>>
> > > > > > > >>>>>>> are many that can't handle an OOM, and that fail to
> > check.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Instead of unconditionally throwing
> > > > > OutOfMemoryRuntimeException,
> > > > > > I
> > > > > > > >>>>>>>
> > > > > > > >>>>>> would
> > > > > > > >>>>>
> > > > > > > >>>>>> suggest that the buffer() call take a flag that
> indicates
> > > > > whether
> > > > > > > >>>>>>>
> > > > > > > >>>>>> or
> > > > > > > >>>
> > > > > > > >>>> not
> > > > > > > >>>>>
> > > > > > > >>>>>> it
> > > > > > > >>>>>>
> > > > > > > >>>>>>> should throw if it is unable to fulfill the request.
> This
> > > > way,
> > > > > > the
> > > > > > > >>>>>>>
> > > > > > > >>>>>> call
> > > > > > > >>>>
> > > > > > > >>>>> sites that can handle an OOM can pass in the flag to
> return
> > > > null,
> > > > > > > >>>>>>>
> > > > > > > >>>>>> and
> > > > > > > >>>
> > > > > > > >>>> the
> > > > > > > >>>>>
> > > > > > > >>>>>> rest can pass in the flag value to throw, and not have
> to
> > > have
> > > > > any
> > > > > > > >>>>>>>
> > > > > > > >>>>>> checking
> > > > > > > >>>>>>
> > > > > > > >>>>>>> code.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > > > > >>>>>>> adeneche@maprtech.com
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>> wrote:
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>  our current implementations of
> > BufferAllocator.buffer(int,
> > > > > int)
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> returns
> > > > > > > >>>>>
> > > > > > > >>>>>> null when it cannot allocate the buffer.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> But looking through the code, there are many places
> that
> > > > don't
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> check
> > > > > > > >>>>
> > > > > > > >>>>> if
> > > > > > > >>>>>
> > > > > > > >>>>>> the
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>> allocated buffer is null before trying to access it
> > which
> > > > will
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> throw
> > > > > > > >>>>
> > > > > > > >>>>> a
> > > > > > > >>>>>
> > > > > > > >>>>>> NullPointerException.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only
> > place
> > > > that
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> handle
> > > > > > > >>>>
> > > > > > > >>>>> the
> > > > > > > >>>>>>
> > > > > > > >>>>>>> null in a specific manner.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> Should we update the allocators' implementation to
> throw
> > > an
> > > > > > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null
> ?
> > > this
> > > > > has
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> the
> > > > > > > >>>>
> > > > > > > >>>>> added
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>> benefit of displaying a proper out of memory error
> > message
> > > > to
> > > > > > the
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>> user.
> > > > > > > >>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>>>> Thanks!
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> --
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> Abdelhakim Deneche
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> Software Engineer
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>>    <http://www.mapr.com/>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > > > >>>>>>>> <
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > > >>>
> > > > > > > >>>>
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>> --
> > > > > > > >>>>>>
> > > > > > > >>>>>> Abdelhakim Deneche
> > > > > > > >>>>>>
> > > > > > > >>>>>> Software Engineer
> > > > > > > >>>>>>
> > > > > > > >>>>>>    <http://www.mapr.com/>
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > > > >>>>>> <
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > > >>>
> > > > > > > >>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> --
> > > > > > > >>>
> > > > > > > >>> Abdelhakim Deneche
> > > > > > > >>>
> > > > > > > >>> Software Engineer
> > > > > > > >>>
> > > > > > > >>>    <http://www.mapr.com/>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> Now Available - Free Hadoop On-Demand Training
> > > > > > > >>> <
> > > > > > > >>>
> > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > > >>>
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>
> > > > > > > >
> > > > > > > > --
> > > > > > > > Daniel Barclay
> > > > > > > > MapR Technologies
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Abdelhakim Deneche
> > > > > >
> > > > > > Software Engineer
> > > > > >
> > > > > >   <http://www.mapr.com/>
> > > > > >
> > > > > >
> > > > > > Now Available - Free Hadoop On-Demand Training
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Jacques Nadeau <ja...@apache.org>.
You're right, that should be fine.
On Jun 30, 2015 4:24 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
wrote:

> What would be the impact of making buffer() throw an
> OutOfMemoryRuntimeException ? we'll need to change a few places where it
> really needs to be handled (like in allocateNewSafe()) but other than that,
> just letting the unchecked exception propagate should be fine, and Drill
> will still display the proper error message.
>
> Of course if this happens in the RPC layer, we still need to handle it
> properly (DRILL-3317 <https://issues.apache.org/jira/browse/DRILL-3317>)
>
> On Tue, Jun 30, 2015 at 4:12 PM, Hanifi Gunes <hg...@maprtech.com> wrote:
>
> > -I like giving the developer the choice of behavior (thus the two
> options).
> > It sounds like others are saying to remove this choice (because code is
> not
> > managing it correctly).
> > I might be misunderstanding the context above but my understanding is
> that
> > the discussion is not particularly concerned about VVs nor adding,
> removing
> > methods to/from them. It is the behavior of Allocator#buffer when we run
> > OOM. Current behavior is to return null to communicate this. Given
> #buffer
> > is used in very few classes, I think we can easily change this for good
> > avoiding possible problems to come.
> >
> >
> > - What I'm want to avoid is adding throws OutOfMemoryException in 1000
> > places.
> > Agreed.
> >
> >
> >
> > On Tue, Jun 30, 2015 at 3:54 PM, Abdel Hakim Deneche <
> > adeneche@maprtech.com>
> > wrote:
> >
> > > buffer(int) is mostly called from value vectors, but there are an
> > > additional 10+ places where it's being used and most of them don't
> bother
> > > checking if the returned buffer is null.
> > >
> > > On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org>
> > > wrote:
> > >
> > > > What is your sense of the use of this interface? I would hope new
> uses
> > > > would be scarce. Do you think this will be a frequent occurrence that
> > > most
> > > > devs will experience?
> > > > On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <
> adeneche@maprtech.com>
> > > > wrote:
> > > >
> > > > > I started this discussion while fixing DRILL-3312
> > > > > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me
> quite
> > > > some
> > > > > time to debug and find the root of an error I was seeing. I first
> > > thought
> > > > > about fixing those few places that call buffer() without checking
> if
> > > the
> > > > > buffer is null or not, but this won't prevent new code from
> repeating
> > > the
> > > > > same mistakes.
> > > > >
> > > > >
> > > > > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <
> jacques@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Lots of different comments on this thread.  Here are few of my
> > > > thoughts:
> > > > > >
> > > > > > I think you need to differentiate the different tiers of
> > interfaces a
> > > > > > little bit more.
> > > > > >
> > > > > > BufferAllocator.buffer() was meant to be a low level advanced
> > > > interface.
> > > > > > It should be used rarely and when done so, only with extreme
> care.
> > > > Most
> > > > > > devs shouldn't use it.  Having an advanced interface seems fine
> > here.
> > > > > This
> > > > > > was how the interface was envisioned.   It looks like there are
> > > > currently
> > > > > > 16 classes that reference this code not including vectors.
> (There
> > > > > probably
> > > > > > should be less, it looks like some code duplication contributes
> to
> > > > this.)
> > > > > >
> > > > > > All the vector classes also access this code.  Vectors have two
> > > > methods:
> > > > > > allocateNew and allocateNewSafe.  allocateNew has an exception
> > > behavior
> > > > > and
> > > > > > allocateNewSafe has a return value behavior.  These are main APIs
> > > that
> > > > > are
> > > > > > used extensively throughout the code and the ones we should be
> most
> > > > > focused
> > > > > > on.  I like giving the developer the choice of behavior (thus the
> > two
> > > > > > options).  It sounds like others are saying to remove this choice
> > > > > (because
> > > > > > code is not managing it correctly).
> > > > > >
> > > > > > If there are bugs in the 16 classes on how they use the advanced
> > > > > interface,
> > > > > > this doesn't seem like a big challenge to correct rather than
> bulk
> > > > > > modifying the API.  If people are using the wrong method on the
> > > vector
> > > > > > classes (or the vector classes aren't correctly behaving), that
> > > doesn't
> > > > > > seem like something we should address at the allocator API level.
> > > > > >
> > > > > > Hakim, can you be more specific about your concern?  It doesn't
> > seem
> > > > like
> > > > > > we're talking about thousands of places here.
> > > > > >
> > > > > > Regarding checked versus unchecked...  I'm strongly in support of
> > > using
> > > > > > checked exceptions for most things.  However, I'm against it for
> an
> > > out
> > > > > of
> > > > > > memory condition.  (Note how the JVM also treats this as an
> > unchecked
> > > > > > exception.) It causes early catching when, in most cases, we
> really
> > > > want
> > > > > to
> > > > > > bubble way up the stack and fail the query.  I know of very few
> > > places
> > > > > > where we can actually locally manage an out of memory condition
> so
> > > why
> > > > > > write huge amounts of code the bubble the exception all the way
> to
> > > > where
> > > > > we
> > > > > > can actually handle it.
> > > > > >
> > > > > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > > > > Two thoughts here.  I'm not sure this is a meaningful message.
> > Where
> > > > we
> > > > > > run out of memory may have nothing to do with where we are using
> > most
> > > > > > memory.  And if we did to decide this was useful, this can be
> done
> > > just
> > > > > as
> > > > > > easily with unchecked exceptions as checked exception.  What I'm
> > want
> > > > to
> > > > > > avoid is adding throws OutOfMemoryException in 1000 places.  The
> > > > reality
> > > > > is
> > > > > > everybody should assume that an OutOfMemory can occur pretty much
> > > > > anytime.
> > > > > >
> > > > > > TL;DR
> > > > > >
> > > > > > In general, I think this discussion is too abstract and thus has
> > the
> > > > > danger
> > > > > > of trending towards a religious discussion (I already see a bit
> of
> > > that
> > > > > > fervor in some of the responses.).  The number of places we are
> > > talking
> > > > > > about are not that large and are easy to find.  Let's solve this
> by
> > > > > looking
> > > > > > at specifics.  For example, if people think checked would be
> > better,
> > > > > > someone should go through and make an example changeset for
> > everyone
> > > to
> > > > > > review.  My guess is, the only that it works will be if very
> > quickly,
> > > > the
> > > > > > handling code converts the checked exception into an unchecked
> > > > > > UserException.  But I'm more than happy to be proven wrong.
> > > > > >
> > > > > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <
> > > dbarclay@maprtech.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hanifi GUNES wrote:
> > > > > > >
> > > > > > >> - We would end up having to add it to almost everything
> > everywhere
> > > > > > >> Why would one propagate the checked exception for no reason?
> And
> > > why
> > > > > > would
> > > > > > >> one not propagate the exception for a good reason like
> > > robustness? I
> > > > > > agree
> > > > > > >> that one has to avoid propagating the checked exception for no
> > > > reason
> > > > > > >> however I support propagating it for a good reason.
> > > > > > >>
> > > > > > >> The added benefit of raising a checked exception is reminding
> as
> > > > well
> > > > > as
> > > > > > >> enforcing devs to handle it and be more cautious about this
> > > > particular
> > > > > > >> event. I find this compile-time safety check invaluable for
> > > > > robustness.
> > > > > > >>
> > > > > > >
> > > > > > > +(1 times <some large number>)
> > > > > > >
> > > > > > > Daniel
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >>
> > > > > > >> - Or constantly wrapping it with RuntimeException to get
> around
> > > that
> > > > > > >> If it has to be done, I would recommend relying on a helper to
> > do
> > > > so.
> > > > > > >> There
> > > > > > >> is not much of man-work involved here.
> > > > > > >>
> > > > > > >>
> > > > > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> > > > adeneche@maprtech.com
> > > > > >:
> > > > > > >>
> > > > > > >>  +1 to Hanifi's
> > > > > > >>>
> > > > > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > > > > >>> altekrusejason@gmail.com
> > > > > > >>>
> > > > > > >>>>
> > > > > > >>>>  wrote:
> > > > > > >>>
> > > > > > >>>  +1 to Hanifi's comments, I think it makes much more sense to
> > > have
> > > > a
> > > > > > >>>>
> > > > > > >>> number
> > > > > > >>>
> > > > > > >>>> of sites where the operators are explicitly catching a
> checked
> > > OOM
> > > > > > >>>> exception and either decide to handle it or produce a
> message
> > > like
> > > > > > "Hash
> > > > > > >>>> Aggregate does not support our of memory conditions". This
> > would
> > > > be
> > > > > > >>>> particularly useful for debugging queries, as the user
> > exception
> > > > can
> > > > > > >>>> provide context information about the current operation.
> This
> > > way
> > > > > > users
> > > > > > >>>>
> > > > > > >>> can
> > > > > > >>>
> > > > > > >>>> have some idea about the part of their query that might be
> > > causing
> > > > > an
> > > > > > >>>> excessive strain on system resources. I understand that
> there
> > > are
> > > > > also
> > > > > > >>>> cases where operators competing for memory can make it a
> toss
> > up
> > > > to
> > > > > > >>>> which
> > > > > > >>>> will actually fail, but this would at least be a step to
> give
> > > more
> > > > > > >>>>
> > > > > > >>> detailed
> > > > > > >>>
> > > > > > >>>> information to users.
> > > > > > >>>>
> > > > > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <
> hg@apache.org>
> > > > > wrote:
> > > > > > >>>>
> > > > > > >>>>  I would propose throwing a checked exception encouraging
> > > explicit
> > > > > and
> > > > > > >>>>> consistent handling of this event. Each sub-system has
> > liberty
> > > to
> > > > > > >>>>>
> > > > > > >>>> decide
> > > > > > >>>
> > > > > > >>>> if
> > > > > > >>>>
> > > > > > >>>>> an OOM failure is fatal or non-fatal depending on its
> > > > capabilities.
> > > > > > >>>>>
> > > > > > >>>> Also
> > > > > > >>>
> > > > > > >>>> if
> > > > > > >>>>
> > > > > > >>>>> at some point a sub-system needs to communicate with its
> > > callers
> > > > > via
> > > > > > a
> > > > > > >>>>> different mechanism such like using flags (boolean, enum
> etc)
> > > or
> > > > > > >>>>>
> > > > > > >>>> raising
> > > > > > >>>
> > > > > > >>>> an
> > > > > > >>>>
> > > > > > >>>>> unchecked exception that's still fine, just handle the
> > > exception.
> > > > > If
> > > > > > >>>>>
> > > > > > >>>> there
> > > > > > >>>>
> > > > > > >>>>> is a need to suppress the checked exception that's fine
> too,
> > > just
> > > > > > use a
> > > > > > >>>>> helper method.
> > > > > > >>>>>
> > > > > > >>>>> Either way, returning *null* sounds problematic in many
> ways
> > i)
> > > > it
> > > > > is
> > > > > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive
> iv)
> > > it
> > > > is
> > > > > > >>>>> semantically unclean to make null mean something - even
> worse
> > > > > > something
> > > > > > >>>>> context specific.
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> -Hanifi
> > > > > > >>>>>
> > > > > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > > > > adeneche@maprtech.com
> > > > > > >>>>>
> > > > > > >>>> :
> > > > > > >>>>
> > > > > > >>>>>
> > > > > > >>>>>  I guess that would fix the issue too. But we may still run
> > > into
> > > > > > >>>>>>
> > > > > > >>>>> situations
> > > > > > >>>>>
> > > > > > >>>>>> where the caller will pass a flag to "mute" the exception
> > and
> > > > not
> > > > > > >>>>>>
> > > > > > >>>>> handle
> > > > > > >>>>
> > > > > > >>>>> the case anyway.
> > > > > > >>>>>>
> > > > > > >>>>>> If .buffer() unconditionally throws an exception, can't
> the
> > > > > caller,
> > > > > > >>>>>>
> > > > > > >>>>> who
> > > > > > >>>
> > > > > > >>>> wants to, just catch that and handle it properly ?
> > > > > > >>>>>>
> > > > > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > > > > >>>>>>
> > > > > > >>>>> chriswestin42@gmail.com>
> > > > > > >>>>
> > > > > > >>>>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>  No, but we should do something close to that.
> > > > > > >>>>>>>
> > > > > > >>>>>>> There are cases where the caller can handle the inability
> > to
> > > > get
> > > > > > >>>>>>>
> > > > > > >>>>>> more
> > > > > > >>>
> > > > > > >>>> memory, and may be able to go to disk. However, you are
> > correct
> > > > > > >>>>>>>
> > > > > > >>>>>> that
> > > > > > >>>
> > > > > > >>>> there
> > > > > > >>>>>>
> > > > > > >>>>>>> are many that can't handle an OOM, and that fail to
> check.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Instead of unconditionally throwing
> > > > OutOfMemoryRuntimeException,
> > > > > I
> > > > > > >>>>>>>
> > > > > > >>>>>> would
> > > > > > >>>>>
> > > > > > >>>>>> suggest that the buffer() call take a flag that indicates
> > > > whether
> > > > > > >>>>>>>
> > > > > > >>>>>> or
> > > > > > >>>
> > > > > > >>>> not
> > > > > > >>>>>
> > > > > > >>>>>> it
> > > > > > >>>>>>
> > > > > > >>>>>>> should throw if it is unable to fulfill the request. This
> > > way,
> > > > > the
> > > > > > >>>>>>>
> > > > > > >>>>>> call
> > > > > > >>>>
> > > > > > >>>>> sites that can handle an OOM can pass in the flag to return
> > > null,
> > > > > > >>>>>>>
> > > > > > >>>>>> and
> > > > > > >>>
> > > > > > >>>> the
> > > > > > >>>>>
> > > > > > >>>>>> rest can pass in the flag value to throw, and not have to
> > have
> > > > any
> > > > > > >>>>>>>
> > > > > > >>>>>> checking
> > > > > > >>>>>>
> > > > > > >>>>>>> code.
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > > > >>>>>>> adeneche@maprtech.com
> > > > > > >>>>>>>
> > > > > > >>>>>>>> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>  our current implementations of
> BufferAllocator.buffer(int,
> > > > int)
> > > > > > >>>>>>>>
> > > > > > >>>>>>> returns
> > > > > > >>>>>
> > > > > > >>>>>> null when it cannot allocate the buffer.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> But looking through the code, there are many places that
> > > don't
> > > > > > >>>>>>>>
> > > > > > >>>>>>> check
> > > > > > >>>>
> > > > > > >>>>> if
> > > > > > >>>>>
> > > > > > >>>>>> the
> > > > > > >>>>>>>
> > > > > > >>>>>>>> allocated buffer is null before trying to access it
> which
> > > will
> > > > > > >>>>>>>>
> > > > > > >>>>>>> throw
> > > > > > >>>>
> > > > > > >>>>> a
> > > > > > >>>>>
> > > > > > >>>>>> NullPointerException.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only
> place
> > > that
> > > > > > >>>>>>>>
> > > > > > >>>>>>> handle
> > > > > > >>>>
> > > > > > >>>>> the
> > > > > > >>>>>>
> > > > > > >>>>>>> null in a specific manner.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Should we update the allocators' implementation to throw
> > an
> > > > > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ?
> > this
> > > > has
> > > > > > >>>>>>>>
> > > > > > >>>>>>> the
> > > > > > >>>>
> > > > > > >>>>> added
> > > > > > >>>>>>>
> > > > > > >>>>>>>> benefit of displaying a proper out of memory error
> message
> > > to
> > > > > the
> > > > > > >>>>>>>>
> > > > > > >>>>>>> user.
> > > > > > >>>>>
> > > > > > >>>>>>
> > > > > > >>>>>>>> Thanks!
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> --
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Abdelhakim Deneche
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Software Engineer
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>    <http://www.mapr.com/>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > > >>>>>>>> <
> > > > > > >>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > >>>
> > > > > > >>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> --
> > > > > > >>>>>>
> > > > > > >>>>>> Abdelhakim Deneche
> > > > > > >>>>>>
> > > > > > >>>>>> Software Engineer
> > > > > > >>>>>>
> > > > > > >>>>>>    <http://www.mapr.com/>
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > > >>>>>> <
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > >>>
> > > > > > >>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>>
> > > > > > >>> Abdelhakim Deneche
> > > > > > >>>
> > > > > > >>> Software Engineer
> > > > > > >>>
> > > > > > >>>    <http://www.mapr.com/>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Now Available - Free Hadoop On-Demand Training
> > > > > > >>> <
> > > > > > >>>
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > >>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Barclay
> > > > > > > MapR Technologies
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Abdelhakim Deneche
> > > > >
> > > > > Software Engineer
> > > > >
> > > > >   <http://www.mapr.com/>
> > > > >
> > > > >
> > > > > Now Available - Free Hadoop On-Demand Training
> > > > > <
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
What would be the impact of making buffer() throw an
OutOfMemoryRuntimeException ? we'll need to change a few places where it
really needs to be handled (like in allocateNewSafe()) but other than that,
just letting the unchecked exception propagate should be fine, and Drill
will still display the proper error message.

Of course if this happens in the RPC layer, we still need to handle it
properly (DRILL-3317 <https://issues.apache.org/jira/browse/DRILL-3317>)

On Tue, Jun 30, 2015 at 4:12 PM, Hanifi Gunes <hg...@maprtech.com> wrote:

> -I like giving the developer the choice of behavior (thus the two options).
> It sounds like others are saying to remove this choice (because code is not
> managing it correctly).
> I might be misunderstanding the context above but my understanding is that
> the discussion is not particularly concerned about VVs nor adding, removing
> methods to/from them. It is the behavior of Allocator#buffer when we run
> OOM. Current behavior is to return null to communicate this. Given #buffer
> is used in very few classes, I think we can easily change this for good
> avoiding possible problems to come.
>
>
> - What I'm want to avoid is adding throws OutOfMemoryException in 1000
> places.
> Agreed.
>
>
>
> On Tue, Jun 30, 2015 at 3:54 PM, Abdel Hakim Deneche <
> adeneche@maprtech.com>
> wrote:
>
> > buffer(int) is mostly called from value vectors, but there are an
> > additional 10+ places where it's being used and most of them don't bother
> > checking if the returned buffer is null.
> >
> > On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org>
> > wrote:
> >
> > > What is your sense of the use of this interface? I would hope new uses
> > > would be scarce. Do you think this will be a frequent occurrence that
> > most
> > > devs will experience?
> > > On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
> > > wrote:
> > >
> > > > I started this discussion while fixing DRILL-3312
> > > > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite
> > > some
> > > > time to debug and find the root of an error I was seeing. I first
> > thought
> > > > about fixing those few places that call buffer() without checking if
> > the
> > > > buffer is null or not, but this won't prevent new code from repeating
> > the
> > > > same mistakes.
> > > >
> > > >
> > > > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org>
> > > > wrote:
> > > >
> > > > > Lots of different comments on this thread.  Here are few of my
> > > thoughts:
> > > > >
> > > > > I think you need to differentiate the different tiers of
> interfaces a
> > > > > little bit more.
> > > > >
> > > > > BufferAllocator.buffer() was meant to be a low level advanced
> > > interface.
> > > > > It should be used rarely and when done so, only with extreme care.
> > > Most
> > > > > devs shouldn't use it.  Having an advanced interface seems fine
> here.
> > > > This
> > > > > was how the interface was envisioned.   It looks like there are
> > > currently
> > > > > 16 classes that reference this code not including vectors.  (There
> > > > probably
> > > > > should be less, it looks like some code duplication contributes to
> > > this.)
> > > > >
> > > > > All the vector classes also access this code.  Vectors have two
> > > methods:
> > > > > allocateNew and allocateNewSafe.  allocateNew has an exception
> > behavior
> > > > and
> > > > > allocateNewSafe has a return value behavior.  These are main APIs
> > that
> > > > are
> > > > > used extensively throughout the code and the ones we should be most
> > > > focused
> > > > > on.  I like giving the developer the choice of behavior (thus the
> two
> > > > > options).  It sounds like others are saying to remove this choice
> > > > (because
> > > > > code is not managing it correctly).
> > > > >
> > > > > If there are bugs in the 16 classes on how they use the advanced
> > > > interface,
> > > > > this doesn't seem like a big challenge to correct rather than bulk
> > > > > modifying the API.  If people are using the wrong method on the
> > vector
> > > > > classes (or the vector classes aren't correctly behaving), that
> > doesn't
> > > > > seem like something we should address at the allocator API level.
> > > > >
> > > > > Hakim, can you be more specific about your concern?  It doesn't
> seem
> > > like
> > > > > we're talking about thousands of places here.
> > > > >
> > > > > Regarding checked versus unchecked...  I'm strongly in support of
> > using
> > > > > checked exceptions for most things.  However, I'm against it for an
> > out
> > > > of
> > > > > memory condition.  (Note how the JVM also treats this as an
> unchecked
> > > > > exception.) It causes early catching when, in most cases, we really
> > > want
> > > > to
> > > > > bubble way up the stack and fail the query.  I know of very few
> > places
> > > > > where we can actually locally manage an out of memory condition so
> > why
> > > > > write huge amounts of code the bubble the exception all the way to
> > > where
> > > > we
> > > > > can actually handle it.
> > > > >
> > > > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > > > Two thoughts here.  I'm not sure this is a meaningful message.
> Where
> > > we
> > > > > run out of memory may have nothing to do with where we are using
> most
> > > > > memory.  And if we did to decide this was useful, this can be done
> > just
> > > > as
> > > > > easily with unchecked exceptions as checked exception.  What I'm
> want
> > > to
> > > > > avoid is adding throws OutOfMemoryException in 1000 places.  The
> > > reality
> > > > is
> > > > > everybody should assume that an OutOfMemory can occur pretty much
> > > > anytime.
> > > > >
> > > > > TL;DR
> > > > >
> > > > > In general, I think this discussion is too abstract and thus has
> the
> > > > danger
> > > > > of trending towards a religious discussion (I already see a bit of
> > that
> > > > > fervor in some of the responses.).  The number of places we are
> > talking
> > > > > about are not that large and are easy to find.  Let's solve this by
> > > > looking
> > > > > at specifics.  For example, if people think checked would be
> better,
> > > > > someone should go through and make an example changeset for
> everyone
> > to
> > > > > review.  My guess is, the only that it works will be if very
> quickly,
> > > the
> > > > > handling code converts the checked exception into an unchecked
> > > > > UserException.  But I'm more than happy to be proven wrong.
> > > > >
> > > > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <
> > dbarclay@maprtech.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hanifi GUNES wrote:
> > > > > >
> > > > > >> - We would end up having to add it to almost everything
> everywhere
> > > > > >> Why would one propagate the checked exception for no reason? And
> > why
> > > > > would
> > > > > >> one not propagate the exception for a good reason like
> > robustness? I
> > > > > agree
> > > > > >> that one has to avoid propagating the checked exception for no
> > > reason
> > > > > >> however I support propagating it for a good reason.
> > > > > >>
> > > > > >> The added benefit of raising a checked exception is reminding as
> > > well
> > > > as
> > > > > >> enforcing devs to handle it and be more cautious about this
> > > particular
> > > > > >> event. I find this compile-time safety check invaluable for
> > > > robustness.
> > > > > >>
> > > > > >
> > > > > > +(1 times <some large number>)
> > > > > >
> > > > > > Daniel
> > > > > >
> > > > > >
> > > > > >
> > > > > >>
> > > > > >> - Or constantly wrapping it with RuntimeException to get around
> > that
> > > > > >> If it has to be done, I would recommend relying on a helper to
> do
> > > so.
> > > > > >> There
> > > > > >> is not much of man-work involved here.
> > > > > >>
> > > > > >>
> > > > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> > > adeneche@maprtech.com
> > > > >:
> > > > > >>
> > > > > >>  +1 to Hanifi's
> > > > > >>>
> > > > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > > > >>> altekrusejason@gmail.com
> > > > > >>>
> > > > > >>>>
> > > > > >>>>  wrote:
> > > > > >>>
> > > > > >>>  +1 to Hanifi's comments, I think it makes much more sense to
> > have
> > > a
> > > > > >>>>
> > > > > >>> number
> > > > > >>>
> > > > > >>>> of sites where the operators are explicitly catching a checked
> > OOM
> > > > > >>>> exception and either decide to handle it or produce a message
> > like
> > > > > "Hash
> > > > > >>>> Aggregate does not support our of memory conditions". This
> would
> > > be
> > > > > >>>> particularly useful for debugging queries, as the user
> exception
> > > can
> > > > > >>>> provide context information about the current operation. This
> > way
> > > > > users
> > > > > >>>>
> > > > > >>> can
> > > > > >>>
> > > > > >>>> have some idea about the part of their query that might be
> > causing
> > > > an
> > > > > >>>> excessive strain on system resources. I understand that there
> > are
> > > > also
> > > > > >>>> cases where operators competing for memory can make it a toss
> up
> > > to
> > > > > >>>> which
> > > > > >>>> will actually fail, but this would at least be a step to give
> > more
> > > > > >>>>
> > > > > >>> detailed
> > > > > >>>
> > > > > >>>> information to users.
> > > > > >>>>
> > > > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org>
> > > > wrote:
> > > > > >>>>
> > > > > >>>>  I would propose throwing a checked exception encouraging
> > explicit
> > > > and
> > > > > >>>>> consistent handling of this event. Each sub-system has
> liberty
> > to
> > > > > >>>>>
> > > > > >>>> decide
> > > > > >>>
> > > > > >>>> if
> > > > > >>>>
> > > > > >>>>> an OOM failure is fatal or non-fatal depending on its
> > > capabilities.
> > > > > >>>>>
> > > > > >>>> Also
> > > > > >>>
> > > > > >>>> if
> > > > > >>>>
> > > > > >>>>> at some point a sub-system needs to communicate with its
> > callers
> > > > via
> > > > > a
> > > > > >>>>> different mechanism such like using flags (boolean, enum etc)
> > or
> > > > > >>>>>
> > > > > >>>> raising
> > > > > >>>
> > > > > >>>> an
> > > > > >>>>
> > > > > >>>>> unchecked exception that's still fine, just handle the
> > exception.
> > > > If
> > > > > >>>>>
> > > > > >>>> there
> > > > > >>>>
> > > > > >>>>> is a need to suppress the checked exception that's fine too,
> > just
> > > > > use a
> > > > > >>>>> helper method.
> > > > > >>>>>
> > > > > >>>>> Either way, returning *null* sounds problematic in many ways
> i)
> > > it
> > > > is
> > > > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv)
> > it
> > > is
> > > > > >>>>> semantically unclean to make null mean something - even worse
> > > > > something
> > > > > >>>>> context specific.
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> -Hanifi
> > > > > >>>>>
> > > > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > > > adeneche@maprtech.com
> > > > > >>>>>
> > > > > >>>> :
> > > > > >>>>
> > > > > >>>>>
> > > > > >>>>>  I guess that would fix the issue too. But we may still run
> > into
> > > > > >>>>>>
> > > > > >>>>> situations
> > > > > >>>>>
> > > > > >>>>>> where the caller will pass a flag to "mute" the exception
> and
> > > not
> > > > > >>>>>>
> > > > > >>>>> handle
> > > > > >>>>
> > > > > >>>>> the case anyway.
> > > > > >>>>>>
> > > > > >>>>>> If .buffer() unconditionally throws an exception, can't the
> > > > caller,
> > > > > >>>>>>
> > > > > >>>>> who
> > > > > >>>
> > > > > >>>> wants to, just catch that and handle it properly ?
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > > > >>>>>>
> > > > > >>>>> chriswestin42@gmail.com>
> > > > > >>>>
> > > > > >>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>  No, but we should do something close to that.
> > > > > >>>>>>>
> > > > > >>>>>>> There are cases where the caller can handle the inability
> to
> > > get
> > > > > >>>>>>>
> > > > > >>>>>> more
> > > > > >>>
> > > > > >>>> memory, and may be able to go to disk. However, you are
> correct
> > > > > >>>>>>>
> > > > > >>>>>> that
> > > > > >>>
> > > > > >>>> there
> > > > > >>>>>>
> > > > > >>>>>>> are many that can't handle an OOM, and that fail to check.
> > > > > >>>>>>>
> > > > > >>>>>>> Instead of unconditionally throwing
> > > OutOfMemoryRuntimeException,
> > > > I
> > > > > >>>>>>>
> > > > > >>>>>> would
> > > > > >>>>>
> > > > > >>>>>> suggest that the buffer() call take a flag that indicates
> > > whether
> > > > > >>>>>>>
> > > > > >>>>>> or
> > > > > >>>
> > > > > >>>> not
> > > > > >>>>>
> > > > > >>>>>> it
> > > > > >>>>>>
> > > > > >>>>>>> should throw if it is unable to fulfill the request. This
> > way,
> > > > the
> > > > > >>>>>>>
> > > > > >>>>>> call
> > > > > >>>>
> > > > > >>>>> sites that can handle an OOM can pass in the flag to return
> > null,
> > > > > >>>>>>>
> > > > > >>>>>> and
> > > > > >>>
> > > > > >>>> the
> > > > > >>>>>
> > > > > >>>>>> rest can pass in the flag value to throw, and not have to
> have
> > > any
> > > > > >>>>>>>
> > > > > >>>>>> checking
> > > > > >>>>>>
> > > > > >>>>>>> code.
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > > >>>>>>> adeneche@maprtech.com
> > > > > >>>>>>>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>  our current implementations of BufferAllocator.buffer(int,
> > > int)
> > > > > >>>>>>>>
> > > > > >>>>>>> returns
> > > > > >>>>>
> > > > > >>>>>> null when it cannot allocate the buffer.
> > > > > >>>>>>>>
> > > > > >>>>>>>> But looking through the code, there are many places that
> > don't
> > > > > >>>>>>>>
> > > > > >>>>>>> check
> > > > > >>>>
> > > > > >>>>> if
> > > > > >>>>>
> > > > > >>>>>> the
> > > > > >>>>>>>
> > > > > >>>>>>>> allocated buffer is null before trying to access it which
> > will
> > > > > >>>>>>>>
> > > > > >>>>>>> throw
> > > > > >>>>
> > > > > >>>>> a
> > > > > >>>>>
> > > > > >>>>>> NullPointerException.
> > > > > >>>>>>>>
> > > > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place
> > that
> > > > > >>>>>>>>
> > > > > >>>>>>> handle
> > > > > >>>>
> > > > > >>>>> the
> > > > > >>>>>>
> > > > > >>>>>>> null in a specific manner.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Should we update the allocators' implementation to throw
> an
> > > > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ?
> this
> > > has
> > > > > >>>>>>>>
> > > > > >>>>>>> the
> > > > > >>>>
> > > > > >>>>> added
> > > > > >>>>>>>
> > > > > >>>>>>>> benefit of displaying a proper out of memory error message
> > to
> > > > the
> > > > > >>>>>>>>
> > > > > >>>>>>> user.
> > > > > >>>>>
> > > > > >>>>>>
> > > > > >>>>>>>> Thanks!
> > > > > >>>>>>>>
> > > > > >>>>>>>> --
> > > > > >>>>>>>>
> > > > > >>>>>>>> Abdelhakim Deneche
> > > > > >>>>>>>>
> > > > > >>>>>>>> Software Engineer
> > > > > >>>>>>>>
> > > > > >>>>>>>>    <http://www.mapr.com/>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > >>>>>>>> <
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > >>>
> > > > > >>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> --
> > > > > >>>>>>
> > > > > >>>>>> Abdelhakim Deneche
> > > > > >>>>>>
> > > > > >>>>>> Software Engineer
> > > > > >>>>>>
> > > > > >>>>>>    <http://www.mapr.com/>
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > > > >>>>>> <
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > >>>
> > > > > >>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>> --
> > > > > >>>
> > > > > >>> Abdelhakim Deneche
> > > > > >>>
> > > > > >>> Software Engineer
> > > > > >>>
> > > > > >>>    <http://www.mapr.com/>
> > > > > >>>
> > > > > >>>
> > > > > >>> Now Available - Free Hadoop On-Demand Training
> > > > > >>> <
> > > > > >>>
> > > > > >>>
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > >>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > > > --
> > > > > > Daniel Barclay
> > > > > > MapR Technologies
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Hanifi Gunes <hg...@maprtech.com>.
-I like giving the developer the choice of behavior (thus the two options).
It sounds like others are saying to remove this choice (because code is not
managing it correctly).
I might be misunderstanding the context above but my understanding is that
the discussion is not particularly concerned about VVs nor adding, removing
methods to/from them. It is the behavior of Allocator#buffer when we run
OOM. Current behavior is to return null to communicate this. Given #buffer
is used in very few classes, I think we can easily change this for good
avoiding possible problems to come.


- What I'm want to avoid is adding throws OutOfMemoryException in 1000
places.
Agreed.



On Tue, Jun 30, 2015 at 3:54 PM, Abdel Hakim Deneche <ad...@maprtech.com>
wrote:

> buffer(int) is mostly called from value vectors, but there are an
> additional 10+ places where it's being used and most of them don't bother
> checking if the returned buffer is null.
>
> On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org>
> wrote:
>
> > What is your sense of the use of this interface? I would hope new uses
> > would be scarce. Do you think this will be a frequent occurrence that
> most
> > devs will experience?
> > On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
> > wrote:
> >
> > > I started this discussion while fixing DRILL-3312
> > > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite
> > some
> > > time to debug and find the root of an error I was seeing. I first
> thought
> > > about fixing those few places that call buffer() without checking if
> the
> > > buffer is null or not, but this won't prevent new code from repeating
> the
> > > same mistakes.
> > >
> > >
> > > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org>
> > > wrote:
> > >
> > > > Lots of different comments on this thread.  Here are few of my
> > thoughts:
> > > >
> > > > I think you need to differentiate the different tiers of interfaces a
> > > > little bit more.
> > > >
> > > > BufferAllocator.buffer() was meant to be a low level advanced
> > interface.
> > > > It should be used rarely and when done so, only with extreme care.
> > Most
> > > > devs shouldn't use it.  Having an advanced interface seems fine here.
> > > This
> > > > was how the interface was envisioned.   It looks like there are
> > currently
> > > > 16 classes that reference this code not including vectors.  (There
> > > probably
> > > > should be less, it looks like some code duplication contributes to
> > this.)
> > > >
> > > > All the vector classes also access this code.  Vectors have two
> > methods:
> > > > allocateNew and allocateNewSafe.  allocateNew has an exception
> behavior
> > > and
> > > > allocateNewSafe has a return value behavior.  These are main APIs
> that
> > > are
> > > > used extensively throughout the code and the ones we should be most
> > > focused
> > > > on.  I like giving the developer the choice of behavior (thus the two
> > > > options).  It sounds like others are saying to remove this choice
> > > (because
> > > > code is not managing it correctly).
> > > >
> > > > If there are bugs in the 16 classes on how they use the advanced
> > > interface,
> > > > this doesn't seem like a big challenge to correct rather than bulk
> > > > modifying the API.  If people are using the wrong method on the
> vector
> > > > classes (or the vector classes aren't correctly behaving), that
> doesn't
> > > > seem like something we should address at the allocator API level.
> > > >
> > > > Hakim, can you be more specific about your concern?  It doesn't seem
> > like
> > > > we're talking about thousands of places here.
> > > >
> > > > Regarding checked versus unchecked...  I'm strongly in support of
> using
> > > > checked exceptions for most things.  However, I'm against it for an
> out
> > > of
> > > > memory condition.  (Note how the JVM also treats this as an unchecked
> > > > exception.) It causes early catching when, in most cases, we really
> > want
> > > to
> > > > bubble way up the stack and fail the query.  I know of very few
> places
> > > > where we can actually locally manage an out of memory condition so
> why
> > > > write huge amounts of code the bubble the exception all the way to
> > where
> > > we
> > > > can actually handle it.
> > > >
> > > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > > Two thoughts here.  I'm not sure this is a meaningful message.  Where
> > we
> > > > run out of memory may have nothing to do with where we are using most
> > > > memory.  And if we did to decide this was useful, this can be done
> just
> > > as
> > > > easily with unchecked exceptions as checked exception.  What I'm want
> > to
> > > > avoid is adding throws OutOfMemoryException in 1000 places.  The
> > reality
> > > is
> > > > everybody should assume that an OutOfMemory can occur pretty much
> > > anytime.
> > > >
> > > > TL;DR
> > > >
> > > > In general, I think this discussion is too abstract and thus has the
> > > danger
> > > > of trending towards a religious discussion (I already see a bit of
> that
> > > > fervor in some of the responses.).  The number of places we are
> talking
> > > > about are not that large and are easy to find.  Let's solve this by
> > > looking
> > > > at specifics.  For example, if people think checked would be better,
> > > > someone should go through and make an example changeset for everyone
> to
> > > > review.  My guess is, the only that it works will be if very quickly,
> > the
> > > > handling code converts the checked exception into an unchecked
> > > > UserException.  But I'm more than happy to be proven wrong.
> > > >
> > > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <
> dbarclay@maprtech.com
> > >
> > > > wrote:
> > > >
> > > > > Hanifi GUNES wrote:
> > > > >
> > > > >> - We would end up having to add it to almost everything everywhere
> > > > >> Why would one propagate the checked exception for no reason? And
> why
> > > > would
> > > > >> one not propagate the exception for a good reason like
> robustness? I
> > > > agree
> > > > >> that one has to avoid propagating the checked exception for no
> > reason
> > > > >> however I support propagating it for a good reason.
> > > > >>
> > > > >> The added benefit of raising a checked exception is reminding as
> > well
> > > as
> > > > >> enforcing devs to handle it and be more cautious about this
> > particular
> > > > >> event. I find this compile-time safety check invaluable for
> > > robustness.
> > > > >>
> > > > >
> > > > > +(1 times <some large number>)
> > > > >
> > > > > Daniel
> > > > >
> > > > >
> > > > >
> > > > >>
> > > > >> - Or constantly wrapping it with RuntimeException to get around
> that
> > > > >> If it has to be done, I would recommend relying on a helper to do
> > so.
> > > > >> There
> > > > >> is not much of man-work involved here.
> > > > >>
> > > > >>
> > > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> > adeneche@maprtech.com
> > > >:
> > > > >>
> > > > >>  +1 to Hanifi's
> > > > >>>
> > > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > > >>> altekrusejason@gmail.com
> > > > >>>
> > > > >>>>
> > > > >>>>  wrote:
> > > > >>>
> > > > >>>  +1 to Hanifi's comments, I think it makes much more sense to
> have
> > a
> > > > >>>>
> > > > >>> number
> > > > >>>
> > > > >>>> of sites where the operators are explicitly catching a checked
> OOM
> > > > >>>> exception and either decide to handle it or produce a message
> like
> > > > "Hash
> > > > >>>> Aggregate does not support our of memory conditions". This would
> > be
> > > > >>>> particularly useful for debugging queries, as the user exception
> > can
> > > > >>>> provide context information about the current operation. This
> way
> > > > users
> > > > >>>>
> > > > >>> can
> > > > >>>
> > > > >>>> have some idea about the part of their query that might be
> causing
> > > an
> > > > >>>> excessive strain on system resources. I understand that there
> are
> > > also
> > > > >>>> cases where operators competing for memory can make it a toss up
> > to
> > > > >>>> which
> > > > >>>> will actually fail, but this would at least be a step to give
> more
> > > > >>>>
> > > > >>> detailed
> > > > >>>
> > > > >>>> information to users.
> > > > >>>>
> > > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org>
> > > wrote:
> > > > >>>>
> > > > >>>>  I would propose throwing a checked exception encouraging
> explicit
> > > and
> > > > >>>>> consistent handling of this event. Each sub-system has liberty
> to
> > > > >>>>>
> > > > >>>> decide
> > > > >>>
> > > > >>>> if
> > > > >>>>
> > > > >>>>> an OOM failure is fatal or non-fatal depending on its
> > capabilities.
> > > > >>>>>
> > > > >>>> Also
> > > > >>>
> > > > >>>> if
> > > > >>>>
> > > > >>>>> at some point a sub-system needs to communicate with its
> callers
> > > via
> > > > a
> > > > >>>>> different mechanism such like using flags (boolean, enum etc)
> or
> > > > >>>>>
> > > > >>>> raising
> > > > >>>
> > > > >>>> an
> > > > >>>>
> > > > >>>>> unchecked exception that's still fine, just handle the
> exception.
> > > If
> > > > >>>>>
> > > > >>>> there
> > > > >>>>
> > > > >>>>> is a need to suppress the checked exception that's fine too,
> just
> > > > use a
> > > > >>>>> helper method.
> > > > >>>>>
> > > > >>>>> Either way, returning *null* sounds problematic in many ways i)
> > it
> > > is
> > > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv)
> it
> > is
> > > > >>>>> semantically unclean to make null mean something - even worse
> > > > something
> > > > >>>>> context specific.
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> -Hanifi
> > > > >>>>>
> > > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > > adeneche@maprtech.com
> > > > >>>>>
> > > > >>>> :
> > > > >>>>
> > > > >>>>>
> > > > >>>>>  I guess that would fix the issue too. But we may still run
> into
> > > > >>>>>>
> > > > >>>>> situations
> > > > >>>>>
> > > > >>>>>> where the caller will pass a flag to "mute" the exception and
> > not
> > > > >>>>>>
> > > > >>>>> handle
> > > > >>>>
> > > > >>>>> the case anyway.
> > > > >>>>>>
> > > > >>>>>> If .buffer() unconditionally throws an exception, can't the
> > > caller,
> > > > >>>>>>
> > > > >>>>> who
> > > > >>>
> > > > >>>> wants to, just catch that and handle it properly ?
> > > > >>>>>>
> > > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > > >>>>>>
> > > > >>>>> chriswestin42@gmail.com>
> > > > >>>>
> > > > >>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>  No, but we should do something close to that.
> > > > >>>>>>>
> > > > >>>>>>> There are cases where the caller can handle the inability to
> > get
> > > > >>>>>>>
> > > > >>>>>> more
> > > > >>>
> > > > >>>> memory, and may be able to go to disk. However, you are correct
> > > > >>>>>>>
> > > > >>>>>> that
> > > > >>>
> > > > >>>> there
> > > > >>>>>>
> > > > >>>>>>> are many that can't handle an OOM, and that fail to check.
> > > > >>>>>>>
> > > > >>>>>>> Instead of unconditionally throwing
> > OutOfMemoryRuntimeException,
> > > I
> > > > >>>>>>>
> > > > >>>>>> would
> > > > >>>>>
> > > > >>>>>> suggest that the buffer() call take a flag that indicates
> > whether
> > > > >>>>>>>
> > > > >>>>>> or
> > > > >>>
> > > > >>>> not
> > > > >>>>>
> > > > >>>>>> it
> > > > >>>>>>
> > > > >>>>>>> should throw if it is unable to fulfill the request. This
> way,
> > > the
> > > > >>>>>>>
> > > > >>>>>> call
> > > > >>>>
> > > > >>>>> sites that can handle an OOM can pass in the flag to return
> null,
> > > > >>>>>>>
> > > > >>>>>> and
> > > > >>>
> > > > >>>> the
> > > > >>>>>
> > > > >>>>>> rest can pass in the flag value to throw, and not have to have
> > any
> > > > >>>>>>>
> > > > >>>>>> checking
> > > > >>>>>>
> > > > >>>>>>> code.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > >>>>>>> adeneche@maprtech.com
> > > > >>>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>  our current implementations of BufferAllocator.buffer(int,
> > int)
> > > > >>>>>>>>
> > > > >>>>>>> returns
> > > > >>>>>
> > > > >>>>>> null when it cannot allocate the buffer.
> > > > >>>>>>>>
> > > > >>>>>>>> But looking through the code, there are many places that
> don't
> > > > >>>>>>>>
> > > > >>>>>>> check
> > > > >>>>
> > > > >>>>> if
> > > > >>>>>
> > > > >>>>>> the
> > > > >>>>>>>
> > > > >>>>>>>> allocated buffer is null before trying to access it which
> will
> > > > >>>>>>>>
> > > > >>>>>>> throw
> > > > >>>>
> > > > >>>>> a
> > > > >>>>>
> > > > >>>>>> NullPointerException.
> > > > >>>>>>>>
> > > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place
> that
> > > > >>>>>>>>
> > > > >>>>>>> handle
> > > > >>>>
> > > > >>>>> the
> > > > >>>>>>
> > > > >>>>>>> null in a specific manner.
> > > > >>>>>>>>
> > > > >>>>>>>> Should we update the allocators' implementation to throw an
> > > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this
> > has
> > > > >>>>>>>>
> > > > >>>>>>> the
> > > > >>>>
> > > > >>>>> added
> > > > >>>>>>>
> > > > >>>>>>>> benefit of displaying a proper out of memory error message
> to
> > > the
> > > > >>>>>>>>
> > > > >>>>>>> user.
> > > > >>>>>
> > > > >>>>>>
> > > > >>>>>>>> Thanks!
> > > > >>>>>>>>
> > > > >>>>>>>> --
> > > > >>>>>>>>
> > > > >>>>>>>> Abdelhakim Deneche
> > > > >>>>>>>>
> > > > >>>>>>>> Software Engineer
> > > > >>>>>>>>
> > > > >>>>>>>>    <http://www.mapr.com/>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > > >>>>>>>> <
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>>
> > > > >>>>>> Abdelhakim Deneche
> > > > >>>>>>
> > > > >>>>>> Software Engineer
> > > > >>>>>>
> > > > >>>>>>    <http://www.mapr.com/>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > > >>>>>> <
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>>
> > > > >>> --
> > > > >>>
> > > > >>> Abdelhakim Deneche
> > > > >>>
> > > > >>> Software Engineer
> > > > >>>
> > > > >>>    <http://www.mapr.com/>
> > > > >>>
> > > > >>>
> > > > >>> Now Available - Free Hadoop On-Demand Training
> > > > >>> <
> > > > >>>
> > > > >>>
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > > > --
> > > > > Daniel Barclay
> > > > > MapR Technologies
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
buffer(int) is mostly called from value vectors, but there are an
additional 10+ places where it's being used and most of them don't bother
checking if the returned buffer is null.

On Tue, Jun 30, 2015 at 3:39 PM, Jacques Nadeau <ja...@apache.org> wrote:

> What is your sense of the use of this interface? I would hope new uses
> would be scarce. Do you think this will be a frequent occurrence that most
> devs will experience?
> On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
> wrote:
>
> > I started this discussion while fixing DRILL-3312
> > <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite
> some
> > time to debug and find the root of an error I was seeing. I first thought
> > about fixing those few places that call buffer() without checking if the
> > buffer is null or not, but this won't prevent new code from repeating the
> > same mistakes.
> >
> >
> > On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org>
> > wrote:
> >
> > > Lots of different comments on this thread.  Here are few of my
> thoughts:
> > >
> > > I think you need to differentiate the different tiers of interfaces a
> > > little bit more.
> > >
> > > BufferAllocator.buffer() was meant to be a low level advanced
> interface.
> > > It should be used rarely and when done so, only with extreme care.
> Most
> > > devs shouldn't use it.  Having an advanced interface seems fine here.
> > This
> > > was how the interface was envisioned.   It looks like there are
> currently
> > > 16 classes that reference this code not including vectors.  (There
> > probably
> > > should be less, it looks like some code duplication contributes to
> this.)
> > >
> > > All the vector classes also access this code.  Vectors have two
> methods:
> > > allocateNew and allocateNewSafe.  allocateNew has an exception behavior
> > and
> > > allocateNewSafe has a return value behavior.  These are main APIs that
> > are
> > > used extensively throughout the code and the ones we should be most
> > focused
> > > on.  I like giving the developer the choice of behavior (thus the two
> > > options).  It sounds like others are saying to remove this choice
> > (because
> > > code is not managing it correctly).
> > >
> > > If there are bugs in the 16 classes on how they use the advanced
> > interface,
> > > this doesn't seem like a big challenge to correct rather than bulk
> > > modifying the API.  If people are using the wrong method on the vector
> > > classes (or the vector classes aren't correctly behaving), that doesn't
> > > seem like something we should address at the allocator API level.
> > >
> > > Hakim, can you be more specific about your concern?  It doesn't seem
> like
> > > we're talking about thousands of places here.
> > >
> > > Regarding checked versus unchecked...  I'm strongly in support of using
> > > checked exceptions for most things.  However, I'm against it for an out
> > of
> > > memory condition.  (Note how the JVM also treats this as an unchecked
> > > exception.) It causes early catching when, in most cases, we really
> want
> > to
> > > bubble way up the stack and fail the query.  I know of very few places
> > > where we can actually locally manage an out of memory condition so why
> > > write huge amounts of code the bubble the exception all the way to
> where
> > we
> > > can actually handle it.
> > >
> > > A quick sidebar to Jason's "Hash agg out of memory" comment
> > > Two thoughts here.  I'm not sure this is a meaningful message.  Where
> we
> > > run out of memory may have nothing to do with where we are using most
> > > memory.  And if we did to decide this was useful, this can be done just
> > as
> > > easily with unchecked exceptions as checked exception.  What I'm want
> to
> > > avoid is adding throws OutOfMemoryException in 1000 places.  The
> reality
> > is
> > > everybody should assume that an OutOfMemory can occur pretty much
> > anytime.
> > >
> > > TL;DR
> > >
> > > In general, I think this discussion is too abstract and thus has the
> > danger
> > > of trending towards a religious discussion (I already see a bit of that
> > > fervor in some of the responses.).  The number of places we are talking
> > > about are not that large and are easy to find.  Let's solve this by
> > looking
> > > at specifics.  For example, if people think checked would be better,
> > > someone should go through and make an example changeset for everyone to
> > > review.  My guess is, the only that it works will be if very quickly,
> the
> > > handling code converts the checked exception into an unchecked
> > > UserException.  But I'm more than happy to be proven wrong.
> > >
> > > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <dbarclay@maprtech.com
> >
> > > wrote:
> > >
> > > > Hanifi GUNES wrote:
> > > >
> > > >> - We would end up having to add it to almost everything everywhere
> > > >> Why would one propagate the checked exception for no reason? And why
> > > would
> > > >> one not propagate the exception for a good reason like robustness? I
> > > agree
> > > >> that one has to avoid propagating the checked exception for no
> reason
> > > >> however I support propagating it for a good reason.
> > > >>
> > > >> The added benefit of raising a checked exception is reminding as
> well
> > as
> > > >> enforcing devs to handle it and be more cautious about this
> particular
> > > >> event. I find this compile-time safety check invaluable for
> > robustness.
> > > >>
> > > >
> > > > +(1 times <some large number>)
> > > >
> > > > Daniel
> > > >
> > > >
> > > >
> > > >>
> > > >> - Or constantly wrapping it with RuntimeException to get around that
> > > >> If it has to be done, I would recommend relying on a helper to do
> so.
> > > >> There
> > > >> is not much of man-work involved here.
> > > >>
> > > >>
> > > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <
> adeneche@maprtech.com
> > >:
> > > >>
> > > >>  +1 to Hanifi's
> > > >>>
> > > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > > >>> altekrusejason@gmail.com
> > > >>>
> > > >>>>
> > > >>>>  wrote:
> > > >>>
> > > >>>  +1 to Hanifi's comments, I think it makes much more sense to have
> a
> > > >>>>
> > > >>> number
> > > >>>
> > > >>>> of sites where the operators are explicitly catching a checked OOM
> > > >>>> exception and either decide to handle it or produce a message like
> > > "Hash
> > > >>>> Aggregate does not support our of memory conditions". This would
> be
> > > >>>> particularly useful for debugging queries, as the user exception
> can
> > > >>>> provide context information about the current operation. This way
> > > users
> > > >>>>
> > > >>> can
> > > >>>
> > > >>>> have some idea about the part of their query that might be causing
> > an
> > > >>>> excessive strain on system resources. I understand that there are
> > also
> > > >>>> cases where operators competing for memory can make it a toss up
> to
> > > >>>> which
> > > >>>> will actually fail, but this would at least be a step to give more
> > > >>>>
> > > >>> detailed
> > > >>>
> > > >>>> information to users.
> > > >>>>
> > > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org>
> > wrote:
> > > >>>>
> > > >>>>  I would propose throwing a checked exception encouraging explicit
> > and
> > > >>>>> consistent handling of this event. Each sub-system has liberty to
> > > >>>>>
> > > >>>> decide
> > > >>>
> > > >>>> if
> > > >>>>
> > > >>>>> an OOM failure is fatal or non-fatal depending on its
> capabilities.
> > > >>>>>
> > > >>>> Also
> > > >>>
> > > >>>> if
> > > >>>>
> > > >>>>> at some point a sub-system needs to communicate with its callers
> > via
> > > a
> > > >>>>> different mechanism such like using flags (boolean, enum etc) or
> > > >>>>>
> > > >>>> raising
> > > >>>
> > > >>>> an
> > > >>>>
> > > >>>>> unchecked exception that's still fine, just handle the exception.
> > If
> > > >>>>>
> > > >>>> there
> > > >>>>
> > > >>>>> is a need to suppress the checked exception that's fine too, just
> > > use a
> > > >>>>> helper method.
> > > >>>>>
> > > >>>>> Either way, returning *null* sounds problematic in many ways i)
> it
> > is
> > > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it
> is
> > > >>>>> semantically unclean to make null mean something - even worse
> > > something
> > > >>>>> context specific.
> > > >>>>>
> > > >>>>>
> > > >>>>> -Hanifi
> > > >>>>>
> > > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > > adeneche@maprtech.com
> > > >>>>>
> > > >>>> :
> > > >>>>
> > > >>>>>
> > > >>>>>  I guess that would fix the issue too. But we may still run into
> > > >>>>>>
> > > >>>>> situations
> > > >>>>>
> > > >>>>>> where the caller will pass a flag to "mute" the exception and
> not
> > > >>>>>>
> > > >>>>> handle
> > > >>>>
> > > >>>>> the case anyway.
> > > >>>>>>
> > > >>>>>> If .buffer() unconditionally throws an exception, can't the
> > caller,
> > > >>>>>>
> > > >>>>> who
> > > >>>
> > > >>>> wants to, just catch that and handle it properly ?
> > > >>>>>>
> > > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > > >>>>>>
> > > >>>>> chriswestin42@gmail.com>
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>>
> > > >>>>>>  No, but we should do something close to that.
> > > >>>>>>>
> > > >>>>>>> There are cases where the caller can handle the inability to
> get
> > > >>>>>>>
> > > >>>>>> more
> > > >>>
> > > >>>> memory, and may be able to go to disk. However, you are correct
> > > >>>>>>>
> > > >>>>>> that
> > > >>>
> > > >>>> there
> > > >>>>>>
> > > >>>>>>> are many that can't handle an OOM, and that fail to check.
> > > >>>>>>>
> > > >>>>>>> Instead of unconditionally throwing
> OutOfMemoryRuntimeException,
> > I
> > > >>>>>>>
> > > >>>>>> would
> > > >>>>>
> > > >>>>>> suggest that the buffer() call take a flag that indicates
> whether
> > > >>>>>>>
> > > >>>>>> or
> > > >>>
> > > >>>> not
> > > >>>>>
> > > >>>>>> it
> > > >>>>>>
> > > >>>>>>> should throw if it is unable to fulfill the request. This way,
> > the
> > > >>>>>>>
> > > >>>>>> call
> > > >>>>
> > > >>>>> sites that can handle an OOM can pass in the flag to return null,
> > > >>>>>>>
> > > >>>>>> and
> > > >>>
> > > >>>> the
> > > >>>>>
> > > >>>>>> rest can pass in the flag value to throw, and not have to have
> any
> > > >>>>>>>
> > > >>>>>> checking
> > > >>>>>>
> > > >>>>>>> code.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > >>>>>>> adeneche@maprtech.com
> > > >>>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>  our current implementations of BufferAllocator.buffer(int,
> int)
> > > >>>>>>>>
> > > >>>>>>> returns
> > > >>>>>
> > > >>>>>> null when it cannot allocate the buffer.
> > > >>>>>>>>
> > > >>>>>>>> But looking through the code, there are many places that don't
> > > >>>>>>>>
> > > >>>>>>> check
> > > >>>>
> > > >>>>> if
> > > >>>>>
> > > >>>>>> the
> > > >>>>>>>
> > > >>>>>>>> allocated buffer is null before trying to access it which will
> > > >>>>>>>>
> > > >>>>>>> throw
> > > >>>>
> > > >>>>> a
> > > >>>>>
> > > >>>>>> NullPointerException.
> > > >>>>>>>>
> > > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place that
> > > >>>>>>>>
> > > >>>>>>> handle
> > > >>>>
> > > >>>>> the
> > > >>>>>>
> > > >>>>>>> null in a specific manner.
> > > >>>>>>>>
> > > >>>>>>>> Should we update the allocators' implementation to throw an
> > > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this
> has
> > > >>>>>>>>
> > > >>>>>>> the
> > > >>>>
> > > >>>>> added
> > > >>>>>>>
> > > >>>>>>>> benefit of displaying a proper out of memory error message to
> > the
> > > >>>>>>>>
> > > >>>>>>> user.
> > > >>>>>
> > > >>>>>>
> > > >>>>>>>> Thanks!
> > > >>>>>>>>
> > > >>>>>>>> --
> > > >>>>>>>>
> > > >>>>>>>> Abdelhakim Deneche
> > > >>>>>>>>
> > > >>>>>>>> Software Engineer
> > > >>>>>>>>
> > > >>>>>>>>    <http://www.mapr.com/>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > > >>>>>>>> <
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >>>
> > > >>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>>
> > > >>>>>> Abdelhakim Deneche
> > > >>>>>>
> > > >>>>>> Software Engineer
> > > >>>>>>
> > > >>>>>>    <http://www.mapr.com/>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Now Available - Free Hadoop On-Demand Training
> > > >>>>>> <
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>>
> > > >>> Abdelhakim Deneche
> > > >>>
> > > >>> Software Engineer
> > > >>>
> > > >>>    <http://www.mapr.com/>
> > > >>>
> > > >>>
> > > >>> Now Available - Free Hadoop On-Demand Training
> > > >>> <
> > > >>>
> > > >>>
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >
> > > > --
> > > > Daniel Barclay
> > > > MapR Technologies
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Jacques Nadeau <ja...@apache.org>.
What is your sense of the use of this interface? I would hope new uses
would be scarce. Do you think this will be a frequent occurrence that most
devs will experience?
On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <ad...@maprtech.com>
wrote:

> I started this discussion while fixing DRILL-3312
> <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite some
> time to debug and find the root of an error I was seeing. I first thought
> about fixing those few places that call buffer() without checking if the
> buffer is null or not, but this won't prevent new code from repeating the
> same mistakes.
>
>
> On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org>
> wrote:
>
> > Lots of different comments on this thread.  Here are few of my thoughts:
> >
> > I think you need to differentiate the different tiers of interfaces a
> > little bit more.
> >
> > BufferAllocator.buffer() was meant to be a low level advanced interface.
> > It should be used rarely and when done so, only with extreme care.  Most
> > devs shouldn't use it.  Having an advanced interface seems fine here.
> This
> > was how the interface was envisioned.   It looks like there are currently
> > 16 classes that reference this code not including vectors.  (There
> probably
> > should be less, it looks like some code duplication contributes to this.)
> >
> > All the vector classes also access this code.  Vectors have two methods:
> > allocateNew and allocateNewSafe.  allocateNew has an exception behavior
> and
> > allocateNewSafe has a return value behavior.  These are main APIs that
> are
> > used extensively throughout the code and the ones we should be most
> focused
> > on.  I like giving the developer the choice of behavior (thus the two
> > options).  It sounds like others are saying to remove this choice
> (because
> > code is not managing it correctly).
> >
> > If there are bugs in the 16 classes on how they use the advanced
> interface,
> > this doesn't seem like a big challenge to correct rather than bulk
> > modifying the API.  If people are using the wrong method on the vector
> > classes (or the vector classes aren't correctly behaving), that doesn't
> > seem like something we should address at the allocator API level.
> >
> > Hakim, can you be more specific about your concern?  It doesn't seem like
> > we're talking about thousands of places here.
> >
> > Regarding checked versus unchecked...  I'm strongly in support of using
> > checked exceptions for most things.  However, I'm against it for an out
> of
> > memory condition.  (Note how the JVM also treats this as an unchecked
> > exception.) It causes early catching when, in most cases, we really want
> to
> > bubble way up the stack and fail the query.  I know of very few places
> > where we can actually locally manage an out of memory condition so why
> > write huge amounts of code the bubble the exception all the way to where
> we
> > can actually handle it.
> >
> > A quick sidebar to Jason's "Hash agg out of memory" comment
> > Two thoughts here.  I'm not sure this is a meaningful message.  Where we
> > run out of memory may have nothing to do with where we are using most
> > memory.  And if we did to decide this was useful, this can be done just
> as
> > easily with unchecked exceptions as checked exception.  What I'm want to
> > avoid is adding throws OutOfMemoryException in 1000 places.  The reality
> is
> > everybody should assume that an OutOfMemory can occur pretty much
> anytime.
> >
> > TL;DR
> >
> > In general, I think this discussion is too abstract and thus has the
> danger
> > of trending towards a religious discussion (I already see a bit of that
> > fervor in some of the responses.).  The number of places we are talking
> > about are not that large and are easy to find.  Let's solve this by
> looking
> > at specifics.  For example, if people think checked would be better,
> > someone should go through and make an example changeset for everyone to
> > review.  My guess is, the only that it works will be if very quickly, the
> > handling code converts the checked exception into an unchecked
> > UserException.  But I'm more than happy to be proven wrong.
> >
> > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <db...@maprtech.com>
> > wrote:
> >
> > > Hanifi GUNES wrote:
> > >
> > >> - We would end up having to add it to almost everything everywhere
> > >> Why would one propagate the checked exception for no reason? And why
> > would
> > >> one not propagate the exception for a good reason like robustness? I
> > agree
> > >> that one has to avoid propagating the checked exception for no reason
> > >> however I support propagating it for a good reason.
> > >>
> > >> The added benefit of raising a checked exception is reminding as well
> as
> > >> enforcing devs to handle it and be more cautious about this particular
> > >> event. I find this compile-time safety check invaluable for
> robustness.
> > >>
> > >
> > > +(1 times <some large number>)
> > >
> > > Daniel
> > >
> > >
> > >
> > >>
> > >> - Or constantly wrapping it with RuntimeException to get around that
> > >> If it has to be done, I would recommend relying on a helper to do so.
> > >> There
> > >> is not much of man-work involved here.
> > >>
> > >>
> > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
> >:
> > >>
> > >>  +1 to Hanifi's
> > >>>
> > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > >>> altekrusejason@gmail.com
> > >>>
> > >>>>
> > >>>>  wrote:
> > >>>
> > >>>  +1 to Hanifi's comments, I think it makes much more sense to have a
> > >>>>
> > >>> number
> > >>>
> > >>>> of sites where the operators are explicitly catching a checked OOM
> > >>>> exception and either decide to handle it or produce a message like
> > "Hash
> > >>>> Aggregate does not support our of memory conditions". This would be
> > >>>> particularly useful for debugging queries, as the user exception can
> > >>>> provide context information about the current operation. This way
> > users
> > >>>>
> > >>> can
> > >>>
> > >>>> have some idea about the part of their query that might be causing
> an
> > >>>> excessive strain on system resources. I understand that there are
> also
> > >>>> cases where operators competing for memory can make it a toss up to
> > >>>> which
> > >>>> will actually fail, but this would at least be a step to give more
> > >>>>
> > >>> detailed
> > >>>
> > >>>> information to users.
> > >>>>
> > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org>
> wrote:
> > >>>>
> > >>>>  I would propose throwing a checked exception encouraging explicit
> and
> > >>>>> consistent handling of this event. Each sub-system has liberty to
> > >>>>>
> > >>>> decide
> > >>>
> > >>>> if
> > >>>>
> > >>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
> > >>>>>
> > >>>> Also
> > >>>
> > >>>> if
> > >>>>
> > >>>>> at some point a sub-system needs to communicate with its callers
> via
> > a
> > >>>>> different mechanism such like using flags (boolean, enum etc) or
> > >>>>>
> > >>>> raising
> > >>>
> > >>>> an
> > >>>>
> > >>>>> unchecked exception that's still fine, just handle the exception.
> If
> > >>>>>
> > >>>> there
> > >>>>
> > >>>>> is a need to suppress the checked exception that's fine too, just
> > use a
> > >>>>> helper method.
> > >>>>>
> > >>>>> Either way, returning *null* sounds problematic in many ways i) it
> is
> > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> > >>>>> semantically unclean to make null mean something - even worse
> > something
> > >>>>> context specific.
> > >>>>>
> > >>>>>
> > >>>>> -Hanifi
> > >>>>>
> > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > adeneche@maprtech.com
> > >>>>>
> > >>>> :
> > >>>>
> > >>>>>
> > >>>>>  I guess that would fix the issue too. But we may still run into
> > >>>>>>
> > >>>>> situations
> > >>>>>
> > >>>>>> where the caller will pass a flag to "mute" the exception and not
> > >>>>>>
> > >>>>> handle
> > >>>>
> > >>>>> the case anyway.
> > >>>>>>
> > >>>>>> If .buffer() unconditionally throws an exception, can't the
> caller,
> > >>>>>>
> > >>>>> who
> > >>>
> > >>>> wants to, just catch that and handle it properly ?
> > >>>>>>
> > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > >>>>>>
> > >>>>> chriswestin42@gmail.com>
> > >>>>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>>  No, but we should do something close to that.
> > >>>>>>>
> > >>>>>>> There are cases where the caller can handle the inability to get
> > >>>>>>>
> > >>>>>> more
> > >>>
> > >>>> memory, and may be able to go to disk. However, you are correct
> > >>>>>>>
> > >>>>>> that
> > >>>
> > >>>> there
> > >>>>>>
> > >>>>>>> are many that can't handle an OOM, and that fail to check.
> > >>>>>>>
> > >>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException,
> I
> > >>>>>>>
> > >>>>>> would
> > >>>>>
> > >>>>>> suggest that the buffer() call take a flag that indicates whether
> > >>>>>>>
> > >>>>>> or
> > >>>
> > >>>> not
> > >>>>>
> > >>>>>> it
> > >>>>>>
> > >>>>>>> should throw if it is unable to fulfill the request. This way,
> the
> > >>>>>>>
> > >>>>>> call
> > >>>>
> > >>>>> sites that can handle an OOM can pass in the flag to return null,
> > >>>>>>>
> > >>>>>> and
> > >>>
> > >>>> the
> > >>>>>
> > >>>>>> rest can pass in the flag value to throw, and not have to have any
> > >>>>>>>
> > >>>>>> checking
> > >>>>>>
> > >>>>>>> code.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > >>>>>>> adeneche@maprtech.com
> > >>>>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>  our current implementations of BufferAllocator.buffer(int, int)
> > >>>>>>>>
> > >>>>>>> returns
> > >>>>>
> > >>>>>> null when it cannot allocate the buffer.
> > >>>>>>>>
> > >>>>>>>> But looking through the code, there are many places that don't
> > >>>>>>>>
> > >>>>>>> check
> > >>>>
> > >>>>> if
> > >>>>>
> > >>>>>> the
> > >>>>>>>
> > >>>>>>>> allocated buffer is null before trying to access it which will
> > >>>>>>>>
> > >>>>>>> throw
> > >>>>
> > >>>>> a
> > >>>>>
> > >>>>>> NullPointerException.
> > >>>>>>>>
> > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place that
> > >>>>>>>>
> > >>>>>>> handle
> > >>>>
> > >>>>> the
> > >>>>>>
> > >>>>>>> null in a specific manner.
> > >>>>>>>>
> > >>>>>>>> Should we update the allocators' implementation to throw an
> > >>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this has
> > >>>>>>>>
> > >>>>>>> the
> > >>>>
> > >>>>> added
> > >>>>>>>
> > >>>>>>>> benefit of displaying a proper out of memory error message to
> the
> > >>>>>>>>
> > >>>>>>> user.
> > >>>>>
> > >>>>>>
> > >>>>>>>> Thanks!
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>>
> > >>>>>>>> Abdelhakim Deneche
> > >>>>>>>>
> > >>>>>>>> Software Engineer
> > >>>>>>>>
> > >>>>>>>>    <http://www.mapr.com/>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > >>>>>>>> <
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>>
> > >>>>>> Abdelhakim Deneche
> > >>>>>>
> > >>>>>> Software Engineer
> > >>>>>>
> > >>>>>>    <http://www.mapr.com/>
> > >>>>>>
> > >>>>>>
> > >>>>>> Now Available - Free Hadoop On-Demand Training
> > >>>>>> <
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Abdelhakim Deneche
> > >>>
> > >>> Software Engineer
> > >>>
> > >>>    <http://www.mapr.com/>
> > >>>
> > >>>
> > >>> Now Available - Free Hadoop On-Demand Training
> > >>> <
> > >>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >
> > > --
> > > Daniel Barclay
> > > MapR Technologies
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
I started this discussion while fixing DRILL-3312
<https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite some
time to debug and find the root of an error I was seeing. I first thought
about fixing those few places that call buffer() without checking if the
buffer is null or not, but this won't prevent new code from repeating the
same mistakes.


On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <ja...@apache.org> wrote:

> Lots of different comments on this thread.  Here are few of my thoughts:
>
> I think you need to differentiate the different tiers of interfaces a
> little bit more.
>
> BufferAllocator.buffer() was meant to be a low level advanced interface.
> It should be used rarely and when done so, only with extreme care.  Most
> devs shouldn't use it.  Having an advanced interface seems fine here.  This
> was how the interface was envisioned.   It looks like there are currently
> 16 classes that reference this code not including vectors.  (There probably
> should be less, it looks like some code duplication contributes to this.)
>
> All the vector classes also access this code.  Vectors have two methods:
> allocateNew and allocateNewSafe.  allocateNew has an exception behavior and
> allocateNewSafe has a return value behavior.  These are main APIs that are
> used extensively throughout the code and the ones we should be most focused
> on.  I like giving the developer the choice of behavior (thus the two
> options).  It sounds like others are saying to remove this choice (because
> code is not managing it correctly).
>
> If there are bugs in the 16 classes on how they use the advanced interface,
> this doesn't seem like a big challenge to correct rather than bulk
> modifying the API.  If people are using the wrong method on the vector
> classes (or the vector classes aren't correctly behaving), that doesn't
> seem like something we should address at the allocator API level.
>
> Hakim, can you be more specific about your concern?  It doesn't seem like
> we're talking about thousands of places here.
>
> Regarding checked versus unchecked...  I'm strongly in support of using
> checked exceptions for most things.  However, I'm against it for an out of
> memory condition.  (Note how the JVM also treats this as an unchecked
> exception.) It causes early catching when, in most cases, we really want to
> bubble way up the stack and fail the query.  I know of very few places
> where we can actually locally manage an out of memory condition so why
> write huge amounts of code the bubble the exception all the way to where we
> can actually handle it.
>
> A quick sidebar to Jason's "Hash agg out of memory" comment
> Two thoughts here.  I'm not sure this is a meaningful message.  Where we
> run out of memory may have nothing to do with where we are using most
> memory.  And if we did to decide this was useful, this can be done just as
> easily with unchecked exceptions as checked exception.  What I'm want to
> avoid is adding throws OutOfMemoryException in 1000 places.  The reality is
> everybody should assume that an OutOfMemory can occur pretty much anytime.
>
> TL;DR
>
> In general, I think this discussion is too abstract and thus has the danger
> of trending towards a religious discussion (I already see a bit of that
> fervor in some of the responses.).  The number of places we are talking
> about are not that large and are easy to find.  Let's solve this by looking
> at specifics.  For example, if people think checked would be better,
> someone should go through and make an example changeset for everyone to
> review.  My guess is, the only that it works will be if very quickly, the
> handling code converts the checked exception into an unchecked
> UserException.  But I'm more than happy to be proven wrong.
>
> On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <db...@maprtech.com>
> wrote:
>
> > Hanifi GUNES wrote:
> >
> >> - We would end up having to add it to almost everything everywhere
> >> Why would one propagate the checked exception for no reason? And why
> would
> >> one not propagate the exception for a good reason like robustness? I
> agree
> >> that one has to avoid propagating the checked exception for no reason
> >> however I support propagating it for a good reason.
> >>
> >> The added benefit of raising a checked exception is reminding as well as
> >> enforcing devs to handle it and be more cautious about this particular
> >> event. I find this compile-time safety check invaluable for robustness.
> >>
> >
> > +(1 times <some large number>)
> >
> > Daniel
> >
> >
> >
> >>
> >> - Or constantly wrapping it with RuntimeException to get around that
> >> If it has to be done, I would recommend relying on a helper to do so.
> >> There
> >> is not much of man-work involved here.
> >>
> >>
> >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
> >>
> >>  +1 to Hanifi's
> >>>
> >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> >>> altekrusejason@gmail.com
> >>>
> >>>>
> >>>>  wrote:
> >>>
> >>>  +1 to Hanifi's comments, I think it makes much more sense to have a
> >>>>
> >>> number
> >>>
> >>>> of sites where the operators are explicitly catching a checked OOM
> >>>> exception and either decide to handle it or produce a message like
> "Hash
> >>>> Aggregate does not support our of memory conditions". This would be
> >>>> particularly useful for debugging queries, as the user exception can
> >>>> provide context information about the current operation. This way
> users
> >>>>
> >>> can
> >>>
> >>>> have some idea about the part of their query that might be causing an
> >>>> excessive strain on system resources. I understand that there are also
> >>>> cases where operators competing for memory can make it a toss up to
> >>>> which
> >>>> will actually fail, but this would at least be a step to give more
> >>>>
> >>> detailed
> >>>
> >>>> information to users.
> >>>>
> >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:
> >>>>
> >>>>  I would propose throwing a checked exception encouraging explicit and
> >>>>> consistent handling of this event. Each sub-system has liberty to
> >>>>>
> >>>> decide
> >>>
> >>>> if
> >>>>
> >>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
> >>>>>
> >>>> Also
> >>>
> >>>> if
> >>>>
> >>>>> at some point a sub-system needs to communicate with its callers via
> a
> >>>>> different mechanism such like using flags (boolean, enum etc) or
> >>>>>
> >>>> raising
> >>>
> >>>> an
> >>>>
> >>>>> unchecked exception that's still fine, just handle the exception. If
> >>>>>
> >>>> there
> >>>>
> >>>>> is a need to suppress the checked exception that's fine too, just
> use a
> >>>>> helper method.
> >>>>>
> >>>>> Either way, returning *null* sounds problematic in many ways i) it is
> >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> >>>>> semantically unclean to make null mean something - even worse
> something
> >>>>> context specific.
> >>>>>
> >>>>>
> >>>>> -Hanifi
> >>>>>
> >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> adeneche@maprtech.com
> >>>>>
> >>>> :
> >>>>
> >>>>>
> >>>>>  I guess that would fix the issue too. But we may still run into
> >>>>>>
> >>>>> situations
> >>>>>
> >>>>>> where the caller will pass a flag to "mute" the exception and not
> >>>>>>
> >>>>> handle
> >>>>
> >>>>> the case anyway.
> >>>>>>
> >>>>>> If .buffer() unconditionally throws an exception, can't the caller,
> >>>>>>
> >>>>> who
> >>>
> >>>> wants to, just catch that and handle it properly ?
> >>>>>>
> >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> >>>>>>
> >>>>> chriswestin42@gmail.com>
> >>>>
> >>>>> wrote:
> >>>>>>
> >>>>>>  No, but we should do something close to that.
> >>>>>>>
> >>>>>>> There are cases where the caller can handle the inability to get
> >>>>>>>
> >>>>>> more
> >>>
> >>>> memory, and may be able to go to disk. However, you are correct
> >>>>>>>
> >>>>>> that
> >>>
> >>>> there
> >>>>>>
> >>>>>>> are many that can't handle an OOM, and that fail to check.
> >>>>>>>
> >>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException, I
> >>>>>>>
> >>>>>> would
> >>>>>
> >>>>>> suggest that the buffer() call take a flag that indicates whether
> >>>>>>>
> >>>>>> or
> >>>
> >>>> not
> >>>>>
> >>>>>> it
> >>>>>>
> >>>>>>> should throw if it is unable to fulfill the request. This way, the
> >>>>>>>
> >>>>>> call
> >>>>
> >>>>> sites that can handle an OOM can pass in the flag to return null,
> >>>>>>>
> >>>>>> and
> >>>
> >>>> the
> >>>>>
> >>>>>> rest can pass in the flag value to throw, and not have to have any
> >>>>>>>
> >>>>>> checking
> >>>>>>
> >>>>>>> code.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> >>>>>>> adeneche@maprtech.com
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>
> >>>>>>>  our current implementations of BufferAllocator.buffer(int, int)
> >>>>>>>>
> >>>>>>> returns
> >>>>>
> >>>>>> null when it cannot allocate the buffer.
> >>>>>>>>
> >>>>>>>> But looking through the code, there are many places that don't
> >>>>>>>>
> >>>>>>> check
> >>>>
> >>>>> if
> >>>>>
> >>>>>> the
> >>>>>>>
> >>>>>>>> allocated buffer is null before trying to access it which will
> >>>>>>>>
> >>>>>>> throw
> >>>>
> >>>>> a
> >>>>>
> >>>>>> NullPointerException.
> >>>>>>>>
> >>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place that
> >>>>>>>>
> >>>>>>> handle
> >>>>
> >>>>> the
> >>>>>>
> >>>>>>> null in a specific manner.
> >>>>>>>>
> >>>>>>>> Should we update the allocators' implementation to throw an
> >>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this has
> >>>>>>>>
> >>>>>>> the
> >>>>
> >>>>> added
> >>>>>>>
> >>>>>>>> benefit of displaying a proper out of memory error message to the
> >>>>>>>>
> >>>>>>> user.
> >>>>>
> >>>>>>
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> Abdelhakim Deneche
> >>>>>>>>
> >>>>>>>> Software Engineer
> >>>>>>>>
> >>>>>>>>    <http://www.mapr.com/>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Now Available - Free Hadoop On-Demand Training
> >>>>>>>> <
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >>>
> >>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>>
> >>>>>> Abdelhakim Deneche
> >>>>>>
> >>>>>> Software Engineer
> >>>>>>
> >>>>>>    <http://www.mapr.com/>
> >>>>>>
> >>>>>>
> >>>>>> Now Available - Free Hadoop On-Demand Training
> >>>>>> <
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >>>
> >>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>> Abdelhakim Deneche
> >>>
> >>> Software Engineer
> >>>
> >>>    <http://www.mapr.com/>
> >>>
> >>>
> >>> Now Available - Free Hadoop On-Demand Training
> >>> <
> >>>
> >>>
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >>>
> >>>>
> >>>>
> >>>
> >>
> >
> > --
> > Daniel Barclay
> > MapR Technologies
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Jacques Nadeau <ja...@apache.org>.
Lots of different comments on this thread.  Here are few of my thoughts:

I think you need to differentiate the different tiers of interfaces a
little bit more.

BufferAllocator.buffer() was meant to be a low level advanced interface.
It should be used rarely and when done so, only with extreme care.  Most
devs shouldn't use it.  Having an advanced interface seems fine here.  This
was how the interface was envisioned.   It looks like there are currently
16 classes that reference this code not including vectors.  (There probably
should be less, it looks like some code duplication contributes to this.)

All the vector classes also access this code.  Vectors have two methods:
allocateNew and allocateNewSafe.  allocateNew has an exception behavior and
allocateNewSafe has a return value behavior.  These are main APIs that are
used extensively throughout the code and the ones we should be most focused
on.  I like giving the developer the choice of behavior (thus the two
options).  It sounds like others are saying to remove this choice (because
code is not managing it correctly).

If there are bugs in the 16 classes on how they use the advanced interface,
this doesn't seem like a big challenge to correct rather than bulk
modifying the API.  If people are using the wrong method on the vector
classes (or the vector classes aren't correctly behaving), that doesn't
seem like something we should address at the allocator API level.

Hakim, can you be more specific about your concern?  It doesn't seem like
we're talking about thousands of places here.

Regarding checked versus unchecked...  I'm strongly in support of using
checked exceptions for most things.  However, I'm against it for an out of
memory condition.  (Note how the JVM also treats this as an unchecked
exception.) It causes early catching when, in most cases, we really want to
bubble way up the stack and fail the query.  I know of very few places
where we can actually locally manage an out of memory condition so why
write huge amounts of code the bubble the exception all the way to where we
can actually handle it.

A quick sidebar to Jason's "Hash agg out of memory" comment
Two thoughts here.  I'm not sure this is a meaningful message.  Where we
run out of memory may have nothing to do with where we are using most
memory.  And if we did to decide this was useful, this can be done just as
easily with unchecked exceptions as checked exception.  What I'm want to
avoid is adding throws OutOfMemoryException in 1000 places.  The reality is
everybody should assume that an OutOfMemory can occur pretty much anytime.

TL;DR

In general, I think this discussion is too abstract and thus has the danger
of trending towards a religious discussion (I already see a bit of that
fervor in some of the responses.).  The number of places we are talking
about are not that large and are easy to find.  Let's solve this by looking
at specifics.  For example, if people think checked would be better,
someone should go through and make an example changeset for everyone to
review.  My guess is, the only that it works will be if very quickly, the
handling code converts the checked exception into an unchecked
UserException.  But I'm more than happy to be proven wrong.

On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <db...@maprtech.com>
wrote:

> Hanifi GUNES wrote:
>
>> - We would end up having to add it to almost everything everywhere
>> Why would one propagate the checked exception for no reason? And why would
>> one not propagate the exception for a good reason like robustness? I agree
>> that one has to avoid propagating the checked exception for no reason
>> however I support propagating it for a good reason.
>>
>> The added benefit of raising a checked exception is reminding as well as
>> enforcing devs to handle it and be more cautious about this particular
>> event. I find this compile-time safety check invaluable for robustness.
>>
>
> +(1 times <some large number>)
>
> Daniel
>
>
>
>>
>> - Or constantly wrapping it with RuntimeException to get around that
>> If it has to be done, I would recommend relying on a helper to do so.
>> There
>> is not much of man-work involved here.
>>
>>
>> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
>>
>>  +1 to Hanifi's
>>>
>>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
>>> altekrusejason@gmail.com
>>>
>>>>
>>>>  wrote:
>>>
>>>  +1 to Hanifi's comments, I think it makes much more sense to have a
>>>>
>>> number
>>>
>>>> of sites where the operators are explicitly catching a checked OOM
>>>> exception and either decide to handle it or produce a message like "Hash
>>>> Aggregate does not support our of memory conditions". This would be
>>>> particularly useful for debugging queries, as the user exception can
>>>> provide context information about the current operation. This way users
>>>>
>>> can
>>>
>>>> have some idea about the part of their query that might be causing an
>>>> excessive strain on system resources. I understand that there are also
>>>> cases where operators competing for memory can make it a toss up to
>>>> which
>>>> will actually fail, but this would at least be a step to give more
>>>>
>>> detailed
>>>
>>>> information to users.
>>>>
>>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:
>>>>
>>>>  I would propose throwing a checked exception encouraging explicit and
>>>>> consistent handling of this event. Each sub-system has liberty to
>>>>>
>>>> decide
>>>
>>>> if
>>>>
>>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
>>>>>
>>>> Also
>>>
>>>> if
>>>>
>>>>> at some point a sub-system needs to communicate with its callers via a
>>>>> different mechanism such like using flags (boolean, enum etc) or
>>>>>
>>>> raising
>>>
>>>> an
>>>>
>>>>> unchecked exception that's still fine, just handle the exception. If
>>>>>
>>>> there
>>>>
>>>>> is a need to suppress the checked exception that's fine too, just use a
>>>>> helper method.
>>>>>
>>>>> Either way, returning *null* sounds problematic in many ways i) it is
>>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
>>>>> semantically unclean to make null mean something - even worse something
>>>>> context specific.
>>>>>
>>>>>
>>>>> -Hanifi
>>>>>
>>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
>>>>>
>>>> :
>>>>
>>>>>
>>>>>  I guess that would fix the issue too. But we may still run into
>>>>>>
>>>>> situations
>>>>>
>>>>>> where the caller will pass a flag to "mute" the exception and not
>>>>>>
>>>>> handle
>>>>
>>>>> the case anyway.
>>>>>>
>>>>>> If .buffer() unconditionally throws an exception, can't the caller,
>>>>>>
>>>>> who
>>>
>>>> wants to, just catch that and handle it properly ?
>>>>>>
>>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
>>>>>>
>>>>> chriswestin42@gmail.com>
>>>>
>>>>> wrote:
>>>>>>
>>>>>>  No, but we should do something close to that.
>>>>>>>
>>>>>>> There are cases where the caller can handle the inability to get
>>>>>>>
>>>>>> more
>>>
>>>> memory, and may be able to go to disk. However, you are correct
>>>>>>>
>>>>>> that
>>>
>>>> there
>>>>>>
>>>>>>> are many that can't handle an OOM, and that fail to check.
>>>>>>>
>>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException, I
>>>>>>>
>>>>>> would
>>>>>
>>>>>> suggest that the buffer() call take a flag that indicates whether
>>>>>>>
>>>>>> or
>>>
>>>> not
>>>>>
>>>>>> it
>>>>>>
>>>>>>> should throw if it is unable to fulfill the request. This way, the
>>>>>>>
>>>>>> call
>>>>
>>>>> sites that can handle an OOM can pass in the flag to return null,
>>>>>>>
>>>>>> and
>>>
>>>> the
>>>>>
>>>>>> rest can pass in the flag value to throw, and not have to have any
>>>>>>>
>>>>>> checking
>>>>>>
>>>>>>> code.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
>>>>>>> adeneche@maprtech.com
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>
>>>>>>>  our current implementations of BufferAllocator.buffer(int, int)
>>>>>>>>
>>>>>>> returns
>>>>>
>>>>>> null when it cannot allocate the buffer.
>>>>>>>>
>>>>>>>> But looking through the code, there are many places that don't
>>>>>>>>
>>>>>>> check
>>>>
>>>>> if
>>>>>
>>>>>> the
>>>>>>>
>>>>>>>> allocated buffer is null before trying to access it which will
>>>>>>>>
>>>>>>> throw
>>>>
>>>>> a
>>>>>
>>>>>> NullPointerException.
>>>>>>>>
>>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place that
>>>>>>>>
>>>>>>> handle
>>>>
>>>>> the
>>>>>>
>>>>>>> null in a specific manner.
>>>>>>>>
>>>>>>>> Should we update the allocators' implementation to throw an
>>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this has
>>>>>>>>
>>>>>>> the
>>>>
>>>>> added
>>>>>>>
>>>>>>>> benefit of displaying a proper out of memory error message to the
>>>>>>>>
>>>>>>> user.
>>>>>
>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Abdelhakim Deneche
>>>>>>>>
>>>>>>>> Software Engineer
>>>>>>>>
>>>>>>>>    <http://www.mapr.com/>
>>>>>>>>
>>>>>>>>
>>>>>>>> Now Available - Free Hadoop On-Demand Training
>>>>>>>> <
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Abdelhakim Deneche
>>>>>>
>>>>>> Software Engineer
>>>>>>
>>>>>>    <http://www.mapr.com/>
>>>>>>
>>>>>>
>>>>>> Now Available - Free Hadoop On-Demand Training
>>>>>> <
>>>>>>
>>>>>>
>>>>>
>>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Abdelhakim Deneche
>>>
>>> Software Engineer
>>>
>>>    <http://www.mapr.com/>
>>>
>>>
>>> Now Available - Free Hadoop On-Demand Training
>>> <
>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>
>>>
>>
>
> --
> Daniel Barclay
> MapR Technologies
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Daniel Barclay <db...@maprtech.com>.
Hanifi GUNES wrote:
> - We would end up having to add it to almost everything everywhere
> Why would one propagate the checked exception for no reason? And why would
> one not propagate the exception for a good reason like robustness? I agree
> that one has to avoid propagating the checked exception for no reason
> however I support propagating it for a good reason.
>
> The added benefit of raising a checked exception is reminding as well as
> enforcing devs to handle it and be more cautious about this particular
> event. I find this compile-time safety check invaluable for robustness.

+(1 times <some large number>)

Daniel

>
>
> - Or constantly wrapping it with RuntimeException to get around that
> If it has to be done, I would recommend relying on a helper to do so. There
> is not much of man-work involved here.
>
>
> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
>
>> +1 to Hanifi's
>>
>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <altekrusejason@gmail.com
>>>
>> wrote:
>>
>>> +1 to Hanifi's comments, I think it makes much more sense to have a
>> number
>>> of sites where the operators are explicitly catching a checked OOM
>>> exception and either decide to handle it or produce a message like "Hash
>>> Aggregate does not support our of memory conditions". This would be
>>> particularly useful for debugging queries, as the user exception can
>>> provide context information about the current operation. This way users
>> can
>>> have some idea about the part of their query that might be causing an
>>> excessive strain on system resources. I understand that there are also
>>> cases where operators competing for memory can make it a toss up to which
>>> will actually fail, but this would at least be a step to give more
>> detailed
>>> information to users.
>>>
>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:
>>>
>>>> I would propose throwing a checked exception encouraging explicit and
>>>> consistent handling of this event. Each sub-system has liberty to
>> decide
>>> if
>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
>> Also
>>> if
>>>> at some point a sub-system needs to communicate with its callers via a
>>>> different mechanism such like using flags (boolean, enum etc) or
>> raising
>>> an
>>>> unchecked exception that's still fine, just handle the exception. If
>>> there
>>>> is a need to suppress the checked exception that's fine too, just use a
>>>> helper method.
>>>>
>>>> Either way, returning *null* sounds problematic in many ways i) it is
>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
>>>> semantically unclean to make null mean something - even worse something
>>>> context specific.
>>>>
>>>>
>>>> -Hanifi
>>>>
>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
>>> :
>>>>
>>>>> I guess that would fix the issue too. But we may still run into
>>>> situations
>>>>> where the caller will pass a flag to "mute" the exception and not
>>> handle
>>>>> the case anyway.
>>>>>
>>>>> If .buffer() unconditionally throws an exception, can't the caller,
>> who
>>>>> wants to, just catch that and handle it properly ?
>>>>>
>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
>>> chriswestin42@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> No, but we should do something close to that.
>>>>>>
>>>>>> There are cases where the caller can handle the inability to get
>> more
>>>>>> memory, and may be able to go to disk. However, you are correct
>> that
>>>>> there
>>>>>> are many that can't handle an OOM, and that fail to check.
>>>>>>
>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException, I
>>>> would
>>>>>> suggest that the buffer() call take a flag that indicates whether
>> or
>>>> not
>>>>> it
>>>>>> should throw if it is unable to fulfill the request. This way, the
>>> call
>>>>>> sites that can handle an OOM can pass in the flag to return null,
>> and
>>>> the
>>>>>> rest can pass in the flag value to throw, and not have to have any
>>>>> checking
>>>>>> code.
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
>>>>>> adeneche@maprtech.com
>>>>>>> wrote:
>>>>>>
>>>>>>> our current implementations of BufferAllocator.buffer(int, int)
>>>> returns
>>>>>>> null when it cannot allocate the buffer.
>>>>>>>
>>>>>>> But looking through the code, there are many places that don't
>>> check
>>>> if
>>>>>> the
>>>>>>> allocated buffer is null before trying to access it which will
>>> throw
>>>> a
>>>>>>> NullPointerException.
>>>>>>>
>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place that
>>> handle
>>>>> the
>>>>>>> null in a specific manner.
>>>>>>>
>>>>>>> Should we update the allocators' implementation to throw an
>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this has
>>> the
>>>>>> added
>>>>>>> benefit of displaying a proper out of memory error message to the
>>>> user.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Abdelhakim Deneche
>>>>>>>
>>>>>>> Software Engineer
>>>>>>>
>>>>>>>    <http://www.mapr.com/>
>>>>>>>
>>>>>>>
>>>>>>> Now Available - Free Hadoop On-Demand Training
>>>>>>> <
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Abdelhakim Deneche
>>>>>
>>>>> Software Engineer
>>>>>
>>>>>    <http://www.mapr.com/>
>>>>>
>>>>>
>>>>> Now Available - Free Hadoop On-Demand Training
>>>>> <
>>>>>
>>>>
>>>
>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>>
>> Abdelhakim Deneche
>>
>> Software Engineer
>>
>>    <http://www.mapr.com/>
>>
>>
>> Now Available - Free Hadoop On-Demand Training
>> <
>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>
>


-- 
Daniel Barclay
MapR Technologies

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Hanifi GUNES <ha...@gmail.com>.
- We would end up having to add it to almost everything everywhere
Why would one propagate the checked exception for no reason? And why would
one not propagate the exception for a good reason like robustness? I agree
that one has to avoid propagating the checked exception for no reason
however I support propagating it for a good reason.

The added benefit of raising a checked exception is reminding as well as
enforcing devs to handle it and be more cautious about this particular
event. I find this compile-time safety check invaluable for robustness.


- Or constantly wrapping it with RuntimeException to get around that
If it has to be done, I would recommend relying on a helper to do so. There
is not much of man-work involved here.


2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:

> +1 to Hanifi's
>
> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <altekrusejason@gmail.com
> >
> wrote:
>
> > +1 to Hanifi's comments, I think it makes much more sense to have a
> number
> > of sites where the operators are explicitly catching a checked OOM
> > exception and either decide to handle it or produce a message like "Hash
> > Aggregate does not support our of memory conditions". This would be
> > particularly useful for debugging queries, as the user exception can
> > provide context information about the current operation. This way users
> can
> > have some idea about the part of their query that might be causing an
> > excessive strain on system resources. I understand that there are also
> > cases where operators competing for memory can make it a toss up to which
> > will actually fail, but this would at least be a step to give more
> detailed
> > information to users.
> >
> > On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:
> >
> > > I would propose throwing a checked exception encouraging explicit and
> > > consistent handling of this event. Each sub-system has liberty to
> decide
> > if
> > > an OOM failure is fatal or non-fatal depending on its capabilities.
> Also
> > if
> > > at some point a sub-system needs to communicate with its callers via a
> > > different mechanism such like using flags (boolean, enum etc) or
> raising
> > an
> > > unchecked exception that's still fine, just handle the exception. If
> > there
> > > is a need to suppress the checked exception that's fine too, just use a
> > > helper method.
> > >
> > > Either way, returning *null* sounds problematic in many ways i) it is
> > > implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> > > semantically unclean to make null mean something - even worse something
> > > context specific.
> > >
> > >
> > > -Hanifi
> > >
> > > 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
> >:
> > >
> > > > I guess that would fix the issue too. But we may still run into
> > > situations
> > > > where the caller will pass a flag to "mute" the exception and not
> > handle
> > > > the case anyway.
> > > >
> > > > If .buffer() unconditionally throws an exception, can't the caller,
> who
> > > > wants to, just catch that and handle it properly ?
> > > >
> > > > On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > chriswestin42@gmail.com>
> > > > wrote:
> > > >
> > > > > No, but we should do something close to that.
> > > > >
> > > > > There are cases where the caller can handle the inability to get
> more
> > > > > memory, and may be able to go to disk. However, you are correct
> that
> > > > there
> > > > > are many that can't handle an OOM, and that fail to check.
> > > > >
> > > > > Instead of unconditionally throwing OutOfMemoryRuntimeException, I
> > > would
> > > > > suggest that the buffer() call take a flag that indicates whether
> or
> > > not
> > > > it
> > > > > should throw if it is unable to fulfill the request. This way, the
> > call
> > > > > sites that can handle an OOM can pass in the flag to return null,
> and
> > > the
> > > > > rest can pass in the flag value to throw, and not have to have any
> > > > checking
> > > > > code.
> > > > >
> > > > >
> > > > > On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > > adeneche@maprtech.com
> > > > > > wrote:
> > > > >
> > > > > > our current implementations of BufferAllocator.buffer(int, int)
> > > returns
> > > > > > null when it cannot allocate the buffer.
> > > > > >
> > > > > > But looking through the code, there are many places that don't
> > check
> > > if
> > > > > the
> > > > > > allocated buffer is null before trying to access it which will
> > throw
> > > a
> > > > > > NullPointerException.
> > > > > >
> > > > > > ValueVectors' allocateNewSafe() seem to be the only place that
> > handle
> > > > the
> > > > > > null in a specific manner.
> > > > > >
> > > > > > Should we update the allocators' implementation to throw an
> > > > > > OutOfMemoryRuntimeException instead of returning null ? this has
> > the
> > > > > added
> > > > > > benefit of displaying a proper out of memory error message to the
> > > user.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Abdelhakim Deneche
> > > > > >
> > > > > > Software Engineer
> > > > > >
> > > > > >   <http://www.mapr.com/>
> > > > > >
> > > > > >
> > > > > > Now Available - Free Hadoop On-Demand Training
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
+1 to Hanifi's

On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <al...@gmail.com>
wrote:

> +1 to Hanifi's comments, I think it makes much more sense to have a number
> of sites where the operators are explicitly catching a checked OOM
> exception and either decide to handle it or produce a message like "Hash
> Aggregate does not support our of memory conditions". This would be
> particularly useful for debugging queries, as the user exception can
> provide context information about the current operation. This way users can
> have some idea about the part of their query that might be causing an
> excessive strain on system resources. I understand that there are also
> cases where operators competing for memory can make it a toss up to which
> will actually fail, but this would at least be a step to give more detailed
> information to users.
>
> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:
>
> > I would propose throwing a checked exception encouraging explicit and
> > consistent handling of this event. Each sub-system has liberty to decide
> if
> > an OOM failure is fatal or non-fatal depending on its capabilities. Also
> if
> > at some point a sub-system needs to communicate with its callers via a
> > different mechanism such like using flags (boolean, enum etc) or raising
> an
> > unchecked exception that's still fine, just handle the exception. If
> there
> > is a need to suppress the checked exception that's fine too, just use a
> > helper method.
> >
> > Either way, returning *null* sounds problematic in many ways i) it is
> > implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> > semantically unclean to make null mean something - even worse something
> > context specific.
> >
> >
> > -Hanifi
> >
> > 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
> >
> > > I guess that would fix the issue too. But we may still run into
> > situations
> > > where the caller will pass a flag to "mute" the exception and not
> handle
> > > the case anyway.
> > >
> > > If .buffer() unconditionally throws an exception, can't the caller, who
> > > wants to, just catch that and handle it properly ?
> > >
> > > On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> chriswestin42@gmail.com>
> > > wrote:
> > >
> > > > No, but we should do something close to that.
> > > >
> > > > There are cases where the caller can handle the inability to get more
> > > > memory, and may be able to go to disk. However, you are correct that
> > > there
> > > > are many that can't handle an OOM, and that fail to check.
> > > >
> > > > Instead of unconditionally throwing OutOfMemoryRuntimeException, I
> > would
> > > > suggest that the buffer() call take a flag that indicates whether or
> > not
> > > it
> > > > should throw if it is unable to fulfill the request. This way, the
> call
> > > > sites that can handle an OOM can pass in the flag to return null, and
> > the
> > > > rest can pass in the flag value to throw, and not have to have any
> > > checking
> > > > code.
> > > >
> > > >
> > > > On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > > adeneche@maprtech.com
> > > > > wrote:
> > > >
> > > > > our current implementations of BufferAllocator.buffer(int, int)
> > returns
> > > > > null when it cannot allocate the buffer.
> > > > >
> > > > > But looking through the code, there are many places that don't
> check
> > if
> > > > the
> > > > > allocated buffer is null before trying to access it which will
> throw
> > a
> > > > > NullPointerException.
> > > > >
> > > > > ValueVectors' allocateNewSafe() seem to be the only place that
> handle
> > > the
> > > > > null in a specific manner.
> > > > >
> > > > > Should we update the allocators' implementation to throw an
> > > > > OutOfMemoryRuntimeException instead of returning null ? this has
> the
> > > > added
> > > > > benefit of displaying a proper out of memory error message to the
> > user.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > --
> > > > >
> > > > > Abdelhakim Deneche
> > > > >
> > > > > Software Engineer
> > > > >
> > > > >   <http://www.mapr.com/>
> > > > >
> > > > >
> > > > > Now Available - Free Hadoop On-Demand Training
> > > > > <
> > > > >
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Jason Altekruse <al...@gmail.com>.
+1 to Hanifi's comments, I think it makes much more sense to have a number
of sites where the operators are explicitly catching a checked OOM
exception and either decide to handle it or produce a message like "Hash
Aggregate does not support our of memory conditions". This would be
particularly useful for debugging queries, as the user exception can
provide context information about the current operation. This way users can
have some idea about the part of their query that might be causing an
excessive strain on system resources. I understand that there are also
cases where operators competing for memory can make it a toss up to which
will actually fail, but this would at least be a step to give more detailed
information to users.

On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg...@apache.org> wrote:

> I would propose throwing a checked exception encouraging explicit and
> consistent handling of this event. Each sub-system has liberty to decide if
> an OOM failure is fatal or non-fatal depending on its capabilities. Also if
> at some point a sub-system needs to communicate with its callers via a
> different mechanism such like using flags (boolean, enum etc) or raising an
> unchecked exception that's still fine, just handle the exception. If there
> is a need to suppress the checked exception that's fine too, just use a
> helper method.
>
> Either way, returning *null* sounds problematic in many ways i) it is
> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
> semantically unclean to make null mean something - even worse something
> context specific.
>
>
> -Hanifi
>
> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:
>
> > I guess that would fix the issue too. But we may still run into
> situations
> > where the caller will pass a flag to "mute" the exception and not handle
> > the case anyway.
> >
> > If .buffer() unconditionally throws an exception, can't the caller, who
> > wants to, just catch that and handle it properly ?
> >
> > On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <ch...@gmail.com>
> > wrote:
> >
> > > No, but we should do something close to that.
> > >
> > > There are cases where the caller can handle the inability to get more
> > > memory, and may be able to go to disk. However, you are correct that
> > there
> > > are many that can't handle an OOM, and that fail to check.
> > >
> > > Instead of unconditionally throwing OutOfMemoryRuntimeException, I
> would
> > > suggest that the buffer() call take a flag that indicates whether or
> not
> > it
> > > should throw if it is unable to fulfill the request. This way, the call
> > > sites that can handle an OOM can pass in the flag to return null, and
> the
> > > rest can pass in the flag value to throw, and not have to have any
> > checking
> > > code.
> > >
> > >
> > > On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > > adeneche@maprtech.com
> > > > wrote:
> > >
> > > > our current implementations of BufferAllocator.buffer(int, int)
> returns
> > > > null when it cannot allocate the buffer.
> > > >
> > > > But looking through the code, there are many places that don't check
> if
> > > the
> > > > allocated buffer is null before trying to access it which will throw
> a
> > > > NullPointerException.
> > > >
> > > > ValueVectors' allocateNewSafe() seem to be the only place that handle
> > the
> > > > null in a specific manner.
> > > >
> > > > Should we update the allocators' implementation to throw an
> > > > OutOfMemoryRuntimeException instead of returning null ? this has the
> > > added
> > > > benefit of displaying a proper out of memory error message to the
> user.
> > > >
> > > > Thanks!
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Hanifi GUNES <hg...@apache.org>.
I would propose throwing a checked exception encouraging explicit and
consistent handling of this event. Each sub-system has liberty to decide if
an OOM failure is fatal or non-fatal depending on its capabilities. Also if
at some point a sub-system needs to communicate with its callers via a
different mechanism such like using flags (boolean, enum etc) or raising an
unchecked exception that's still fine, just handle the exception. If there
is a need to suppress the checked exception that's fine too, just use a
helper method.

Either way, returning *null* sounds problematic in many ways i) it is
implicit ii) unsafe iii) its handling logic is repetitive iv) it is
semantically unclean to make null mean something - even worse something
context specific.


-Hanifi

2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <ad...@maprtech.com>:

> I guess that would fix the issue too. But we may still run into situations
> where the caller will pass a flag to "mute" the exception and not handle
> the case anyway.
>
> If .buffer() unconditionally throws an exception, can't the caller, who
> wants to, just catch that and handle it properly ?
>
> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <ch...@gmail.com>
> wrote:
>
> > No, but we should do something close to that.
> >
> > There are cases where the caller can handle the inability to get more
> > memory, and may be able to go to disk. However, you are correct that
> there
> > are many that can't handle an OOM, and that fail to check.
> >
> > Instead of unconditionally throwing OutOfMemoryRuntimeException, I would
> > suggest that the buffer() call take a flag that indicates whether or not
> it
> > should throw if it is unable to fulfill the request. This way, the call
> > sites that can handle an OOM can pass in the flag to return null, and the
> > rest can pass in the flag value to throw, and not have to have any
> checking
> > code.
> >
> >
> > On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> > adeneche@maprtech.com
> > > wrote:
> >
> > > our current implementations of BufferAllocator.buffer(int, int) returns
> > > null when it cannot allocate the buffer.
> > >
> > > But looking through the code, there are many places that don't check if
> > the
> > > allocated buffer is null before trying to access it which will throw a
> > > NullPointerException.
> > >
> > > ValueVectors' allocateNewSafe() seem to be the only place that handle
> the
> > > null in a specific manner.
> > >
> > > Should we update the allocators' implementation to throw an
> > > OutOfMemoryRuntimeException instead of returning null ? this has the
> > added
> > > benefit of displaying a proper out of memory error message to the user.
> > >
> > > Thanks!
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
I guess that would fix the issue too. But we may still run into situations
where the caller will pass a flag to "mute" the exception and not handle
the case anyway.

If .buffer() unconditionally throws an exception, can't the caller, who
wants to, just catch that and handle it properly ?

On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <ch...@gmail.com>
wrote:

> No, but we should do something close to that.
>
> There are cases where the caller can handle the inability to get more
> memory, and may be able to go to disk. However, you are correct that there
> are many that can't handle an OOM, and that fail to check.
>
> Instead of unconditionally throwing OutOfMemoryRuntimeException, I would
> suggest that the buffer() call take a flag that indicates whether or not it
> should throw if it is unable to fulfill the request. This way, the call
> sites that can handle an OOM can pass in the flag to return null, and the
> rest can pass in the flag value to throw, and not have to have any checking
> code.
>
>
> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
> adeneche@maprtech.com
> > wrote:
>
> > our current implementations of BufferAllocator.buffer(int, int) returns
> > null when it cannot allocate the buffer.
> >
> > But looking through the code, there are many places that don't check if
> the
> > allocated buffer is null before trying to access it which will throw a
> > NullPointerException.
> >
> > ValueVectors' allocateNewSafe() seem to be the only place that handle the
> > null in a specific manner.
> >
> > Should we update the allocators' implementation to throw an
> > OutOfMemoryRuntimeException instead of returning null ? this has the
> added
> > benefit of displaying a proper out of memory error message to the user.
> >
> > Thanks!
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?

Posted by Chris Westin <ch...@gmail.com>.
No, but we should do something close to that.

There are cases where the caller can handle the inability to get more
memory, and may be able to go to disk. However, you are correct that there
are many that can't handle an OOM, and that fail to check.

Instead of unconditionally throwing OutOfMemoryRuntimeException, I would
suggest that the buffer() call take a flag that indicates whether or not it
should throw if it is unable to fulfill the request. This way, the call
sites that can handle an OOM can pass in the flag to return null, and the
rest can pass in the flag value to throw, and not have to have any checking
code.


On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <adeneche@maprtech.com
> wrote:

> our current implementations of BufferAllocator.buffer(int, int) returns
> null when it cannot allocate the buffer.
>
> But looking through the code, there are many places that don't check if the
> allocated buffer is null before trying to access it which will throw a
> NullPointerException.
>
> ValueVectors' allocateNewSafe() seem to be the only place that handle the
> null in a specific manner.
>
> Should we update the allocators' implementation to throw an
> OutOfMemoryRuntimeException instead of returning null ? this has the added
> benefit of displaying a proper out of memory error message to the user.
>
> Thanks!
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>