Posted to user@trafodion.apache.org by Dave Birdsall <da...@esgyn.com> on 2016/04/28 22:35:23 UTC

Heap management in C++ UDFs?

Hi,



Are there any special considerations for heap management in a C++ UDF?



I’m guessing that at run-time, we just use the usual C++ new and delete?



I’m guessing that at run-time, all we need to do is anchor any
heap-allocated variables from local variables in the processData method,
being sure to deallocate them before returning. Of course this must take
into account exceptions; there should be a try/catch block encompassing
most of the method that can do deallocations before rethrowing.



Does this sound like the right approach?



Thanks,



Dave

Re: Heap management in C++ UDFs?

Posted by Hans Zeller <ha...@esgyn.com>.
Hi Dave,

Yes, the general model is to use C++ new and delete, and it is the UDR
writer's responsibility not to leak objects.

In more detail:

*The simple and easy model*

Don't allocate resources in the compile time interface. At runtime (in the
method that overrides UDR::processData()), allocate and deallocate
everything in this method. Note that you may need to use a try/catch block
to avoid leaks when exceptions occur. Don't have any data members in the
class that you derive from UDR.
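
As a minimal sketch of the try/catch variant of this model: the function below stands in for the body of a processData() override. The names processDataSketch, processRows and the live-buffer counter are hypothetical and exist only for this illustration; they are not part of the UDR interface.

```cpp
#include <cstddef>
#include <stdexcept>

// Counts live buffers so the sketch can show that nothing leaks.
static int g_liveBuffers = 0;

// Hypothetical helper standing in for real row processing;
// it throws when asked to, to simulate a failure.
static void processRows(char *buffer, std::size_t size, bool fail)
{
  if (fail)
    throw std::runtime_error("simulated failure while processing rows");
  for (std::size_t i = 0; i < size; i++)
    buffer[i] = 0;
}

// Stand-in for the body of a processData() override.
void processDataSketch(bool fail)
{
  const std::size_t size = 1024;
  char *buffer = new char[size];   // anchored only by this local variable
  g_liveBuffers++;

  try
  {
    processRows(buffer, size, fail);
  }
  catch (...)
  {
    delete [] buffer;              // deallocate before rethrowing
    g_liveBuffers--;
    throw;                         // the caller still sees the original exception
  }

  delete [] buffer;                // normal path
  g_liveBuffers--;
}

// Number of buffers that are currently allocated and not yet freed.
int liveBuffers() { return g_liveBuffers; }
```

Calling processDataSketch(true) propagates the exception to the caller, and liveBuffers() is back at zero afterwards. Every allocation needs a matching delete on both paths, which is the main drawback of this style.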



*A step up, a bit more complicated*

Don't share data between compile time and runtime. Note that those usually
happen in two different processes. If you want to share data between
different compile time interfaces, derive from class
UDRWriterCompileTimeData and attach an object of your class to the
UDRInvocationInfo object - the life time of that object defines the
compilation of one UDR invocation. At runtime, do the same as in the simple
model above.
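
To illustrate the idea of attaching compile time data, here is a self-contained sketch. It does not use the real Trafodion headers: the classes in namespace standin are minimal stand-ins written for this example. The accessor names setUDRWriterCompileTimeData/getUDRWriterCompileTimeData and the assumption that the invocation info object deletes the attached object are my assumptions about the real interface, so please check them against the UDR header before relying on them. MyCompileTimeData, describeSketch and laterPhaseSketch are hypothetical names.

```cpp
#include <string>

namespace standin   // NOT the real Trafodion classes, just stand-ins for this sketch
{
  class UDRWriterCompileTimeData
  {
  public:
    virtual ~UDRWriterCompileTimeData() {}
  };

  class UDRInvocationInfo
  {
  public:
    UDRInvocationInfo() : compileTimeData_(nullptr) {}
    // Assumption: the info object owns the attached data and deletes it.
    ~UDRInvocationInfo() { delete compileTimeData_; }
    UDRInvocationInfo(const UDRInvocationInfo &) = delete;
    UDRInvocationInfo &operator=(const UDRInvocationInfo &) = delete;

    // Assumed accessor names, to be verified against the real header.
    void setUDRWriterCompileTimeData(UDRWriterCompileTimeData *d)
    {
      delete compileTimeData_;
      compileTimeData_ = d;
    }
    UDRWriterCompileTimeData *getUDRWriterCompileTimeData()
    {
      return compileTimeData_;
    }

  private:
    UDRWriterCompileTimeData *compileTimeData_;
  };
}

// Counts live objects so the sketch can show when the data goes away.
static int g_liveCompileData = 0;

// Hypothetical UDR writer class carrying data between compile time interfaces.
class MyCompileTimeData : public standin::UDRWriterCompileTimeData
{
public:
  explicit MyCompileTimeData(const std::string &tableName)
    : tableName_(tableName) { g_liveCompileData++; }
  virtual ~MyCompileTimeData() { g_liveCompileData--; }
  std::string tableName_;
};

// An early compile time interface: compute something once and attach it.
void describeSketch(standin::UDRInvocationInfo &info)
{
  info.setUDRWriterCompileTimeData(new MyCompileTimeData("T1"));
}

// A later compile time interface: pick the shared data up again.
std::string laterPhaseSketch(standin::UDRInvocationInfo &info)
{
  MyCompileTimeData *d =
    dynamic_cast<MyCompileTimeData *>(info.getUDRWriterCompileTimeData());
  return d ? d->tableName_ : std::string();
}

// Number of MyCompileTimeData objects currently alive.
int liveCompileData() { return g_liveCompileData; }
```

In this sketch the attached object lives exactly as long as the stand-in invocation info, so the UDR writer never deletes it directly.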



*If you want it even more complicated*

Let's look at a few scopes:

*UDRInvocationInfo (compile time):* The life time of this object defines
compilation of one invocation of a TMUDF in one statement. Note that a
statement may contain multiple TMUDF invocations.

*UDRPlanInfo (compile time):* This defines different plan alternatives for
a single invocation (UDRInvocationInfo). A given UDRPlanInfo object always
belongs to the same UDRInvocationInfo object.

*UDRInvocationInfo/UDRPlanInfo (runtime):* The life time of these objects
spans a single call of the method that overrides UDR::processData(). This is
a single invocation of a TMUDF in a statement. If the TMUDF is executed in
parallel, we may create many such objects, each one for part of the data.

*UDR object:* An object derived from this class will be created to call the
code you wrote. Note that this object may be reused for multiple
invocations. Keep that in mind when defining data members for your derived
class. We are currently not sharing these objects across invocations, and
I'm wondering how we should best define the life time of this object. It
could be useful for sharing resources across multiple invocations of a UDR
(and it was meant to serve that purpose).

If you want a simple object with a constructor and destructor that lives as
long as one invocation at runtime, I would suggest defining a class with
constructor and destructor and making an object of that class on the stack:



class MyUDRResources
{
public:
  MyUDRResources() { /* allocate resources */ }
  ~MyUDRResources() { /* deallocate resources */ }
};

class MyUDR : public UDR
{
public:
  virtual void processData(UDRInvocationInfo &info,
                           UDRPlanInfo &plan);
  // avoid data members in MyUDR, unless you want to allow sharing them
  // between multiple invocations.
};

void MyUDR::processData(UDRInvocationInfo &info,
                        UDRPlanInfo &plan)
{
  MyUDRResources r;
  // r will be automatically destroyed as you return or if you hit an
  // exception
  ...
}
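
The stack-object pattern can be tried out without any UDR headers. The following is a small standalone sketch: the 1024-byte buffer, the live-resource counter and the function processDataSketch are hypothetical and only serve to show that the destructor runs on both the normal and the exception path.

```cpp
#include <stdexcept>

// Counts live resource objects so the sketch can show that nothing leaks.
static int g_liveResources = 0;

class MyUDRResources
{
public:
  MyUDRResources() : buffer_(new char[1024]) { g_liveResources++; }
  ~MyUDRResources() { delete [] buffer_; g_liveResources--; }

  // Copying would lead to a double delete, so it is disabled.
  MyUDRResources(const MyUDRResources &) = delete;
  MyUDRResources &operator=(const MyUDRResources &) = delete;

  char *buffer() { return buffer_; }

private:
  char *buffer_;
};

// Stand-in for the body of a processData() override.
void processDataSketch(bool fail)
{
  MyUDRResources r;          // lives on the stack, owns the heap buffer
  r.buffer()[0] = 'x';
  if (fail)
    throw std::runtime_error("simulated failure");
  // no explicit cleanup: r's destructor runs on return and during unwinding
}

// Number of MyUDRResources objects currently alive.
int liveResources() { return g_liveResources; }
```

After either processDataSketch(false) or a call to processDataSketch(true) whose exception is caught by the caller, liveResources() returns zero, and the function body contains no cleanup code at all.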


Alternatively, you can use a try/catch block in the processData() method.


Hans
