
Posted to commits@lucy.apache.org by lo...@apache.org on 2012/01/05 08:38:24 UTC

[lucy-commits] svn commit: r1227512 - in /incubator/lucy/trunk/clownfish/src: CFCHierarchy.c CFCHierarchy.h

Author: logie
Date: Thu Jan  5 07:38:24 2012
New Revision: 1227512

URL: http://svn.apache.org/viewvc?rev=1227512&view=rev
Log:
Added CFCHierarchy_allocate in order to play nice within
the ruby object allocation patterns. Updated CFCHierarchy_new to
call this function instead when creating the base CFCHIERARCHY_META
struct.

Modified:
    incubator/lucy/trunk/clownfish/src/CFCHierarchy.c
    incubator/lucy/trunk/clownfish/src/CFCHierarchy.h

Modified: incubator/lucy/trunk/clownfish/src/CFCHierarchy.c
URL: http://svn.apache.org/viewvc/incubator/lucy/trunk/clownfish/src/CFCHierarchy.c?rev=1227512&r1=1227511&r2=1227512&view=diff
==============================================================================
--- incubator/lucy/trunk/clownfish/src/CFCHierarchy.c (original)
+++ incubator/lucy/trunk/clownfish/src/CFCHierarchy.c Thu Jan  5 07:38:24 2012
@@ -69,8 +69,13 @@ const static CFCMeta CFCHIERARCHY_META =
 };
 
 CFCHierarchy*
+CFCHierarchy_allocate() {
+    return (CFCHierarchy*)CFCBase_allocate(&CFCHIERARCHY_META);
+}
+
+CFCHierarchy*
 CFCHierarchy_new(const char *source, const char *dest) {
-    CFCHierarchy *self = (CFCHierarchy*)CFCBase_allocate(&CFCHIERARCHY_META);
+    CFCHierarchy *self = CFCHierarchy_allocate();
     return CFCHierarchy_init(self, source, dest);
 }
 

Modified: incubator/lucy/trunk/clownfish/src/CFCHierarchy.h
URL: http://svn.apache.org/viewvc/incubator/lucy/trunk/clownfish/src/CFCHierarchy.h?rev=1227512&r1=1227511&r2=1227512&view=diff
==============================================================================
--- incubator/lucy/trunk/clownfish/src/CFCHierarchy.h (original)
+++ incubator/lucy/trunk/clownfish/src/CFCHierarchy.h Thu Jan  5 07:38:24 2012
@@ -40,6 +40,9 @@ struct CFCFile;
  * @param dest The directory where the autogenerated files will be written.
  */
 CFCHierarchy*
+CFCHierarchy_allocate();
+
+CFCHierarchy*
 CFCHierarchy_new(const char *source, const char *dest);
 
 CFCHierarchy*

Re: [lucy-dev] Ruby allocation/initialization

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Sun, Jan 08, 2012 at 04:26:50PM -0800, Logan Bell wrote:
> Very interesting suggestions; I believe the code you supplied will
> work perfectly.

According to the link you provided me in IRC, the code I supplied is "Ruby 1.6 style":

    http://www.ruby-forum.com/topic/81100

    The only option I see is to use Ruby 1.6 style code and define a 
    Klass.new method for my classes.  The problems with this is it prevents 
    clean subclassing and object cloning.  

I tried the code and it does work -- but it seems that the approach you've
discovered via your research is more idiomatic for modern Ruby libraries, so
I think we should go with that.

> I did a bit more research into how other modules address this issue, and
> found that the allocator could wrap an empty (NULL) pointer that could
> then later be populated in the initialization function.
> 
> An example of this would be the following:
> 
> static VALUE cfc_hierarchy_alloc(VALUE klass) {
>     /* Wrap a NULL pointer for now; cfc_hierarchy_init() fills it in. */
>     void *ptr = NULL;
>     return Data_Wrap_Struct(klass, NULL, NULL, ptr);
> }
> 
> static VALUE cfc_hierarchy_init(VALUE self_rb, VALUE source, VALUE dest) {
>     /* Build the real object and attach it to the Ruby wrapper. */
>     CFCHierarchy *self = CFCHierarchy_new(StringValuePtr(source),
>                                           StringValuePtr(dest));
>     DATA_PTR(self_rb) = self;
>     return self_rb;
> }

+1

> I found this pattern being used in the Ruby extension ext/dbm/dbm.c.
> Doing it this way would let us keep the benefits of having an allocator
> (though I'm not certain there are any at this point). At the very least,
> it is another way of doing it.

Nice work.

> If one uses rb_define_singleton_method, would we have any state issues
> later since that class is now a singleton? Or for our purposes, does
> it really even matter? Thoughts?

For Clownfish::CFC -- the compiler -- which is not designed to be extensible,
either implementation strategy would meet our needs technically.  However, IMO
we should use the modern idiom because it will be more familiar to people
spelunking the code.

For the code *generated* by Clownfish::CFC, and for Lucy as a library based on
the Clownfish object model, we will absolutely want to use allocators and
initializers.

In fact, when we create Clownfish::CFC::Binding::Ruby::Constructor, I bet it
will turn out to be cleaner than Clownfish::CFC::Binding::Perl::Constructor.
The Ruby allocation/initialization approach is closer to Clownfish's approach
than Perl's.

Marvin Humphrey

Re: [lucy-dev] Ruby allocation/initialization

Posted by Logan Bell <lo...@gmail.com>.

Very interesting suggestions; I believe the code you supplied will
work perfectly. I was dismayed as well that Ruby would force us to
change how Clownfish objects are constructed, so I did some more code
spelunking to see if there were any other ways around this.

I did a bit more research into how other modules address this issue, and
found that the allocator could wrap an empty (NULL) pointer that could
then later be populated in the initialization function.

An example of this would be the following:

static VALUE cfc_hierarchy_alloc(VALUE klass) {
    /* Wrap a NULL pointer for now; cfc_hierarchy_init() fills it in. */
    void *ptr = NULL;
    return Data_Wrap_Struct(klass, NULL, NULL, ptr);
}

static VALUE cfc_hierarchy_init(VALUE self_rb, VALUE source, VALUE dest) {
    /* Build the real object and attach it to the Ruby wrapper. */
    CFCHierarchy *self = CFCHierarchy_new(StringValuePtr(source),
                                          StringValuePtr(dest));
    DATA_PTR(self_rb) = self;
    return self_rb;
}


I found this pattern being used in the Ruby extension ext/dbm/dbm.c.
Doing it this way would let us keep the benefits of having an allocator
(though I'm not certain there are any at this point). At the very least,
it is another way of doing it.
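To make the consequences of this pattern concrete, here is a minimal sketch in
plain C with no Ruby headers involved. Everything in it is an assumption made
for illustration: `Wrapper` is a hypothetical stand-in for the Ruby object,
its `ptr` field plays the role of DATA_PTR(), and `Hierarchy` is a simplified
stand-in rather than the real CFCHierarchy.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for CFCHierarchy (assumption: the real fields differ). */
typedef struct Hierarchy {
    const char *source;
    const char *dest;
} Hierarchy;

static Hierarchy*
Hierarchy_new(const char *source, const char *dest) {
    Hierarchy *self = (Hierarchy*)malloc(sizeof(Hierarchy));
    if (self != NULL) {
        self->source = source;
        self->dest   = dest;
    }
    return self;
}

/* Hypothetical stand-in for the Ruby object; ptr plays the role of
 * DATA_PTR(). */
typedef struct Wrapper {
    void *ptr;
} Wrapper;

/* Allocator: no constructor arguments are available yet, so wrap NULL. */
Wrapper
Wrapper_alloc(void) {
    Wrapper self = { NULL };
    return self;
}

/* Initializer: create the real object and attach it to the wrapper. */
Wrapper*
Wrapper_init(Wrapper *self, const char *source, const char *dest) {
    assert(self->ptr == NULL);   /* refuse to initialize twice */
    self->ptr = Hierarchy_new(source, dest);
    return self;
}

/* Accessors have to tolerate a wrapper that was allocated but never
 * initialized. */
const char*
Wrapper_source(const Wrapper *self) {
    const Hierarchy *hierarchy = (const Hierarchy*)self->ptr;
    return hierarchy != NULL ? hierarchy->source : NULL;
}
```

The point of the sketch is the last function: once the allocator hands out
wrappers around NULL, every method that unwraps the pointer probably needs a
NULL check, because an object can exist that was allocated but never
initialized.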

If one uses rb_define_singleton_method, would we have any state issues
later since that class is now a singleton? Or for our purposes, does
it really even matter? Thoughts? I'm not certain at this point which
would be the ideal way of going forward.

Thanks,
Logan



Re: [lucy-dev] Ruby allocation/initialization

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Thu, Jan 05, 2012 at 11:33:12AM -0800, Logan Bell wrote:
> The allocation function and the need to create an empty object have had me
> digging a bit more in the pickaxe book. It says an allocator is not always
> needed: "if the object you’re implementing doesn’t use any data other than
> Ruby instance variables, then you don’t need to write an allocation
> function—Ruby’s default allocator will work just fine. " If I understand
> that correctly, since our (Clownfish::CFC::Hierarchy) object does need data,
> we need to allocate the space up front in the allocator function.
> 
> It goes on to outline reasons why this is necessary (marshaling, as
> you pointed out, being one of them):
> 
> "One of the reasons for this multistep object creation protocol is that it
> lets the interpreter handle situations where objects have to be created by
> “back-door means.” One example is when objects are being deserialized from
> their marshaled form. Here, the interpreter needs to create an empty object
> (by calling the allocator), but it cannot call the initializer (because it
> has no knowledge of the parameters to use). Another common situation is
> when objects are duplicated or cloned."
> 
> It might be worth doing some code diving on the Ruby end to see for sure,
> but I can see value in having constructors that accept no arguments.

Clownfish actually provides a direct analogue to Ruby's Class#allocate:
VTable_Make_Obj().

    /** Create an empty object of the type defined by the VTable: allocate,
     * assign its vtable and give it an initial refcount of 1.  The caller is
     * responsible for initialization.
     */
    Obj*
    Make_Obj(VTable *self);

For an example of how VTable_Make_Obj() is used during deserialization, here's
Freezer_thaw() from core/Lucy/Util/Freezer.c:

    Obj*
    Freezer_thaw(InStream *instream) {
        CharBuf *class_name
            = CB_Deserialize((CharBuf*)VTable_Make_Obj(CHARBUF), instream);
        VTable *vtable = VTable_singleton(class_name, NULL);
        Obj *blank = VTable_Make_Obj(vtable);
        DECREF(class_name);
        return Obj_Deserialize(blank, instream);
    }

Freezer_thaw() obtains the class name, uses it to look up the right VTable
singleton, then invokes VTable_Make_Obj() to create the blank object.  The
newborn blank object doesn't start off with much, but at least it has a VTable
-- so we can invoke the Deserialize() object method on it and flesh it out.
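
The same three steps (read the class name, look up the vtable, make a blank
object and let it deserialize itself) can be sketched in self-contained C.
This is only an illustration under stated assumptions, not Lucy's code: the
`VTable` layout, the `Counter` class, the tiny registry and the text format
"ClassName payload" are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct VTable VTable;

/* Every object starts with a pointer to its class's vtable. */
typedef struct Obj {
    const VTable *vtable;
} Obj;

/* Hypothetical, stripped-down vtable: name, object size, deserializer. */
struct VTable {
    const char *name;
    size_t      obj_size;
    Obj*      (*deserialize)(Obj *self, const char *payload);
};

/* Hypothetical example class with one integer of state. */
typedef struct Counter {
    Obj base;
    int count;
} Counter;

static Obj*
Counter_deserialize(Obj *self, const char *payload) {
    ((Counter*)self)->count = atoi(payload);
    return self;
}

static const VTable COUNTER
    = { "Counter", sizeof(Counter), Counter_deserialize };
static const VTable *const REGISTRY[] = { &COUNTER };

/* Look up a vtable by class name; returns NULL for unknown classes. */
static const VTable*
S_find_vtable(const char *name) {
    size_t i;
    for (i = 0; i < sizeof(REGISTRY) / sizeof(REGISTRY[0]); i++) {
        if (strcmp(REGISTRY[i]->name, name) == 0) { return REGISTRY[i]; }
    }
    return NULL;
}

/* Create a zeroed object of the right size with only its vtable set. */
static Obj*
S_make_obj(const VTable *vtable) {
    Obj *blank;
    assert(vtable->obj_size >= sizeof(Obj));
    blank = (Obj*)calloc(1, vtable->obj_size);
    if (blank != NULL) { blank->vtable = vtable; }
    return blank;
}

/* Thaw "ClassName payload": the blank object supplies the method that
 * reads the rest of the stream. */
Obj*
thaw(const char *frozen) {
    char          class_name[32];
    int           consumed = 0;
    const VTable *vtable;
    Obj          *blank;
    if (sscanf(frozen, "%31s %n", class_name, &consumed) != 1) { return NULL; }
    vtable = S_find_vtable(class_name);
    if (vtable == NULL) { return NULL; }
    blank = S_make_obj(vtable);
    if (blank == NULL) { return NULL; }
    return blank->vtable->deserialize(blank, frozen + consumed);
}
```

Note that thaw() never calls a constructor with arguments: it cannot know
them, which is exactly the situation the blank-object step exists for.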

We also use VTable_Make_Obj() for every Lucy object that we create from
Perl-space.  Our Foo_new() C functions have a limitation: they do not take a
class name as an argument, so they cannot support dynamic subclassing.  For
instance, here is Normalizer_new():

    Normalizer*
    Normalizer_new(const CharBuf *form, bool_t case_fold, bool_t strip_accents) {
        Normalizer *self = (Normalizer*)VTable_Make_Obj(NORMALIZER);
        return Normalizer_init(self, form, case_fold, strip_accents);
    }

Because the VTable is hard-coded to NORMALIZER, objects created via
Normalizer_new() will *always* have a class of "Lucy::Analysis::Normalizer".
But what if you create a Perl subclass of Lucy::Analysis::Normalizer called
"MyNormalizer"?

    package MyNormalizer;
    use base qw( Lucy::Analysis::Normalizer );

    my $normalizer = MyNormalizer->new;

Here's how Normalizer_new() would need to change in order to support such
subclassing:

    Normalizer*
    Normalizer_new(CharBuf *class_name, const CharBuf *form,
                   bool_t case_fold, bool_t strip_accents) {
        VTable *vtable = VTable_singleton(class_name, NULL);
        Normalizer *self = (Normalizer*)VTable_Make_Obj(vtable);
        return Normalizer_init(self, form, case_fold, strip_accents);
    }
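
Why the vtable has to come from the caller can be shown with a small
stand-alone sketch. It assumes a hypothetical, much-reduced `VTable` that
records only a name, an object size and a parent; `Normalizer_new_as()` and
the `MyNormalizer` struct are likewise invented for illustration and are not
part of Lucy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much-reduced vtable. */
typedef struct VTable {
    const char          *name;
    size_t               obj_size;
    const struct VTable *parent;
} VTable;

typedef struct Normalizer {
    const VTable *vtable;
    int           case_fold;
} Normalizer;

/* A subclass embeds its parent as the first member and adds its own state,
 * so its objects are larger than the parent's. */
typedef struct MyNormalizer {
    Normalizer base;
    int        extra;
} MyNormalizer;

static const VTable NORMALIZER
    = { "Normalizer", sizeof(Normalizer), NULL };
static const VTable MYNORMALIZER
    = { "MyNormalizer", sizeof(MyNormalizer), &NORMALIZER };

/* Shared initializer: works on any object that starts with a Normalizer. */
Normalizer*
Normalizer_init(Normalizer *self, int case_fold) {
    self->case_fold = case_fold;
    return self;
}

/* Subclass-aware constructor: the caller chooses the class, and the vtable
 * decides how much memory the blank object needs. */
Normalizer*
Normalizer_new_as(const VTable *vtable, int case_fold) {
    Normalizer *self;
    assert(vtable->obj_size >= sizeof(Normalizer));
    self = (Normalizer*)calloc(1, vtable->obj_size);
    if (self == NULL) { return NULL; }
    self->vtable = vtable;
    return Normalizer_init(self, case_fold);
}

/* Hard-coded constructor: always produces a plain Normalizer. */
Normalizer*
Normalizer_new(int case_fold) {
    return Normalizer_new_as(&NORMALIZER, case_fold);
}
```

With the hard-coded version, a subclass would get an object that is both
mislabeled and too small for its extra fields; passing the vtable in fixes
both while the initializer stays shared.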

The actual code which *does* support subclassing for Normalizer is spread
across three functions, two of which I've included below my sig for reference:

  * XSBind_new_blank_obj() from perl/xs/XSBind.c, which wraps
    VTable_Make_Obj().
  * XS_Lucy_Analysis_Normalizer_new() from Lucy.xs, which is auto-generated.
  * Normalizer_init(), from core/Lucy/Analysis/Normalizer.c.

In order to support dynamic subclassing in the Ruby bindings for Lucy, we will
need to provide similar functionality.

However, I question whether we need to provide that kind of functionality for
Clownfish::CFC, which is itself written using a much cruder object model:

  * No support for subclassing.
  * No support for serialization.
  * No support for Ruby's #clone or #dup methods.

I don't yet understand why Ruby *needs* an allocator function if we aren't
going to use those bells and whistles.  How many C libraries out there provide
two-stage constructors?  It doesn't make sense that Ruby would impose such an
esoteric requirement, limiting the kinds of C libraries you could write Ruby
bindings for.

Something like this ought to work:

    // Clownfish::CFC::Hierarchy#new
    static VALUE
    S_CFCHierarchy_new(VALUE klass, VALUE source_rb, VALUE dest_rb) {
        const char *source = StringValuePtr(source_rb);
        const char *dest   = StringValuePtr(dest_rb);
        CFCHierarchy *self = CFCHierarchy_new(source, dest);
        return Data_Wrap_Struct(klass, NULL, NULL, self);
    }

    // Bootstrap Clownfish::CFC::Hierarchy.
    static void
    S_Init_CFCHierarchy() {
        cHierarchy  = rb_define_class_under(mCFC, "Hierarchy", rb_cObject);
        rb_define_method(cHierarchy, "build", S_CFCHierarchy_build, 0);
        rb_define_singleton_method(cHierarchy, "new", S_CFCHierarchy_new, 2);
    }

    // Bootstrap Clownfish::CFC and all of its components.
    void 
    Init_CFC() {
        mClownfish = rb_define_module("Clownfish");
        mCFC       = rb_define_module_under(mClownfish, "CFC");
        S_Init_CFCHierarchy();
    }

I don't know whether that's an idiomatic approach for writing a Ruby extension,
but if it works, it prevents us from having to add a bunch of CFCFoo_allocate()
functions and from having to provide two-stage constructors for every
Clownfish::CFC component.

In any case, exploring this topic for the CFC bindings helps us to understand
the issues we will confront when auto-generating Ruby wrapper code via the
as-yet-to-be-written Clownfish::CFC::Binding::Ruby.  :)

Marvin Humphrey


cfish_Obj*
XSBind_new_blank_obj(SV *either_sv) {
    cfish_VTable *vtable;

    // Get a VTable.
    if (sv_isobject(either_sv)
        && sv_derived_from(either_sv, "Lucy::Object::Obj")
       ) { 
        // Use the supplied object's VTable.
        IV iv_ptr = SvIV(SvRV(either_sv));
        cfish_Obj *self = INT2PTR(cfish_Obj*, iv_ptr);
        vtable = self->vtable;
    }   
    else {
        // Use the supplied class name string to find a VTable.
        STRLEN len;
        char *ptr = SvPVutf8(either_sv, len);
        cfish_ZombieCharBuf *klass = CFISH_ZCB_WRAP_STR(ptr, len);
        vtable = cfish_VTable_singleton((cfish_CharBuf*)klass, NULL);
    }   

    // Use the VTable to allocate a new blank object of the right size.
    return Cfish_VTable_Make_Obj(vtable);
}


XS(XS_Lucy_Analysis_Normalizer_new) {
    dXSARGS;
    CHY_UNUSED_VAR(cv);
    if (items < 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(class_name, ...)",  GvNAME(CvGV(cv))); }
    SP -= items;

    const lucy_CharBuf* normalization_form = NULL; 
    chy_bool_t case_fold = true; 
    chy_bool_t strip_accents = false;
    chy_bool_t args_ok = XSBind_allot_params(
        &(ST(0)), 1, items, "Lucy::Analysis::Normalizer::new_PARAMS",
        ALLOT_OBJ(&normalization_form, "normalization_form", 18, false, LUCY_CHARBUF, alloca(cfish_ZCB_size())),
        ALLOT_BOOL(&case_fold, "case_fold", 9, false),
        ALLOT_BOOL(&strip_accents, "strip_accents", 13, false),
        NULL);
    if (!args_ok) {
        CFISH_RETHROW(CFISH_INCREF(cfish_Err_get_error()));
    }
    lucy_Normalizer* self = (lucy_Normalizer*)XSBind_new_blank_obj(ST(0));

    lucy_Normalizer* retval = lucy_Normalizer_init(self, normalization_form, case_fold, strip_accents);
    if (retval) {
        ST(0) = (SV*)Cfish_Obj_To_Host((cfish_Obj*)retval);
        Cfish_Obj_Dec_RefCount((cfish_Obj*)retval);
    }
    else {
        ST(0) = newSV(0);
    }
    sv_2mortal(ST(0));
    XSRETURN(1);
}

Re: [lucy-dev] Ruby allocation/initialization

Posted by Logan Bell <lo...@gmail.com>.

The allocation function and the need to create an empty object have had me
digging a bit more in the pickaxe book. It says an allocator is not always
needed: "if the object you’re implementing doesn’t use any data other than
Ruby instance variables, then you don’t need to write an allocation
function—Ruby’s default allocator will work just fine. " If I understand
that correctly, since our (Clownfish::CFC::Hierarchy) object does need data,
we need to allocate the space up front in the allocator function.

It goes on to outline reasons why this is necessary (marshaling, as
you pointed out, being one of them):

"One of the reasons for this multistep object creation protocol is that it
lets the interpreter handle situations where objects have to be created by
“back-door means.” One example is when objects are being deserialized from
their marshaled form. Here, the interpreter needs to create an empty object
(by calling the allocator), but it cannot call the initializer (because it
has no knowledge of the parameters to use). Another common situation is
when objects are duplicated or cloned."

It might be worth doing some code diving on the Ruby end to see for sure,
but I can see value in having constructors that accept no arguments.

/Logan


[lucy-dev] Ruby allocation/initialization

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Thu, Jan 05, 2012 at 07:38:24AM -0000, logie@apache.org wrote:
> URL: http://svn.apache.org/viewvc?rev=1227512&view=rev
> Log:
> Added CFCHierarchy_allocate in order to play nice within
> the ruby object allocation patterns. Updated CFCHierarchy_new to
> call this function instead when creating the base CFCHIERARCHY_META
> struct.

Summarizing and continuing an IRC discussion:

The example code for defining a class in the pickaxe book's appendix on Ruby's
C API (free download linked off of
<http://pragprog.com/book/ruby3/programming-ruby-1-9>) requires both an
allocation function and an initialization function.  We have the
initialization function already:

    CFCHierarchy*
    CFCHierarchy_init(CFCHierarchy *self, const char *source, const char *dest);

However, before this commit, we did not have an allocation function which met
Ruby's needs:

    * Take no arguments.
    * Return a "blank" object.  (Essentially, something suitable for running
      through the initialization function.)

We have CFCHierarchy_new(), but it takes arguments and returns a complete
object.

    CFCHierarchy*
    CFCHierarchy_new(const char *source, const char *dest);

Here is the new allocator:

    CFCHierarchy*
    CFCHierarchy_allocate();

I understand the need for zero-argument constructors, e.g. when deserializing,
though I don't yet know whether the allocator is an absolute requirement for
defining a Ruby class extension or just a quirk of the example code.
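
For reference, the three-function split described above can be sketched in
self-contained C. The `Hierarchy` struct below is a hypothetical stand-in
with two string fields; the real CFCHierarchy is built on CFCBase_allocate()
and its layout is not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for CFCHierarchy. */
typedef struct Hierarchy {
    char *source;
    char *dest;
} Hierarchy;

/* Copy a string onto the heap. */
static char*
S_copy_string(const char *str) {
    size_t size = strlen(str) + 1;
    char  *copy = (char*)malloc(size);
    if (copy != NULL) { memcpy(copy, str, size); }
    return copy;
}

/* Stage 1: take no arguments and return a "blank" object. */
Hierarchy*
Hierarchy_allocate(void) {
    Hierarchy *self = (Hierarchy*)malloc(sizeof(Hierarchy));
    if (self != NULL) {
        self->source = NULL;
        self->dest   = NULL;
    }
    return self;
}

/* Stage 2: fill in a blank object and hand it back. */
Hierarchy*
Hierarchy_init(Hierarchy *self, const char *source, const char *dest) {
    assert(self != NULL);
    self->source = S_copy_string(source);
    self->dest   = S_copy_string(dest);
    return self;
}

/* Convenience constructor: both stages in a single call. */
Hierarchy*
Hierarchy_new(const char *source, const char *dest) {
    Hierarchy *self = Hierarchy_allocate();
    if (self == NULL) { return NULL; }
    return Hierarchy_init(self, source, dest);
}

void
Hierarchy_destroy(Hierarchy *self) {
    if (self == NULL) { return; }
    free(self->source);
    free(self->dest);
    free(self);
}
```

Existing C callers keep using Hierarchy_new(), while a host language that
wants separate allocation and initialization can call the two stages itself.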

If it's a requirement, we'll presumably be modifying Lucy's classes eventually
and adding allocators there to accommodate Ruby.

Marvin Humphrey