You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2012/04/10 21:00:39 UTC

Re: [lucy-dev] Hiding struct defs

On 25/03/2012 21:22, Marvin Humphrey wrote:
> On Sun, Mar 25, 2012 at 12:16 PM, Nick Wellnhofer<we...@aevum.de>  wrote:
>> It's not the extensions that need the struct definitions, it's the code in
>> parcel.c:
>>
>> /* Define the variable which holds this class's class name.
>>   */
>>
>> static cfish_ZombieCharBuf LUCY_QUERY_CLASS_NAME = {
>>     CFISH_ZOMBIECHARBUF,
>>     {1}, /* ref.count */
>>     "Lucy::Search::Query",
>>     19,
>>     0
>> };
>>
>> When building Lucy, CharBuf and VTable are automatically included with
>> struct definitions. But in extensions this is not the case.
>
> Got it.
>
> I believe that the solution is to define VTables using a bootstrapping
> function at startup rather than define them statically at compile-time.

The attached patch contains are a rough attempt to initialize the 
VTables at run-time. I also tried to malloc the VTables dynamically, but 
this broke RAWPOSTING_BLANK where a VTable is used to initialize a 
global struct.

The patch also moves the code to register the VTables from boot.c to 
parcel.c which seemed like the right place, because this initialization 
code does not depend on the host language.

Nick

Re: [lucy-dev] Hiding struct defs

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Apr 11, 2012 at 10:40 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 11/04/2012 19:09, Marvin Humphrey wrote:

>>   * cfish_bootstrap_parcel() will need a per-parcel name, I think.
>
> Yes, but I couldn't find a way to get the parcel name from the CFCBind*
> classes. It looks like it's only available to the CFCPerl* classes.

I'd suggest a hack along these lines for now:

Index: clownfish/src/CFCBindCore.c
===================================================================
--- clownfish/src/CFCBindCore.c	(revision 1324986)
+++ clownfish/src/CFCBindCore.c	(working copy)
@@ -114,8 +114,10 @@
 static void
 S_write_parcel_h(CFCBindCore *self) {
     CFCHierarchy *hierarchy = self->hierarchy;
+    CFCParcel    *parcel    = NULL;

     // Declare object structs for all instantiable classes.
+    // Obtain parcel prefix for use in bootstrap function name.
     char *typedefs = CFCUtil_strdup("");
     CFCClass **ordered = CFCHierarchy_ordered_classes(hierarchy);
     for (int i = 0; ordered[i] != NULL; i++) {
@@ -125,7 +127,13 @@
             typedefs = CFCUtil_cat(typedefs, "typedef struct ", full_struct,
                                    " ", full_struct, ";\n", NULL);
         }
+        if (parcel && CFCClass_get_parcel(klass) != parcel) {
+            CFCUtil_die("Multiple parcels not yet supported.");
+        }
+        parcel = CFCClass_get_parcel(klass);
     }
+    const char *parcel_prefix = parcel
+                                ? CFCParcel_get_prefix(parcel) : "cfish_";
     FREEMEM(ordered);


>>   * In VTable_bootstrap, why not just have self->name be an ordinary
>>     CharBuf rather than a ViewCharBuf?  OH!  HAHAHA!  Not bootstrapped yet!
>>     XD
>
> It could also be a CharBuf, but the object has to be created manually, since
> the VTables aren't set up yet.

Cool cool... Now that I take a closer look, IMO it definitely ought to be a
CharBuf because it needs to duplicate the string that comes in.  If we don't
do that, then that ViewCharBuf will be stuck holding a pointer to a passed in
"const char*" argument -- which is a problem because what if it goes away?

(The word "View" in ViewCharBuf is supposed to communicate that the object
doesn't own its own string -- so use with caution!  ZombieCharBufs, which
subclass ViewCharBuf, also have this trait.)

Marvin Humphrey

Re: [lucy-dev] Hiding struct defs

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Apr 13, 2012 at 11:11 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 12/04/2012 18:14, Marvin Humphrey wrote:

>>     void
>>     lucy_init_parcel(void) {
>>         lucy_Bool_init_class();
>>         lucy_Hash_init_class();
>>     }
>
>
> CFC could also generate this code this automatically by simply checking
> whether a class has an "init_class" function.

If we autogenerate the parcel-level routine, then the user doesn't get to
control the order in which the class-level init routines get called.

It seems to me as though initialization should be coordinated at the parcel
level because parcels are the load units of Clownfish.

I also think it's good practice to avoid reserving function names and imbuing
them with special meaning if there's another way to achieve the same end.  One
nice thing about ${parcel_prefix}init_parcel is that it can't conflict with
any routines declared in a clownfish class.  (Aside from that, there's nothing
special about the name.)

Marvin Humphrey

Re: [lucy-dev] Hiding struct defs

Posted by Nick Wellnhofer <we...@aevum.de>.
On 12/04/2012 18:14, Marvin Humphrey wrote:
> On Thu, Apr 12, 2012 at 2:58 AM, Nick Wellnhofer<we...@aevum.de>  wrote:
>> A user-defined per-class initialization function sounds very useful.
>
> Here's a refinement: Instead of providing per-class initialization support, we
> can support a single per-parcel initialization function.
>
> Here's how we might implement such a function right now:
>
>      void
>      lucy_init_parcel(void) {
>          lucy_Bool_init_class();
>          lucy_Hash_init_class();
>      }

CFC could also generate this code this automatically by simply checking 
whether a class has an "init_class" function.

Nick

Re: [lucy-dev] Hiding struct defs

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Thu, Apr 12, 2012 at 2:58 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> A user-defined per-class initialization function sounds very useful.

Here's a refinement: Instead of providing per-class initialization support, we
can support a single per-parcel initialization function.

Here's how we might implement such a function right now:

    void
    lucy_init_parcel(void) {
        lucy_Bool_init_class();
        lucy_Hash_init_class();
    }

Only a handful of classes will typically require bootstrapping.  This way we
can avoid that messy pound-define-enabling I proposed before.  (At least for
the auxilliary bootstrapping -- we'll probably still need it for VTable
bootstrapping if we want static function method implementations.)

Once we break out Clownfish and Lucy::Test as separate parcels, then the
parcel initialization functions might look like this:

    void
    cfish_init_parcel(void) {
        cfish_Bool_init_class();
        cfish_Hash_init_class();
    }

    void
    lucy_init_parcel(void) {
    }

    void
    lucytest_init_parcel(void) {
    }

> But these functions might make use of other VTables that haven't been
> bootstrapped yet. So we should't call them during bootstrapping but rather
> afterwards. For example, I initially tried to register the VTables at the
> end of VTable_bootstrap which failed for this reason when trying to call
> some LFReg methods.
>
> We might even have to delay the per-class initialization until after all
> VTables have been registered, so the sequence would be:
>
> 1. Bootstrap all VTables
> 2. Register all VTables
> 3. Initialize all classes

+1, sounds good. (With the refinement to step 3 proposed above.)

I'm imagining that this sequence would be repeated for each parcel load.

The mechanism will have to incorporate per-host logic, but the common behavior
can be defined as part of the Clownfish interface.

Marvin Humphrey

Re: [lucy-dev] Hiding struct defs

Posted by Nick Wellnhofer <we...@aevum.de>.
On 12/04/2012 06:10, Marvin Humphrey wrote:
> What I would really like to do with those is initialize them during the
> per-class bootstrapping you've been coding up.

A user-defined per-class initialization function sounds very useful. But 
these functions might make use of other VTables that haven't been 
bootstrapped yet. So we should't call them during bootstrapping but 
rather afterwards. For example, I initially tried to register the 
VTables at the end of VTable_bootstrap which failed for this reason when 
trying to call some LFReg methods.

We might even have to delay the per-class initialization until after all 
VTables have been registered, so the sequence would be:

1. Bootstrap all VTables
2. Register all VTables
3. Initialize all classes

> To achieve this, I imagine that we could store a bootstrapping routine as a
> global function in the per-class autogenerated .h files, enabled with a
> pound-define.  Something like this...
>
> In autogen/Lucy/Object/Hash.h:
>
>      #ifdef LUCY_HASH_WANT_BOOT
>      void
>      lucy_Hash_bootstrap(void) {
>          // ...
>          LUCY_HASH
>              = cfish_VTable_bootstrap(LUCY_HASH, LUCY_OBJ,
>                                       "Lucy::Object::Hash", obj_alloc_size,
>                                       x, size, callbacks, methods);

(Not every VTable will be bootstrapped at this point, so it's not safe 
to call the aux_boot code.)

>      #ifdef LUCY_HASH_AUX_BOOT
>          LUCY_HASH_AUX_BOOT();
>      #endif
>          return LUCY_HASH;
>      }
>      #endif // LUCY_HASH_WANT_BOOT
>
> In core/Lucy/Object/Hash.c:
>
>      #define LUCY_HASH_WANT_BOOT
>      #define LUCY_HASHTOMBSTONE_WANT_BOOT
>      #define LUCY_HASHTOMBSTONE_AUX_BOOT S_tombstone_aux_boot
>      #include "Lucy/Object/Hash.h"
>
>      static void
>      S_tombstone_aux_boot(void) {
>          TOMBSTONE.vtable    = HASHTOMBSTONE;
>          TOMBSTONE.ref.count = 1;
>      }
>
> That's kind of messy, but do you see where I'm going with this?  Each class
> gets an optional bootstrapping routine that it can hook into to do whatever it
> wants.

+1 for the general idea. I also don't see a cleaner solution unless we 
start to generate additional files.

> The punchline is that I believe we will ultimately be able to hide all of our
> method implementations by making them static.

That would be nice.

Nick

Re: [lucy-dev] Hiding struct defs

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Apr 11, 2012 at 10:40 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 11/04/2012 19:09, Marvin Humphrey wrote:

>>> I also tried to malloc the VTables dynamically, but this broke
>>> RAWPOSTING_BLANK where a VTable is used to initialize a global struct.
>>
>> I've just committed a change that kills off RAWPOSTING_BLANK.
>>
>> I suspect that the ZombieCharBuf EMPTY is also going to get in your way;
>> I'll look into deep-sixing that one later today.
>>
>> There may be others; we'll just address them one-by-one until we can malloc
>> dynamically.
>
>
> OK.

Several of these are now gone.  A couple known instances remain.

>From core/Lucy/Object/Hash.c:

    static HashTombStone TOMBSTONE = {
        HASHTOMBSTONE,
        {1}
    };

>From core/Lucy/Object/Num.c:

    static ViewCharBuf true_string  = { VIEWCHARBUF, {1}, "true",  4, 0 };
    static ViewCharBuf false_string = { VIEWCHARBUF, {1}, "false", 5, 0 };
    static BoolNum true_obj  = { BOOLNUM, {1}, true, &true_string };
    static BoolNum false_obj = { BOOLNUM, {1}, false, &false_string };

What I would really like to do with those is initialize them during the
per-class bootstrapping you've been coding up.

To achieve this, I imagine that we could store a bootstrapping routine as a
global function in the per-class autogenerated .h files, enabled with a
pound-define.  Something like this...

In autogen/Lucy/Object/Hash.h:

    #ifdef LUCY_HASH_WANT_BOOT
    void
    lucy_Hash_bootstrap(void) {
        // ...
        LUCY_HASH
            = cfish_VTable_bootstrap(LUCY_HASH, LUCY_OBJ,
                                     "Lucy::Object::Hash", obj_alloc_size,
                                     x, size, callbacks, methods);
    #ifdef LUCY_HASH_AUX_BOOT
        LUCY_HASH_AUX_BOOT();
    #endif
        return LUCY_HASH;
    }
    #endif // LUCY_HASH_WANT_BOOT

In core/Lucy/Object/Hash.c:

    #define LUCY_HASH_WANT_BOOT
    #define LUCY_HASHTOMBSTONE_WANT_BOOT
    #define LUCY_HASHTOMBSTONE_AUX_BOOT S_tombstone_aux_boot
    #include "Lucy/Object/Hash.h"

    static void
    S_tombstone_aux_boot(void) {
        TOMBSTONE.vtable    = HASHTOMBSTONE;
        TOMBSTONE.ref.count = 1;
    }

That's kind of messy, but do you see where I'm going with this?  Each class
gets an optional bootstrapping routine that it can hook into to do whatever it
wants.

The punchline is that I believe we will ultimately be able to hide all of our
method implementations by making them static.  If we can pull that off, it
will help us in two ways:

  1. It will cut down significantly on the number of exported symbols
the dynamic
     linker has to deal with.
  2. It will make it impossible for a user to make the mistake of invoking the
     implementing function instead of the method through accidental
     miscapitalization (as multiple Lucy PMC members have done when starting
     out.)

Marvin Humphrey

Re: [lucy-dev] Hiding struct defs

Posted by Nick Wellnhofer <we...@aevum.de>.
On 11/04/2012 19:09, Marvin Humphrey wrote:
> On Tue, Apr 10, 2012 at 12:00 PM, Nick Wellnhofer<we...@aevum.de>  wrote:
>> The attached patch contains are a rough attempt to initialize the VTables at
>> run-time.
>
> Looks great, +1 to commit!
>
> Miscellaneous notes:
>
>    * cfish_bootstrap_parcel() will need a per-parcel name, I think.

Yes, but I couldn't find a way to get the parcel name from the CFCBind* 
classes. It looks like it's only available to the CFCPerl* classes.

>    * In VTable_bootstrap, why not just have self->name be an ordinary CharBuf
>      rather than a ViewCharBuf?  OH!  HAHAHA!  Not bootstrapped yet! XD

It could also be a CharBuf, but the object has to be created manually, 
since the VTables aren't set up yet.

>> I also tried to malloc the VTables dynamically, but this broke
>> RAWPOSTING_BLANK where a VTable is used to initialize a global struct.
>
> I've just committed a change that kills off RAWPOSTING_BLANK.
>
> I suspect that the ZombieCharBuf EMPTY is also going to get in your way; I'll
> look into deep-sixing that one later today.
>
> There may be others; we'll just address them one-by-one until we can malloc
> dynamically.

OK.

Nick

Re: [lucy-dev] Hiding struct defs

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Apr 10, 2012 at 12:00 PM, Nick Wellnhofer <we...@aevum.de> wrote:
> The attached patch contains are a rough attempt to initialize the VTables at
> run-time.

Looks great, +1 to commit!

Miscellaneous notes:

  * cfish_bootstrap_parcel() will need a per-parcel name, I think.
  * I savored seeing those silly "class_name_def" variables disappear!
  * In VTable_bootstrap, why not just have self->name be an ordinary CharBuf
    rather than a ViewCharBuf?  OH!  HAHAHA!  Not bootstrapped yet! XD
  * Good naming convention with "CFCBindClass_to_c_data",
    "CFCBindClass_to_vtable_bootstrap" and "CFCBindClass_to_vtable_register".
  * We're going to need to rework the Valgrind suppressions once things settle
    down.

> I also tried to malloc the VTables dynamically, but this broke
> RAWPOSTING_BLANK where a VTable is used to initialize a global struct.

I've just committed a change that kills off RAWPOSTING_BLANK.

I suspect that the ZombieCharBuf EMPTY is also going to get in your way; I'll
look into deep-sixing that one later today.

There may be others; we'll just address them one-by-one until we can malloc
dynamically.

> The patch also moves the code to register the VTables from boot.c to
> parcel.c which seemed like the right place, because this initialization code
> does not depend on the host language.

+1, perfect!

Marvin Humphrey