You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2013/07/24 01:09:04 UTC

[lucy-dev] Destructor binding problems

Greets,

Nick wasted no time after the prod from Wheeler's email earlier today with a
commit to fix the Clownfish::Method destructor problem.  However, 4 tests are
still failing on master when run with Perl 5.16 and above.

    Test Summary Report
    -------------------
    t/514-and_matcher.t                 (Wstat: 11 Tests: 1362 Failed: 0)
      Non-zero wait status: 11
    t/518-or_scorer.t                   (Wstat: 11 Tests: 1617 Failed: 0)
      Non-zero wait status: 11
    t/519-req_opt_matcher.t             (Wstat: 11 Tests: 792 Failed: 0)
      Non-zero wait status: 11
    t/526-not_query.t                   (Wstat: 11 Tests: 61 Failed: 0)
      Non-zero wait status: 11

The problem seems to be related to how we choose to create destructor bindings
when a binding is not spec'd for an ancestor class.  Here's a superficial fix:

diff --git a/perl/buildlib/Lucy/Build/Binding/Search.pm
b/perl/buildlib/Lucy/Build/Binding/Search.pm
index af63475..c78ac4a 100644
--- a/perl/buildlib/Lucy/Build/Binding/Search.pm
+++ b/perl/buildlib/Lucy/Build/Binding/Search.pm
@@ -44,6 +44,7 @@ sub bind_all {
     $class->bind_parserelem;
     $class->bind_phrasequery;
     $class->bind_phrasecompiler;
+    $class->bind_polymatcher;
     $class->bind_polyquery;
     $class->bind_polysearcher;
     $class->bind_query;
@@ -603,6 +604,14 @@ sub bind_phrasecompiler {
     Clownfish::CFC::Binding::Perl::Class->register($binding);
 }

+sub bind_polymatcher {
+    my $binding = Clownfish::CFC::Binding::Perl::Class->new(
+        parcel     => "Lucy",
+        class_name => "Lucy::Search::PolyMatcher",
+    );
+    Clownfish::CFC::Binding::Perl::Class->register($binding);
+}
+


The difference in the generated code for Lucy.xs is somewhat surprising.
Without the binding spec for PolyMatcher, we don't generate Perl bindings for
the destructors of any of its descendants -- see below my sig for the diff.

A more robust approach might be to default to generating bindings whenever
possible for every class and every method in a parcel.  (C types which cannot
be mapped automatically, such as `int*`, prevent some methods from being bound
automatically.)  This will cost us in terms of compiled object size and build
time, but it's conceptually simpler and will cut down on weird edge cases like
the one we have to disentangle now.

Marvin Humphrey



$ diff -u Lucy.xs lib/Lucy.xs
--- Lucy.xs 2013-07-23 14:55:22.000000000 -0700
+++ lib/Lucy.xs 2013-07-23 16:01:28.000000000 -0700
@@ -17605,6 +17605,133 @@
     XSRETURN(1);
 }

+XS(XS_Lucy_Search_PolyMatcher_new);
+XS(XS_Lucy_Search_PolyMatcher_new) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    if (items < 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(class_name,
...)",  GvNAME(CvGV(cv))); }
+    SP -= items;
+
+    cfish_VArray* children = NULL;
+    lucy_Similarity* similarity = NULL;
+    bool args_ok = XSBind_allot_params(
+        &(ST(0)), 1, items,         ALLOT_OBJ(&children, "children",
8, true, CFISH_VARRAY, NULL),
+        ALLOT_OBJ(&similarity, "similarity", 10, true, LUCY_SIMILARITY, NULL),
+        NULL);
+    if (!args_ok) {
+        CFISH_RETHROW(CFISH_INCREF(cfish_Err_get_error()));
+    }
+    lucy_PolyMatcher* self = (lucy_PolyMatcher*)XSBind_new_blank_obj(ST(0));
+
+    lucy_PolyMatcher* retval = lucy_PolyMatcher_init(self, children,
similarity);
+    if (retval) {
+        ST(0) = (SV*)Cfish_Obj_To_Host((cfish_Obj*)retval);
+        Cfish_Obj_Dec_RefCount((cfish_Obj*)retval);
+    }
+    else {
+        ST(0) = newSV(0);
+    }
+    sv_2mortal(ST(0));
+    XSRETURN(1);
+}
+
+
+XS(XS_Lucy_Search_PolyMatcher_DESTROY);
+XS(XS_Lucy_Search_PolyMatcher_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_PolyMatcher* self =
(lucy_PolyMatcher*)XSBind_sv_to_cfish_obj(ST(0), LUCY_POLYMATCHER,
NULL);
+
+    /* Execute */
+    Lucy_PolyMatcher_Destroy_t method =
CFISH_METHOD_PTR(LUCY_POLYMATCHER, Lucy_PolyMatcher_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
+XS(XS_Lucy_Search_ANDMatcher_DESTROY);
+XS(XS_Lucy_Search_ANDMatcher_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_ANDMatcher* self =
(lucy_ANDMatcher*)XSBind_sv_to_cfish_obj(ST(0), LUCY_ANDMATCHER,
NULL);
+
+    /* Execute */
+    Lucy_ANDMatcher_Destroy_t method =
CFISH_METHOD_PTR(LUCY_ANDMATCHER, Lucy_ANDMatcher_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
+XS(XS_Lucy_Search_NOTMatcher_DESTROY);
+XS(XS_Lucy_Search_NOTMatcher_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_NOTMatcher* self =
(lucy_NOTMatcher*)XSBind_sv_to_cfish_obj(ST(0), LUCY_NOTMATCHER,
NULL);
+
+    /* Execute */
+    Lucy_NOTMatcher_Destroy_t method =
CFISH_METHOD_PTR(LUCY_NOTMATCHER, Lucy_NOTMatcher_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
+XS(XS_Lucy_Search_ORMatcher_DESTROY);
+XS(XS_Lucy_Search_ORMatcher_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_ORMatcher* self =
(lucy_ORMatcher*)XSBind_sv_to_cfish_obj(ST(0), LUCY_ORMATCHER, NULL);
+
+    /* Execute */
+    Lucy_ORMatcher_Destroy_t method =
CFISH_METHOD_PTR(LUCY_ORMATCHER, Lucy_ORMatcher_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
+XS(XS_Lucy_Search_ORScorer_DESTROY);
+XS(XS_Lucy_Search_ORScorer_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_ORScorer* self =
(lucy_ORScorer*)XSBind_sv_to_cfish_obj(ST(0), LUCY_ORSCORER, NULL);
+
+    /* Execute */
+    Lucy_ORScorer_Destroy_t method = CFISH_METHOD_PTR(LUCY_ORSCORER,
Lucy_ORScorer_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
+XS(XS_Lucy_Search_RequiredOptionalMatcher_DESTROY);
+XS(XS_Lucy_Search_RequiredOptionalMatcher_DESTROY) {
+    dXSARGS;
+    CHY_UNUSED_VAR(cv);
+    SP -= items;
+    if (items != 1) { CFISH_THROW(CFISH_ERR, "Usage: %s(self)",
GvNAME(CvGV(cv))); };
+
+    /* Extract vars from Perl stack. */
+    lucy_RequiredOptionalMatcher* self =
(lucy_RequiredOptionalMatcher*)XSBind_sv_to_cfish_obj(ST(0),
LUCY_REQUIREDOPTIONALMATCHER, NULL);
+
+    /* Execute */
+    Lucy_ReqOptMatcher_Destroy_t method =
CFISH_METHOD_PTR(LUCY_REQUIREDOPTIONALMATCHER,
Lucy_ReqOptMatcher_Destroy);
+    method(self);
+    XSRETURN(0);
+}
+
 XS(XS_Lucy_Search_PolyQuery_new);
 XS(XS_Lucy_Search_PolyQuery_new) {
     dXSARGS;
@@ -27270,6 +27397,13 @@
     newXS("Lucy::Search::PhraseQuery::to_string",
XS_Lucy_Search_PhraseQuery_to_string, file);
     newXS("Lucy::Search::PhraseQuery::get_field",
XS_Lucy_Search_PhraseQuery_get_field, file);
     newXS("Lucy::Search::PhraseQuery::get_terms",
XS_Lucy_Search_PhraseQuery_get_terms, file);
+    newXS("Lucy::Search::PolyMatcher::new",
XS_Lucy_Search_PolyMatcher_new, file);
+    newXS("Lucy::Search::PolyMatcher::DESTROY",
XS_Lucy_Search_PolyMatcher_DESTROY, file);
+    newXS("Lucy::Search::ANDMatcher::DESTROY",
XS_Lucy_Search_ANDMatcher_DESTROY, file);
+    newXS("Lucy::Search::NOTMatcher::DESTROY",
XS_Lucy_Search_NOTMatcher_DESTROY, file);
+    newXS("Lucy::Search::ORMatcher::DESTROY",
XS_Lucy_Search_ORMatcher_DESTROY, file);
+    newXS("Lucy::Search::ORScorer::DESTROY",
XS_Lucy_Search_ORScorer_DESTROY, file);
+    newXS("Lucy::Search::RequiredOptionalMatcher::DESTROY",
XS_Lucy_Search_RequiredOptionalMatcher_DESTROY, file);
     newXS("Lucy::Search::PolyQuery::new", XS_Lucy_Search_PolyQuery_new, file);
     newXS("Lucy::Search::PolyQuery::DESTROY",
XS_Lucy_Search_PolyQuery_DESTROY, file);
     newXS("Lucy::Search::PolyQuery::equals",
XS_Lucy_Search_PolyQuery_equals, file);

Re: [lucy-dev] Destructor binding problems

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sat, Jul 27, 2013 at 9:16 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> So shall I can extend my changes to generate constructor XSubs for every class?

+1

Marvin Humphrey

Re: [lucy-dev] Destructor binding problems

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jul 27, 2013, at 17:57 , Marvin Humphrey <ma...@rectangular.com> wrote:

> Cool!  I'm looking forward to deleting a bunch of code from
> perl/buildlib/Lucy/Build/Binding/*.pm. :)  It would be great if stuff like
> this was no longer necessary, because all classes were fully bound by default:
> 
>    sub bind_andmatcher {
>        my $binding = Clownfish::CFC::Binding::Perl::Class->new(
>            parcel     => "Lucy",
>            class_name => "Lucy::Search::ANDMatcher",
>        );
>        Clownfish::CFC::Binding::Perl::Class->register($binding);
>    }

+1

> We're close, but not quite there.  When I make this change to
> perl/buildlib/Lucy/Build/Binding/Search.pm, we lose the autogenerated
> constructor binding `XS_Lucy_Search_ANDMatcher_new` from Lucy.xs.
> 
>    --- a/perl/buildlib/Lucy/Build/Binding/Search.pm
>    +++ b/perl/buildlib/Lucy/Build/Binding/Search.pm
>    @@ -21,7 +21,7 @@ $VERSION = eval $VERSION;
> 
>     sub bind_all {
>         my $class = shift;
>    -    $class->bind_andmatcher;
>    +    #$class->bind_andmatcher;
>         $class->bind_andquery;

So shall I can extend my changes to generate constructor XSubs for every class?

> If the XSUB binding for Serialize() were implemented as above, we will wind up
> in an infinite loop when we invoke `$my_query->serialize($outstream)`:
> 
>    1.  Enter Perl subroutine `MyQuery::serialize`.
>    2.  Enter XSUB `Lucy::Search::Query::serialize`.
>    3.  Enter callback stub `Lucy_Query_Serialize_OVERRIDE`
>    4.  Goto 1.

Ah, yes. I didn't realize that with dynamic dispatch we'd call back into Perl-space.

Nick



Re: [lucy-dev] Destructor binding problems

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Jul 26, 2013 at 3:39 PM, Nick Wellnhofer <we...@aevum.de> wrote:
> On Jul 24, 2013, at 12:30 , Nick Wellnhofer <we...@aevum.de> wrote:
>
>> On Jul 24, 2013, at 01:09 , Marvin Humphrey <ma...@rectangular.com> wrote:
>>
>>> A more robust approach might be to default to generating bindings whenever
>>> possible for every class and every method in a parcel.  (C types which cannot
>>> be mapped automatically, such as `int*`, prevent some methods from being bound
>>> automatically.)  This will cost us in terms of compiled object size and build
>>> time, but it's conceptually simpler and will cut down on weird edge cases like
>>> the one we have to disentangle now.
>>
>> I think that's the best solution. We've been generating bindings for most methods anyway.
>
> This is now implemented in branch 'perl-method-bindings'.

Cool!  I'm looking forward to deleting a bunch of code from
perl/buildlib/Lucy/Build/Binding/*.pm. :)  It would be great if stuff like
this was no longer necessary, because all classes were fully bound by default:

    sub bind_andmatcher {
        my $binding = Clownfish::CFC::Binding::Perl::Class->new(
            parcel     => "Lucy",
            class_name => "Lucy::Search::ANDMatcher",
        );
        Clownfish::CFC::Binding::Perl::Class->register($binding);
    }

We're close, but not quite there.  When I make this change to
perl/buildlib/Lucy/Build/Binding/Search.pm, we lose the autogenerated
constructor binding `XS_Lucy_Search_ANDMatcher_new` from Lucy.xs.

    --- a/perl/buildlib/Lucy/Build/Binding/Search.pm
    +++ b/perl/buildlib/Lucy/Build/Binding/Search.pm
    @@ -21,7 +21,7 @@ $VERSION = eval $VERSION;

     sub bind_all {
         my $class = shift;
    -    $class->bind_andmatcher;
    +    #$class->bind_andmatcher;
         $class->bind_andquery;

> Looking at the code in CFCPerlClass_method_bindings, another question turned
> up. Currently, the Perl bindings generate XSubs for every fresh method to
> support SUPER:: invocations from Perl. But I can't see a legitimate use case
> where this would be needed.

Perhaps there's a misleading comment in the source code, but that's not why we
need to avoid XSUBs like this:

    void
    serialize(self, outstream)
        lucy_Query *self;
        lucy_OutStream *outstream;
    PPCODE:
        Lucy_Query_Serialize(self, outstream); // <--- vtable dispatch

> In my opinion, it only makes sense to invoke a
> method of a superclass on the 'self' object from a subclass. In this case,
> it would be enough to have a single XSub per novel method which uses dynamic
> dispatch.

Consider a `MyQuery` subclass of Query which adds a member variable "num" that
must be serialized.

    package MyQuery;
    use base qw( Lucy::Search::Query );

    sub serialize {
        my ($self, $outstream) = @_;
        $self->SUPER::serialize($outstream);
        $outstream->write_c32($self->get_num);
    }

When Clownfish creates the VTable for MyQuery, it will detect that MyQuery has
implemented Serialize() in Perl-space and will install a callback stub
`Lucy_Query_Serialize_OVERRIDE` at `Lucy_Query_Serialize_OFFSET` in the
newly created vtable for MyQuery.

If the XSUB binding for Serialize() were implemented as above, we will wind up
in an infinite loop when we invoke `$my_query->serialize($outstream)`:

    1.  Enter Perl subroutine `MyQuery::serialize`.
    2.  Enter XSUB `Lucy::Search::Query::serialize`.
    3.  Enter callback stub `Lucy_Query_Serialize_OVERRIDE`
    4.  Goto 1.

To stop the infinite loop, we need to implement the XSUB for
`Lucy::Search::Query::serialize` so that it is fixed to always use the
implementation from Query rather than the implementation extracted from self's
VTable.

    void
    serialize(self, outstream)
        lucy_Query *self;
        lucy_OutStream *outstream;
    PPCODE:
        Lucy_Query_Serialize_t method
            = (Lucy_Query_Serialize_t)CFISH_METHOD_PTR(LUCY_QUERY,
                                                       Lucy_Query_Serialize);
        method(self, outstream);

In order for SUPER invocations to work properly for all methods, we need to
generate one XSUB for each "fresh" method implementation.  That's a lot of XS
code, but I don't see a way around it.

Does that make sense?

Marvin Humphrey

Re: [lucy-dev] Destructor binding problems

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jul 24, 2013, at 12:30 , Nick Wellnhofer <we...@aevum.de> wrote:

> On Jul 24, 2013, at 01:09 , Marvin Humphrey <ma...@rectangular.com> wrote:
> 
>> A more robust approach might be to default to generating bindings whenever
>> possible for every class and every method in a parcel.  (C types which cannot
>> be mapped automatically, such as `int*`, prevent some methods from being bound
>> automatically.)  This will cost us in terms of compiled object size and build
>> time, but it's conceptually simpler and will cut down on weird edge cases like
>> the one we have to disentangle now.
> 
> I think that's the best solution. We've been generating bindings for most methods anyway.

This is now implemented in branch 'perl-method-bindings'.

Looking at the code in CFCPerlClass_method_bindings, another question turned up. Currently, the Perl bindings generate XSubs for every fresh method to support SUPER:: invocations from Perl. But I can't see a legitimate use case where this would be needed. In my opinion, it only makes sense to invoke a method of a superclass on the 'self' object from a subclass. In this case, it would be enough to have a single XSub per novel method which uses dynamic dispatch. Things like $arbitrary_object->SUPER::method() wouldn't be possible, but this looks error-prone to me anyway. No sane code should do that. For example, languages like Java don't allow that at all.

Nick


Re: [lucy-dev] Destructor binding problems

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jul 24, 2013, at 01:09 , Marvin Humphrey <ma...@rectangular.com> wrote:

> The difference in the generated code for Lucy.xs is somewhat surprising.
> Without the binding spec for PolyMatcher, we don't generate Perl bindings for
> the destructors of any of its descendants -- see below my sig for the diff.

This is because of my changes to CFCPerlClass_method_bindings trying to make XSub creation for methods from another parcel work. Turns out that there are still some holes.

> A more robust approach might be to default to generating bindings whenever
> possible for every class and every method in a parcel.  (C types which cannot
> be mapped automatically, such as `int*`, prevent some methods from being bound
> automatically.)  This will cost us in terms of compiled object size and build
> time, but it's conceptually simpler and will cut down on weird edge cases like
> the one we have to disentangle now.

I think that's the best solution. We've been generating bindings for most methods anyway.

Nick