You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2014/07/30 19:27:32 UTC

Re: [lucy-dev] git commit: refs/heads/overridden_aliases - Adjust for Clownfish fixes to overridden aliases

On Jul 30, 2014, at 17:14 , Peter Karman <pe...@peknet.com> wrote:

> nwellnhof@apache.org wrote on 7/30/14 9:43 AM:
>> Adjust for Clownfish fixes to overridden aliases
>> 
>> The callback from C to Perl calls the alias '_make_compiler' now, so
>> this is the method that has to be overridden.
>> 
>> This is probably confusing for authors of custom queries in Perl and a
>> better fix would be nice.
>> 
> 
> As one of those authors, I agree. A better fix would be nice.
> 
> What was the original impetus for making the Perl method "private" (leading
> underscore) if it is part of the public API?

Make_Compiler is declared in Clownfish as follows:

    public abstract incremented Compiler*
    Make_Compiler(Query *self, Searcher *searcher, float boost,
                  bool subordinate = false);

For the Perl bindings, we want to use a default value if ‘boost’ is undefined or missing (the Query’s boost value). C doesn’t have the notion of undefined or missing arguments. So ‘Make_Compiler’ is aliased to ‘_make_compiler’ and a Perl wrapper named ‘make_compiler’ is used to set the default.

Until now, Clownfish didn’t account for host method aliases when calling overridden Perl methods, so ‘make_compiler’ was called. This happens to work as expected in this case, but generally it is wrong. Clownfish should invoke methods overridden from Perl under their aliased name since it can’t assume that there’s a wrapper with the original name doing the right thing. I fixed this in the Clownfish branch ‘overridden_aliases’.

I can see two solutions.

A. Keep the Perl wrapper possibly using different names for the original method and the wrapper. Unfortunately, this wouldn’t be backward compatible. At least, we could choose whether to break either implementors or users of ‘make_compiler’.

B. Emulate the default value on the C level, for example by using a special float constant to signal that the default should be used:

    public abstract incremented Compiler*
    Make_Compiler(Query *self, Searcher *searcher,
                  float boost = special_value,
                  bool subordinate = false);

Then we could drop the alias and wrapper without affecting backward compatibility. Using zero or a negative number as default value would be easy but might be confusing. A value like NaN (not a number) would be the best choice in my opinion but this requires more work on the Clownfish side.

Nick


Re: [lucy-dev] git commit: refs/heads/overridden_aliases - Adjust for Clownfish fixes to overridden aliases

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jul 30, 2014, at 20:37 , Peter Karman <pe...@peknet.com> wrote:

> Nick Wellnhofer wrote on 7/30/14 1:27 PM:
> 
>> I can see two solutions.
>> 
>> A. Keep the Perl wrapper possibly using different names for the original
>> method and the wrapper. Unfortunately, this wouldn’t be backward compatible.
>> At least, we could choose whether to break either implementors or users of
>> ‘make_compiler’.
>> 
>> B. Emulate the default value on the C level, for example by using a special
>> float constant to signal that the default should be used:
>> 
>> public abstract incremented Compiler* Make_Compiler(Query *self, Searcher
>> *searcher, float boost = special_value, bool subordinate = false);
> 
> Or (C) (which Nick suggested to me on irc) is making 'boost' a required param,
> which means the method signature changes but not the method name. I like that best.

Here’s the only place I could find from a quick look where a change would be required:

https://github.com/karpet/lucyx-search-nulltermquery/blob/master/lib/LucyX/Search/AnyTermCompiler.pm#L121

In this case, it seems that the parent query’s boost should be passed to the child query. The cases where Make_Compiler is invoked on a “root” query are probably mostly handled from the C code so making the ‘boost’ parameter required might actually be beneficial because it forces custom query authors to think about passing it on.

Nick


Re: [lucy-dev] Re: Adjust for Clownfish fixes to overridden aliases

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Aug 20, 2014 at 2:51 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 31/07/2014 15:25, Nick Wellnhofer wrote:
>>
>> On 31/07/2014 04:00, Marvin Humphrey wrote:
>>>
>>> It is still important to come up with a general solution to the problem
>>> of autogenerated bindings which do not behave quite right.  Our current
>>> hack -- using CFC to autogenerate a "private" helper (i.e. prepended
>>> with an underscore) and then creating a wrapper with the "real" name --
>>> has been revealed as unworkable.  However, that hack is still used in
>>> many places -- see much of Lucy.pm and Clownfish.pm.
>>
>> Clownfish.pm seems to be OK. There are no such aliases, only a couple of
>> helper functions that are called from C.
>
> Having a closer look, there are two cases where Clownfish uses Perl
> wrappers: String#Clone and VArray#Clone. But the XSUBs aren't excluded so we
> have both an XSub and a Perl sub. This looks like a bug.

It's an old hack:

        # Defeat obscure bugs in the XS auto-generation by redefining clone().
        # (Because of how the typemap works for String*,
        # the auto-generated methods return UTF-8 Perl scalars rather than
        # actual String objects.)
        no warnings 'redefine';
        sub clone { shift->_clone(@_) }

Without that hack, calling `$clownfish_string->clone` returns a Perl scalar
rather than another Clownfish::String, because cfish_String* is mapped to a
UTF-8 Perl scalar in our clownfish-to-perl data conversion routines.

The reason that we have to redefine the Perl-space method clone() for
Clownfish::String is because Clownfish's XSUB binding generator doesn't offer
per-class granularity -- it's either generate clone() XSUBs for *all*
subclasses of Obj, or none.

Marvin Humphrey

Re: [lucy-dev] Re: Adjust for Clownfish fixes to overridden aliases

Posted by Nick Wellnhofer <we...@aevum.de>.
On 31/07/2014 15:25, Nick Wellnhofer wrote:
> On 31/07/2014 04:00, Marvin Humphrey wrote:
>> It is still important to come up with a general solution to the problem
>> of autogenerated bindings which do not behave quite right.  Our current
>> hack -- using CFC to autogenerate a "private" helper (i.e. prepended
>> with an underscore) and then creating a wrapper with the "real" name --
>> has been revealed as unworkable.  However, that hack is still used in
>> many places -- see much of Lucy.pm and Clownfish.pm.
>
> Clownfish.pm seems to be OK. There are no such aliases, only a couple of
> helper functions that are called from C.

Having a closer look, there are two cases where Clownfish uses Perl wrappers: 
String#Clone and VArray#Clone. But the XSUBs aren't excluded so we have both 
an XSub and a Perl sub. This looks like a bug.

> Regarding the hand-rolled XS, there's another subtle issue that I encountered
> when working on callbacks for excluded methods. For hand-rolled XS methods, we
> call '$binding->exclude_method' to tell CFC not to autogenerate an XSUB for
> us. But that's not enough to determine whether callbacks for overriding from
> Perl should be allowed. There are cases where we want callbacks to work, like
> Query#Make_Compiler. But in other cases, the hand-rolled XS has incompatible
> parameters so overriding would break. So the API should look like:
>
>      # Don't generate XSUB and callback. A method with the same name
>      # can be defined via custom XS or Perl but callbacks won't work.
>      $binding->exclude_method(...);
>
>      # Don't generate XSUB but create callback.
>      $binding->exclude_xsub(...);
>
> There's also a third case where we want to autogenerate an XSUB but no
> callback. I can't think of an immediate use case, but at some point, we might
> want to allow custom callbacks.

I created a branch overridden_exclusions laying some groundwork and 
consolidating some code. The only effective change for now is that the XSUBs 
for To_Host are removed, so I think it's safe to merge.

Nick



[lucy-dev] Re: Adjust for Clownfish fixes to overridden aliases

Posted by Nick Wellnhofer <we...@aevum.de>.
On 31/07/2014 04:00, Marvin Humphrey wrote:
> It is still important to come up with a general solution to the problem
> of autogenerated bindings which do not behave quite right.  Our current
> hack -- using CFC to autogenerate a "private" helper (i.e. prepended
> with an underscore) and then creating a wrapper with the "real" name --
> has been revealed as unworkable.  However, that hack is still used in
> many places -- see much of Lucy.pm and Clownfish.pm.

Clownfish.pm seems to be OK. There are no such aliases, only a couple of 
helper functions that are called from C.

> The underlying issue is that now that Nick has fixed the callback
> behavior to invoke the alias (THANK YOU!) `bind_method` should not be
> used to create helpers -- only "real" methods.  I suggest a workaround
> of creating those private helpers with hand-rolled XS -- which, unlike
> abusing `bind_method` for that purpose, will not have the side effect of
> changing what Perl-space name gets invoked for callbacks.

Regarding the hand-rolled XS, there's another subtle issue that I encountered 
when working on callbacks for excluded methods. For hand-rolled XS methods, we 
call '$binding->exclude_method' to tell CFC not to autogenerate an XSUB for 
us. But that's not enough to determine whether callbacks for overriding from 
Perl should be allowed. There are cases where we want callbacks to work, like 
Query#Make_Compiler. But in other cases, the hand-rolled XS has incompatible 
parameters so overriding would break. So the API should look like:

     # Don't generate XSUB and callback. A method with the same name
     # can be defined via custom XS or Perl but callbacks won't work.
     $binding->exclude_method(...);

     # Don't generate XSUB but create callback.
     $binding->exclude_xsub(...);

There's also a third case where we want to autogenerate an XSUB but no 
callback. I can't think of an immediate use case, but at some point, we might 
want to allow custom callbacks.

Nick


Re: [lucy-dev] git commit: refs/heads/overridden_aliases - Adjust for Clownfish fixes to overridden aliases

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Jul 30, 2014 at 11:37 AM, Peter Karman <pe...@peknet.com> wrote:

> Or (C) (which Nick suggested to me on irc) is making 'boost' a required
> param, which means the method signature changes but not the method name. I
> like that best.

+1

I agree that making `boost` required is a good solution to this specific
problem.

It is still important to come up with a general solution to the problem
of autogenerated bindings which do not behave quite right.  Our current
hack -- using CFC to autogenerate a "private" helper (i.e. prepended
with an underscore) and then creating a wrapper with the "real" name --
has been revealed as unworkable.  However, that hack is still used in
many places -- see much of Lucy.pm and Clownfish.pm.

Our top priority IMO should be to make things simple for end users of
Lucy and other Clownfish-powered libraries.  It's preferable to tell
users of Lucy::Search::Query that they should override `make_compiler`,
rather than to tell them that `make_compiler` calls a helper which they
need to override.

Conceptually, the simplest approach is to implement all those
clone/_clone, new/_new, make_compiler/_make_compiler pairs in Lucy.pm
and Clownfish.pm as single unified XS functions.   But sometimes that
will be cumbersome, because we may want to use Perl functionality which
is inconvenient to access from C -- like the regex engine.

The underlying issue is that now that Nick has fixed the callback
behavior to invoke the alias (THANK YOU!) `bind_method` should not be
used to create helpers -- only "real" methods.  I suggest a workaround
of creating those private helpers with hand-rolled XS -- which, unlike
abusing `bind_method` for that purpose, will not have the side effect of
changing what Perl-space name gets invoked for callbacks.

Marvin Humphrey

Re: [lucy-dev] git commit: refs/heads/overridden_aliases - Adjust for Clownfish fixes to overridden aliases

Posted by Peter Karman <pe...@peknet.com>.
Nick Wellnhofer wrote on 7/30/14 1:27 PM:



> I can see two solutions.
> 
> A. Keep the Perl wrapper possibly using different names for the original
> method and the wrapper. Unfortunately, this wouldn’t be backward compatible.
> At least, we could choose whether to break either implementors or users of
> ‘make_compiler’.
> 
> B. Emulate the default value on the C level, for example by using a special
> float constant to signal that the default should be used:
> 
> public abstract incremented Compiler* Make_Compiler(Query *self, Searcher
> *searcher, float boost = special_value, bool subordinate = false);

Or (C) (which Nick suggested to me on irc) is making 'boost' a required param,
which means the method signature changes but not the method name. I like that best.



-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com