Posted to dev@lucy.apache.org by "Marvin Humphrey (JIRA)" <ji...@apache.org> on 2009/08/27 07:08:59 UTC

[jira] Created: (LUCY-27) Boilerplater Perl bindings

Boilerplater Perl bindings
--------------------------

                 Key: LUCY-27
                 URL: https://issues.apache.org/jira/browse/LUCY-27
             Project: Lucy
          Issue Type: Sub-task
          Components: Boilerplater
            Reporter: Marvin Humphrey


Iterate over the classes, methods, etc. in a Boilerplater::Hierarchy, 
auto-generating binding code to bridge Perl-space and C-space.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-27:
--------------------------------

    Attachment: Class.pm

Boilerplater::Binding::Perl::Class allows us to create highly customized
binding specifications for individual classes.  The Perl code which
communicates the spec will be embedded in the individual Lucy Perl module
files, which will be mostly empty.

Here's a sample ANDQuery.pm file:

{code:none}
use Lucy;

1;

__END__

__BINDING__

my $synopsis = <<'END_SYNOPSIS';
    my $foo_and_bar_query = Lucy::Search::ANDQuery->new(
        children => [ $foo_query, $bar_query ],
    );
    my $hits = $searcher->hits( query => $foo_and_bar_query );
    ...
END_SYNOPSIS

my $constructor = <<'END_CONSTRUCTOR';
    my $foo_and_bar_query = Lucy::Search::ANDQuery->new(
        children => [ $foo_query, $bar_query ],
    );
END_CONSTRUCTOR

Boilerplater::Binding::Perl::Class->register(
    parcel            => "Lucy",
    class_name        => "Lucy::Search::ANDQuery",
    bind_constructors => ["new"],
    make_pod          => {
        methods     => [qw( add_child )],
        synopsis    => $synopsis,
        constructor => { sample => $constructor },
    },
);

__COPYRIGHT__

    /**
     * Copyright 2009 The Apache Software Foundation
     *
     * Licensed under the Apache License, Version 2.0 (the "License");
     * you may not use this file except in compliance with the License.
     * You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
     * implied.  See the License for the specific language governing
     * permissions and limitations under the License.
     */
{code}

In addition to the XS bindings, most POD will be autogenerated, derived
from the JavaDoc-style documentation in the Boilerplater header files. 

Other languages will be able to do something similar, allowing us to share
documentation text and only requiring that we create customized code 
samples for individual host languages.




[jira] Updated: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-27:
--------------------------------

    Attachment: XSBind.c
                XSBind.h

XSBind.h and XSBind.c define C routines used by
Boilerplater::Binding::Perl and friends for converting back and forth between
Perl data structures and Lucy data structures.

The default conversion routines perform deep conversion on data structures as
they cross from Lucy to Perl: Lucy Hash objects become Perl hashes, Lucy
VArrays become Perl arrays, and CharBufs are converted to UTF-8-enabled Perl
scalars.  This means that, for example, when you invoke Seg_Fetch_Metadata from
Perl-space, you will get back an ordinary Perl hash rather than a Perl-wrapped
Lucy::Obj::Hash object.

{code:none}
my $metadata = $segment->fetch_metadata("lexicon");
die "unrecognized format" if $metadata->{format} > 2;
{code}
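
As a rough illustration of what "deep conversion" means, here is a toy model
in plain C.  It does not use the Perl API or the real XSBind routines: SrcVal,
HostVal, to_host and host_free are hypothetical names, and the only point is
that every nested element is converted recursively, so the result shares no
structure with the original.

```c
#include <stdlib.h>

/* Hypothetical "Lucy-side" value: either an integer or an array of values. */
typedef struct SrcVal {
    int             is_array;
    long            num;      /* used when is_array == 0 */
    struct SrcVal **elems;    /* used when is_array != 0 */
    size_t          size;
} SrcVal;

/* Hypothetical "host-side" value: same shape, but a distinct type. */
typedef struct HostVal {
    int              is_array;
    long             num;
    struct HostVal **elems;
    size_t           size;
} HostVal;

/* Free a host-side value and everything nested inside it. */
void host_free(HostVal *val) {
    if (val == NULL) return;
    for (size_t i = 0; i < val->size; i++) host_free(val->elems[i]);
    free(val->elems);
    free(val);
}

/* Deep conversion: recurse into arrays and build a fresh host-side tree.
 * Returns NULL if an allocation fails. */
HostVal *to_host(const SrcVal *src) {
    HostVal *host = calloc(1, sizeof(HostVal));
    if (host == NULL) return NULL;
    host->is_array = src->is_array;
    if (!src->is_array) {
        host->num = src->num;
        return host;
    }
    if (src->size > 0) {
        host->elems = calloc(src->size, sizeof(HostVal *));
        if (host->elems == NULL) { free(host); return NULL; }
    }
    for (size_t i = 0; i < src->size; i++) {
        host->elems[i] = to_host(src->elems[i]);
        if (host->elems[i] == NULL) { host_free(host); return NULL; }
        host->size = i + 1;   /* count only fully converted elements */
    }
    return host;
}
```

Because the converted tree is independent, later changes to the source
structure are not visible through it.  That is the price of handing back
ordinary host data rather than a wrapped object.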

In general, it would be more efficient to just return the Lucy objects, but
that yields an inferior interface for Perl users.  

Another approach would be to return the Lucy objects but give them a more
user-friendly surface via Perl overloading and tied variables.  However, it is
my experience that overloaded and tied variables are difficult to troubleshoot
and breed a lot of bugs.  

There are two cases where I think overloading can be effective: hashref
overloading on Doc objects and stringification overloading on Err objects;
perhaps other cases will present themselves in time.  However, I think
enabling overloading for a few isolated cases makes more sense than relying on
it as a core technology for moving data back and forth between Perl-space and
Lucy-space.




[jira] Updated: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-27:
--------------------------------

    Attachment: Constructor.pm
                Method.pm
                Subroutine.pm

Boilerplater::Binding::Perl::Method, Boilerplater::Binding::Perl::Constructor,
and their common abstract parent, Boilerplater::Binding::Perl::Subroutine, are
responsible for autogenerating complete XSubs which allow public Lucy methods to
be invoked from Perl. 

Methods which take two or more arguments will be automatically set up to take
labeled, hash-style parameters.  Methods which take one arg will be set up to
take an unlabeled positional argument:

{code:none}
my $query = $query_parser->parse($query_string);    # positional arg
my $hits  = $searcher->hits(                        # labeled params
    query      => $query,
    num_wanted => 20,
);
{code}

Constructors always take labeled parameters, even if the C constructor only
takes one argument.

{code:none}
my $folder = Lucy::Storage::FSFolder->new(
    path => '/path/to/folder',
);
{code}

Default argument values are extracted from the Boilerplater::ParamList
signature and baked into the C code of the XSub itself.  If a parameter is
either not supplied or supplied as undef, the default value will be used when
invoking the C function.  If there is no default value and a parameter is
either missing or undef, an error will occur.  
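
The defaulting rule can be modelled in a few lines of plain C.  This is only a
sketch under stated assumptions: ParamSpec and resolve_param are hypothetical
names, a NULL pointer stands in for "missing or undef", and the values are
plain longs, whereas the generated XSubs work on Perl scalars.

```c
#include <stddef.h>

/* Hypothetical description of one parameter in a method signature. */
typedef struct {
    const char *label;        /* parameter label, e.g. "num_wanted" */
    int         has_default;  /* nonzero if the signature declares a default */
    long        default_val;  /* the default baked into the generated code */
} ParamSpec;

/* Resolve one argument.  `supplied` is NULL when the caller omitted the
 * parameter or passed undef.  Returns 0 on success, -1 when the parameter
 * is required but was not supplied. */
int resolve_param(const ParamSpec *spec, const long *supplied, long *out) {
    if (supplied != NULL) {
        *out = *supplied;          /* caller's value wins */
        return 0;
    }
    if (spec->has_default) {
        *out = spec->default_val;  /* fall back to the signature default */
        return 0;
    }
    return -1;                     /* no value and no default: error */
}
```

For instance, with a made-up default of 10 for num_wanted, omitting the
parameter resolves to 10, while omitting a parameter that has no default
returns the error code.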

Supplying an unrecognized parameter also results in an error.   Validation is
implemented using an autogenerated package global Perl hash, one per method
binding.  When the method is invoked from Perl-space, each label is checked
for existence in this hash:

{code:none}
our %Lucy::Storage::FSFolder::new_PARAMS = (
    path => undef,
);
{code}

The names of these parameter labels are taken from the variable names in the
Boilerplater function signature -- which is why argument names are considered
part of the public API for a Boilerplater function, and why the Boilerplater
compiler throws an error if a method which overrides another has a conflicting
argument name.  Changing an arg name would cause the contents of this hash to
change and existing invocations to start throwing "unrecognized parameter"
errors.

{code:none}
my $hits = $searcher->hits(
    query         => $query,
    invalid_param => 'kaboom',
);
{code}
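
The label check itself amounts to a set-membership test.  Here is a minimal
sketch in plain C, with a string array standing in for the autogenerated Perl
hash.  HITS_PARAMS and first_unrecognized are hypothetical names, and only the
two labels shown in this thread are listed; the real parameter list may be
longer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an autogenerated %..._PARAMS hash: the labels
 * that one method binding accepts. */
static const char *const HITS_PARAMS[] = { "query", "num_wanted" };
static const size_t NUM_HITS_PARAMS
    = sizeof(HITS_PARAMS) / sizeof(HITS_PARAMS[0]);

/* Return the first supplied label that is not in the allowed set, or NULL
 * if every supplied label is recognized. */
const char *first_unrecognized(const char *const *allowed, size_t num_allowed,
                               const char *const *labels, size_t num_labels) {
    for (size_t i = 0; i < num_labels; i++) {
        int found = 0;
        for (size_t j = 0; j < num_allowed; j++) {
            if (strcmp(labels[i], allowed[j]) == 0) {
                found = 1;
                break;
            }
        }
        if (!found) return labels[i];
    }
    return NULL;
}
```

Given the labels "query" and "invalid_param", the function returns
"invalid_param", which is the point at which a binding would raise its
"unrecognized parameter" error.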

Methods must be bound from the class under which they are first declared; that
implementation and all subclass implementations then each get their own
dedicated XSub.  The C function which implements the method is then invoked
directly rather than via Lucy's vtable dynamic dispatch -- i.e.
lucy_Searcher_hits() is invoked rather than Lucy_Searcher_Hits().  It may seem
wasteful to generate so many XSubs, but in fact doing things this way is
crucial to subclassing, as the more compact technique is not compatible with
Perl's SUPER method invocation syntax.
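
One way to see the SUPER problem is with a toy vtable in plain C.  Everything
here (Obj, VTable, Parent_speak, Child_speak and the two xs_* functions) is a
hypothetical model, not Lucy code.  A SUPER call from a subclass passes a
subclass object to the parent's binding and expects the parent's
implementation to run; a binding that dispatches through the vtable lands back
in the subclass implementation instead.

```c
#include <stddef.h>

typedef struct Obj Obj;

/* Hypothetical vtable with a single method slot. */
typedef struct VTable {
    const struct VTable *parent;
    int (*speak)(Obj *self);
} VTable;

struct Obj {
    const VTable *vtable;
};

/* Concrete implementations, analogous to lucy_Searcher_hits(). */
int Parent_speak(Obj *self) { (void)self; return 1; }
int Child_speak(Obj *self)  { (void)self; return 2; }

static const VTable PARENT = { NULL, Parent_speak };
static const VTable CHILD  = { &PARENT, Child_speak };

/* Dynamic dispatch through the vtable, analogous to Lucy_Searcher_Hits(). */
int Speak(Obj *self) { return self->vtable->speak(self); }

/* Binding for the parent's method that calls the implementation directly. */
int xs_parent_speak_direct(Obj *self) { return Parent_speak(self); }

/* Binding for the parent's method that goes through dynamic dispatch. */
int xs_parent_speak_dynamic(Obj *self) { return Speak(self); }
```

Called on a Child object, the direct binding runs Parent_speak (returning 1),
which is what a SUPER call wants, while the dynamic binding runs Child_speak
(returning 2).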




[jira] Updated: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-27:
--------------------------------

    Attachment: Perl.pm

Boilerplater::Binding::Perl ties everything together.  It is the only class
from this branch of the Boilerplater hierarchy that the Build script deals
with directly.

All compiled code will be linked into a single shared object, which will
belong to the Lucy.pm module -- invoking "use Lucy;" will load the entire
library.




[jira] Updated: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-27:
--------------------------------

    Attachment: TypeMap.pm

Boilerplater::Binding::Perl::TypeMap's purpose is to create code fragments
which translate Boilerplater objects and C primitives to Perl 5 data types and
back again.

Translating C primitive types is reasonably straightforward.  (At least, it's
reasonably straightforward in the context of XS, Perl 5's voluminous,
macro-tastic C API.) Here's example code for round-tripping an i32_t through a
Perl scalar and back to C:

{code:none}
chy_i32_t twenty = 20;
SV *twenty_sv = newSViv(twenty);
twenty = SvIV(twenty_sv);
{code}

Wrapping and unwrapping most object types with Perl scalars is reasonably
straightforward as well from the perspective of TypeMap, because the
complexity is hidden behind function calls and the only tricky bit is the
cast.

{code:none}
SV *query_sv = XSBind_bp_to_perl(query);
query = (lucy_Query*)XSBind_perl_to_bp(query_sv);
{code}

XSBind_bp_to_perl and XSBind_perl_to_bp are pretty sophisticated, recursing
into complex data structures and translating Perl hashes and arrays to Hash
and VArray objects... but that's a topic for another time.  

One aspect of TypeMap that is not straightforward is handling of string
arguments.  The algorithm TypeMap and XSBind use is described at
[http://mail-archives.apache.org/mod_mbox/lucene-lucy-dev/200909.mbox/%3C20090901155811.GA17100@rectangular.com%3E].

The last item of note about TypeMap is that it generates the 'typemap' file
needed by XS on the fly, creating individual mappings for each Lucy class.
This allows us to perform "isa" tests which pass subclasses but deny other
Lucy objects -- for instance, if a method expects a Lexicon, supplying a
SegLexicon would work but supplying an Indexer or a CharBuf would fail.  And
since we perform this argument checking at the binding boundary, we don't need
to implement it in the core C functions, keeping them as lightweight as
possible.
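
The "isa" test can be pictured as a walk up the parent chain.  The sketch
below is plain C and assumes single inheritance with a hypothetical root class
named Obj; Class and class_is_a are made-up names, and the real check is
performed by the generated typemap code rather than by anything like this
struct.

```c
#include <stddef.h>

/* Hypothetical class record: a name plus a pointer to the parent class. */
typedef struct Class {
    const char         *name;
    const struct Class *parent;
} Class;

static const Class OBJ         = { "Obj", NULL };
static const Class LEXICON     = { "Lexicon", &OBJ };
static const Class SEG_LEXICON = { "SegLexicon", &LEXICON };
static const Class INDEXER     = { "Indexer", &OBJ };
static const Class CHARBUF     = { "CharBuf", &OBJ };

/* Return 1 if `klass` is `wanted` or descends from it, 0 otherwise. */
int class_is_a(const Class *klass, const Class *wanted) {
    for (; klass != NULL; klass = klass->parent) {
        if (klass == wanted) return 1;
    }
    return 0;
}
```

With this hierarchy, a SegLexicon passes a test for Lexicon, while an Indexer
or a CharBuf does not.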





[jira] Resolved: (LUCY-27) Boilerplater Perl bindings

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey resolved LUCY-27.
---------------------------------

    Resolution: Fixed
      Assignee: Marvin Humphrey

Committed as r811923 and r811925, with the addition of a small
utility function in r811918.

> Boilerplater Perl bindings
> --------------------------
>
>                 Key: LUCY-27
>                 URL: https://issues.apache.org/jira/browse/LUCY-27
>             Project: Lucy
>          Issue Type: Sub-task
>          Components: Boilerplater
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: Class.pm, Constructor.pm, Method.pm, Perl.pm, Subroutine.pm, TypeMap.pm, XSBind.c, XSBind.h
>
>
> Iterate over the classes, methods, etc. in a Boilerplater::Hierarchy, 
> auto-generating binding code to bridge Perl-space and C-space.
