Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2013/05/26 18:21:22 UTC

[lucy-dev] Improving support for multiple parcels

Hello lucy-dev,

I just pushed a new branch 'separate-clownfish-wip1' which goes a long way toward separating Clownfish from Lucy and supporting multiple parcels in general. Have a look at the commit log for more details.

The biggest obstacle for now is that the serialization methods of Clownfish::Obj use the InStream and OutStream classes which are still part of the Lucy parcel.

Another thing I'd like to do is to move the Clownfish runtime and Lucy tests to separate parcels. Any suggestions for how to name the test parcels?

Nick


Re: [lucy-dev] Improving support for multiple parcels

Posted by Nick Wellnhofer <we...@aevum.de>.
On May 27, 2013, at 07:38, Marvin Humphrey <ma...@rectangular.com> wrote:

> If there are no objections to this plan, I can work on it this week.

+1

>> Another thing I'd like to do is to move the Clownfish runtime and Lucy tests
>> to separate parcels. Any suggestions for how to name the test parcels?
> 
> *   clownfish::test nickname Test -- TestBatch, TestRunner, etc.
> *   test::clownfish nickname TestCfish -- TestCharBuf, TestHash, etc.
> *   test::org::apache::lucy nickname TestLucy -- TestAnalyzer, etc.
> 
> I expect that the first one, clownfish::test, will be a public API -- the
> default test harness for Clownfish-powered projects.  The other two are the
> test suites for their respective parcels.

I just pushed some commits to 'separate-clownfish-wip1' that move the test suites to separate parcels. And I reworked the test harness again, hopefully for the last time ;)

The Perl CFC tests in the branch are currently broken. I'd have to add Perl bindings for the new features first but I'm not sure whether we really need complete bindings for CFC.

Nick


Re: [lucy-dev] Improving support for multiple parcels

Posted by Nick Wellnhofer <we...@aevum.de>.
On May 27, 2013, at 07:38, Marvin Humphrey <ma...@rectangular.com> wrote:

> If there are no objections to this plan, I can work on it this week.

I've checked out your new branch 'lucify-serialization' and rebased my 'separate-clownfish' branch on top of it (not yet pushed). There were only a couple of trivial conflicts, so please merge 'lucify-serialization' first.

Then I can complete the separation of Clownfish and Lucy code, merge 'separate-clownfish', and start to build a separate library for Clownfish. I think it's easier to build CFC and the Clownfish runtime in one go, but we can also use two separate build processes. There's also the question of how to name the Clownfish C library: libcfish or libclownfish?

As a next step, we can split off the test suites. It occurred to me that it might make sense to rename the test classes from "Clownfish::Test::*" and "Lucy::Test::*" to "TestClownfish::*" and "TestLucy::*", matching the parcel names. Other things that probably need to be done:

    * Move the test code from 'core' to separate directories.
    * Create a second Makefile for the C build of the test suites.
    * More centralization of the Perl build system.

Nick


Re: [lucy-dev] Improving support for multiple parcels

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sun, May 26, 2013 at 9:21 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> I just pushed a new branch 'separate-clownfish-wip1' which goes a long way
> toward separating Clownfish from Lucy and supporting multiple parcels in
> general. Have a look at the commit log for more details.

Nick++

I think we should prepare to release 0.4.0 once the long-overdue separation of
Clownfish and Lucy into distinct shared objects is finally done.  (And in the
future I'll try harder to avoid putting in-progress features onto master.)

There are a couple of really nice commits on your branch which I'll reply to
separately.

> The biggest obstacle for now is that the serialization methods of
> Clownfish::Obj use the InStream and OutStream classes which are still part
> of the Lucy parcel.

I've known that this was on the horizon for a while and would block the
Clownfish/Lucy separation, so I've already been tinkering with a few different
approaches.  I've come to believe that a stopgap would be best: do whatever it
takes to remove serialization from Clownfish and make it bare-minimum
functional in Lucy.  That way we don't have to worry about what the Clownfish
i/o API should look like before the next release.

1.  First, remove Serialize() and Deserialize() from Obj.  Those methods will
    then become novel on a number of Lucy classes: Query, Doc, DocVector,
    TermVector, SortSpec, SortRule, MatchDoc, TopDocs, etc.
2.  Make sure that Lucy::Util::Freezer's FREEZE and THAW continue to work on
    all of those classes.  We'll need a stupid chain of class checks like
    `if (Obj_Is_A(obj, QUERY))` followed by casts and type-specific
    invocations like `Query_Serialize((Query*)obj, outstream)` since we'll no
    longer be able to invoke `Obj_Serialize(obj, outstream)`.  We'll also need
    to implement serialization routines for CharBuf, Hash, VArray, Num, etc.
    inside Freezer.c.
3.  Change the STORABLE_freeze and STORABLE_thaw routines to wrap
    Freezer_freeze and Freezer_thaw.
4.  Move the STORABLE_freeze and STORABLE_thaw XS routines from Clownfish::Obj
    into Lucy::Util::Freezer, and then use Exporter to install them into all
    the relevant Lucy Perl packages.
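
The chain of class checks from step 2 might look roughly like the sketch
below.  It is only an illustration under loud assumptions: `Klass`, `OutBuf`,
`Obj_is_a`, `Freezer_freeze_sketch`, the one-byte type tags and the struct
layouts are all hypothetical stand-ins made up for this example, not the
actual Clownfish or Lucy API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Clownfish class metadata (not the real API). */
typedef struct Klass {
    const char         *name;
    const struct Klass *parent;
} Klass;

typedef struct Obj { const Klass *klass; } Obj;

const Klass OBJ       = { "Obj",       NULL   };
const Klass QUERY     = { "Query",     &OBJ   };
const Klass TERMQUERY = { "TermQuery", &QUERY };
const Klass SORTSPEC  = { "SortSpec",  &OBJ   };

/* Toy objects: each one carries a single byte of state to serialize. */
typedef struct Query    { Obj base; uint8_t boost;     } Query;
typedef struct SortSpec { Obj base; uint8_t num_rules; } SortSpec;

/* Hypothetical stand-in for OutStream: a small fixed-size byte buffer. */
typedef struct OutBuf { uint8_t bytes[64]; size_t len; } OutBuf;

static void
OutBuf_write_u8(OutBuf *out, uint8_t value) {
    assert(out->len < sizeof(out->bytes));
    out->bytes[out->len++] = value;
}

/* Walk the parent chain to answer "is obj an instance of ancestor?". */
static int
Obj_is_a(const Obj *obj, const Klass *ancestor) {
    for (const Klass *k = obj->klass; k != NULL; k = k->parent) {
        if (k == ancestor) { return 1; }
    }
    return 0;
}

/* Type-specific serializers, now "novel" on each class. */
static void
Query_serialize(const Query *self, OutBuf *out) {
    OutBuf_write_u8(out, self->boost);
}

static void
SortSpec_serialize(const SortSpec *self, OutBuf *out) {
    OutBuf_write_u8(out, self->num_rules);
}

/* The chain of class checks: test the type, cast, and invoke the
 * type-specific routine.  Writes a one-byte tag first (an assumption of
 * this sketch).  Returns 0 on success, -1 for an unsupported class. */
int
Freezer_freeze_sketch(const Obj *obj, OutBuf *out) {
    assert(obj != NULL && out != NULL);
    if (Obj_is_a(obj, &QUERY)) {
        OutBuf_write_u8(out, 'Q');
        Query_serialize((const Query*)obj, out);
    }
    else if (Obj_is_a(obj, &SORTSPEC)) {
        OutBuf_write_u8(out, 'S');
        SortSpec_serialize((const SortSpec*)obj, out);
    }
    else {
        return -1;
    }
    return 0;
}
```

The obvious cost of this design is that every newly serializable class needs
another branch in the chain, which is why it is only a stopgap.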

If I've reasoned this through correctly, we won't need to change any code in
LucyX::Remote -- it can continue to use Storable for object serialization.

Looking forward, I think we should move towards Apache Avro as a serialization
format for both RPC and index metadata -- but switching mechanisms is too
ambitious for the near term.

If there are no objections to this plan, I can work on it this week.

> Another thing I'd like to do is to move the Clownfish runtime and Lucy tests
> to separate parcels. Any suggestions for how to name the test parcels?

*   clownfish::test nickname Test -- TestBatch, TestRunner, etc.
*   test::clownfish nickname TestCfish -- TestCharBuf, TestHash, etc.
*   test::org::apache::lucy nickname TestLucy -- TestAnalyzer, etc.

I expect that the first one, clownfish::test, will be a public API -- the
default test harness for Clownfish-powered projects.  The other two are the
test suites for their respective parcels.

Marvin Humphrey