Posted to general@lucene.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2010/07/02 19:15:12 UTC

Draft proposal: Move Lucy to Incubator

PREFACE
    Lucy is a sub-project which is being spun off from the Lucene TLP but is
    not yet ready for graduation.  We propose to address certain needs of the
    project by assimilating the KinoSearch codebase, and to enter the
    Incubator on a top-level-project track.

ABSTRACT
    Lucy will be a loose port of the Lucene search engine library, written in
    C and targeted at dynamic language users.

PROPOSAL
    Lucy has two aims.  First, it will be a high-performance C search engine
    library.  Second, it will maximize its usability and power when accessed
    via dynamic language bindings.  To that end, it will present highly
    idiomatic, carefully tailored APIs for each of its "host" binding
    languages, including support for subclasses written entirely in the "host"
    language.

BACKGROUND
    Lucy, a "loose C" port of Java Lucene, began as an ambitious, from-scratch
    Lucene sub-project, with David Balmain (author of Ferret, a Ruby/C port of
    Lucene), Doug Cutting, and Marvin Humphrey (founder of KinoSearch, a
    Perl/C port) as committers.  During an initial burst of activity, the
    overall architecture for Lucy was sketched out by Dave and Marvin.
    Unfortunately, Dave became unavailable soon after, and without a working
    codebase to release or any users, it proved difficult to replace him.
    Still, Marvin carried on their work throughout a period of seemingly low
    activity.

    In the last year, that work has come to fruition: major technical
    milestones have been achieved and Lucy's underpinnings have been
    completed.  Additionally, other developers from the KinoSearch community
    have taken an interest in Lucy and have begun to ramp up their
    contributions.  The next steps for Lucy were articulated by the Lucene PMC
    in a recent review: make releases, acquire users, grow community.

    To implement the Lucene PMC's recommendations and get to a release as
    quickly as possible, the Lucy community proposes to assimilate the
    KinoSearch codebase, which has been retrofitted to use Lucy's core.  Lucy
    still lacks a number of important indexing and search classes; we wish to
    flesh these out via IP clearance work rather than software development.

    Since the Lucene PMC will not be responsible for Lucy much longer, it is
    more appropriate for the software grant to take place within the context
    of the Incubator than the Lucene TLP.  As none of our current members have
    Apache PMC experience, we also seek to take advantage of the Incubator
    environment to prepare ourselves for responsible self-governance.

RATIONALE
    There is great hunger for a search engine library in the mode of Lucene
    which is accessible from various dynamic languages, and for one accessible
    from pure C.  Individuals naturally wish to code in their language of
    choice.  Organizations which do not have significant Java expertise may
    not want to support Java strictly for the sake of running a Lucene
    installation.  Developers may want to take advantage of C's
    interoperability and fine-grained control.  Lucy will meet all these
    demands.

    Apache is a natural home for our project given the way it has always
    operated: user-driven innovation, security as a requirement, lively and
    amiable mailing list discussions, strength through diversity, and so on.
    We feel comfortable here, and we believe that we will become exemplary
    Apache citizens.

INITIAL GOALS
    * Make a 1.0 stable release as quickly as possible.
    * Concentrate on community expansion immediately thereafter.
    * Expose a public C API.

CURRENT STATUS
  Meritocracy
    Our initial committer list includes two individuals (Peter Karman and
    Nathan Kurz) who started off as KinoSearch users, demonstrated merit
    through constructive forum participation, adept negotiation, consensus
    building, and submission of high-quality contributions, and were invited
    to become committers.  Peter now rolls most releases.

    We look forward to continuing to operate as a meritocracy under the
    established traditions and rules of the ASF.

  Community
    Lucy's most active participants of late have been drawn from the
    KinoSearch and Lucene communities.  Having been focused on features and
    technical goals for a long time, we are considerably overdue for a stable
    release, and anticipate rapid growth in its wake.

  Core Developers
     * Marvin Humphrey is the project founder of KinoSearch, and co-founded
       the existing Lucy sub-project.  He is presently employed by Eventful,
       Inc.
     * Peter Karman has contributed to several open source projects since
       2001, including being a committer at http://swish-e.org/ (a search
       engine), http://code.google.com/p/rose/ (an ORM) and
       http://catalyst.perl.org/ (web framework).  He is employed by American
       Public Media.
     * Nathan Kurz has participated in numerous open source projects and has
       been a KinoSearch committer since 2007.  He is currently Chief Flavor
       Engineer of Scream Sorbet, and writes software in his copious free
       time.

  Alignment
    One Apache value which is particularly cherished by the Lucy community is
    codebase transparency.  We have developed institutions which enable us to
    measure and maximize usability (see http://wiki.apache.org/lucy/BrainLog),
    and we feel strongly that the bindings for Lucy must present APIs and
    documentation which are idiomatic to the host language culture so that end
    users can consume our work as easily as possible.

    The controlled competition of meritocratic community development is also
    very important to us.  There has been substantial cross-pollination of
    ideas between Lucene and Lucy, yielding considerable benefits for both
    projects.  The Lucy developers envision that our host-language
    sub-communities will approach using and extending the library in distinct
    ways; we hope to harness the creative tension between them to drive
    innovation, building productive relationships akin to the one that Lucene
    and Lucy have today.

    A third priority of ours is to be bound by existing Apache institutions,
    for the protection of all our stakeholders.

KNOWN RISKS
  Orphaned products
    All initial committers have been associated with the project for several
    years across multiple jobs.  However, at this time, the project would
    probably not survive the departure of Marvin Humphrey, so there is a risk
    of being orphaned.  Marvin has no plans to leave, but we have been
    actively working to disperse his knowledge of the code base and
    administrative responsibilities in order to make him dispensable.  Having
    staggered badly after Dave Balmain's departure, we are keenly aware of
    this vulnerability and highly motivated to eliminate it.

  Inexperience with Open Source
    The initial committers all have significant experience with open
    source development, and include one present Apache committer.  We
    recognize that we lack PMC experience and seek to address that deficiency
    by going through the Incubator.  In retrospect, Marvin wishes that Lucy
    had gone through the Incubator during its first inception.

  Homogenous Developers
    Our community is geographically dispersed, with members in San Diego,
    Oakland, and Minneapolis.  We all work for different organizations.

  Reliance on Salaried Developers
    Marvin Humphrey has a great job at Eventful working primarily on this
    project and supporting applications that use it.  Nevertheless, he is
    extremely dedicated to Lucy and is determined to see it through to the
    point where it becomes self-sustaining, regardless of work circumstances.

  Relationships with Other Apache Products
    Lucy's cordial "coopetition" with Lucene has produced
    benefits for Lucene users in terms of indexing speed, near-real-time
    search support, and more.  We expect this dynamic to continue delivering
    improvements for all parties involved.

  An Excessive Fascination with the Apache Brand
    Our desire to maintain Lucy's affiliation with Apache has less to do with
    the brand and more to do with our conviction that developing the project
    The Apache Way under Apache institutions is in Lucy's best interests.
    However, we have to acknowledge that during its time as a Lucene
    subproject, Lucy has not always fulfilled certain key requirements for an
    Apache project.  In particular, it has failed to "release early, release
    often", and it has made minimal progress in expanding its community.

    We attribute some of our difficulties to what may have been excess
    ambition in the original Lucy plan, given the scope of the project and the
    size of the initial committer list:

        [http://www.apache.org/foundation/how-it-works.html#incubator]

        The basic requirements for incubation are:
          * a working codebase -- over the years and after several
            failures, the foundation came to understand that without an
            initial working codebase, it is generally hard to bootstrap a
            community.

    By rebooting the project with a working codebase, we expect to avoid the
    trap that ensnared Lucy's first incarnation: we will release early,
    release often, accumulate users, nurture contributors, and grow our
    community.

DOCUMENTATION
    * Subversion repository: [http://www.rectangular.com/svn/kinosearch/]
    * Perl API documentation: [http://www.rectangular.com/kinosearch/docs/devel/]
    * Discussion list: [http://www.rectangular.com/mailman/listinfo/kinosearch/]

INITIAL SOURCE
    The initial source will be a snapshot from the KinoSearch subversion
    repository.

SOURCE AND INTELLECTUAL PROPERTY SUBMISSION PLAN
    KinoSearch is currently under a GPL/Artistic license.  There are five
    individuals who have made multiple significant contributions to the
    codebase and whose participation is either essential or would be very
    helpful: Marvin Humphrey, Peter Karman, Nathan Kurz, Chris Nandor, and
    Father Chrysostomos.  All have been contacted and are amenable to
    re-licensing their work and contributing it to Apache.  We will contact as
    many other contributors as possible; if there are any that we cannot
    obtain permission from, we will refactor to expunge their work.

EXTERNAL DEPENDENCIES
    The Perl bindings for KinoSearch currently depend on a few CPAN modules
    which do not have Apache-compatible licenses.  It will be possible to
    eliminate all such dependencies if necessary.

REQUIRED RESOURCES
  Mailing lists
    * lucy-dev
    * lucy-private (with moderated subscriptions)
    * lucy-commits
    * lucy-users

  Subversion Directory
    [http://svn.apache.org/repos/asf/incubator/lucy]

  Issue Tracking
    Lucy already has a JIRA tracker: Lucy (LUCY)

  Other Resources
    Lucy already has a MoinMoin wiki at wiki.apache.org/lucy.  It would be
    convenient to keep it, especially since its current location is also where
    it would end up upon TLP graduation, but we will defer to the wishes of
    the Incubator PMC if standard Incubator wiki placement is recommended.

INITIAL COMMITTERS
    1. Marvin Humphrey (marvin at rectangular dot com)
    2. Peter Karman (peter at peknet dot com)
    3. Nathan Kurz ( nate@verse.com )

SPONSORS
  Champion
    TBD

  Nominated Mentors
    TBD

  Sponsoring Entity
    Lucy is currently sponsored by Lucene as a sub-project. This proposal
    advocates changing Lucy's relationship with Apache from developing all new
    code as a Lucene sub-project, to instead assimilating existing code
    (KinoSearch) under the sponsorship of the Incubator.



Re: Draft proposal: Move Lucy to Incubator

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Jul 02, 2010 at 01:39:23PM -0700, Chris Hostetter wrote:
> : BACKGROUND
> 	...
> :     Since the Lucene PMC will not be responsible for Lucy much longer, it is
> :     more appropriate for the software grant to take place within the context
> :     of the Incubator than the Lucene TLP.  As none of our current members have
> :     Apache PMC experience, we also seek to take advantage of the Incubator
> :     environment to prepare ourselves for responsible self-governance.
> 
> I can't articulate exactly what i'm feeling here, but I feel like this 
> background section could benefit from some mention of the shift away from 
> "Umbrella projects" being an influencer towards wanting to enter the 
> incubator.

I made the following changes in response to this feedback:

    @@ -1,8 +1,7 @@
     PREFACE
         Lucy is a sub-project which is being spun off from the Lucene TLP but is
         not yet ready for graduation.  We propose to address certain needs of the
    -    project by assimilating the KinoSearch codebase, and to enter the
    -    Incubator on a top-level-project track.
    +    project by assimilating the KinoSearch codebase.
     
     ABSTRACT
         Lucy will be a loose port of the Lucene search engine library, written in
    @@ -40,11 +39,13 @@
         still lacks a number of important indexing and search classes; we wish to
         flesh these out via IP clearance work rather than software development.
     
    -    Since the Lucene PMC will not be responsible for Lucy much longer, it is
    -    more appropriate for the software grant to take place within the context
    -    of the Incubator than the Lucene TLP.  As none of our current members have
    -    Apache PMC experience, we also seek to take advantage of the Incubator
    -    environment to prepare ourselves for responsible self-governance.
    +    Since Lucy cannot remain as a sub-project of Lucene under the current ASF
    +    policy of breaking up "umbrella projects", it is not appropriate for the
    +    software grant to take place within the context of the Lucene TLP.
    +    Instead, we advocate that the software grant happen within the context of
    +    the Incubator, and that a Lucy podling and PPMC be established which will
    +    ultimately take responsibility for the codebase.
    +
     
     RATIONALE
         There is great hunger for a search engine library in the mode of Lucene
    @@ -133,8 +134,9 @@
         The initial committers have all have significant experience with open
         source development, and include one present Apache committer.  We
         recognize that we lack PMC experience and seek to address that deficiency
    -    by going through the Incubator.  In retrospect, Marvin wishes that Lucy
    -    had gone through the Incubator during its first inception.
    +    by using the Incubator environment to educate ourselves and prepare for
    +    responsible self-governance.  In retrospect, Marvin wishes that Lucy had
    +    gone through the Incubator during its first inception.
     
       Homogenous Developers
         Our community is geographically dispersed, with members in San Diego,

> : INITIAL GOALS
> :     * Make a 1.0 stable release as quickly as possible.
> :     * Concentrate on community expansion immediately thereafter.
> :     * Expose a public C API.
> 
> i would not say "immediately thereafter" ... building up the community 
> should be an independent goal, worked on concurrently with other technical 
> goals.

I like the way you've put that, and I agree that that should be our mindset.
I've stricken "immediately thereafter".

> that was the hardest thing for me to wrap my head around when Solr 
> was incubating -- in many ways i was actively trying to keep Solr a 
> "secret" until i felt like it was "ready to be unveiled" but that's not 
> what incubation is about, and it's really the antithesis of how to have 
> a successful project -- you don't get a lot of contributors all at once by 
> saying "here it is, we've got something that's stable and solid and 
> 'done', who wants to come be a part of it?" .. you get contributors slowly 
> and surely by saying "here's what we've got so far, who wants to help us 
> make this better?"

Thoughtful advice.  

Even before the Lucene PMC emphasized making releases during its review of
Lucy, there was consensus in the KinoSearch community that publishing a stable
release needed to be a high priority.  KinoSearch effectively skipped its last
major release cycle, and the version which most people see is 4-year-old
technology.

Once the two projects have merged and a stable release of Lucy is out, we
won't be "done" -- there's a ton of stuff on the TODO list that isn't going to
make it into the release.   However, we will have made things a lot easier for
users and contributors, particularly those who wish to publish extensions.  

So, I think we'd like our approach to be, "here's what we have so far, feel
free to use it as is or to help us make it better."

> : DOCUMENTATION
> :     * Subversion repository: [http://www.rectangular.com/svn/kinosearch/]
> :     * Perl API documentation: [http://www.rectangular.com/kinosearch/docs/devel/]
> :     * Discussion list: [http://www.rectangular.com/mailman/listinfo/kinosearch/]
> : 
> : INITIAL SOURCE
> :     The initial source will be a snapshot from the KinoSearch subversion
> :     repository.
> 
> ...what about the existing ASF Lucy SVN repo / mailing lists? should those 
> be mentioned here?

Good point.  Will fix.

> : INITIAL COMMITTERS
> :     1. Marvin Humphrey (marvin at rectangular dot com)
> :     2. Peter Karman (peter at peknet dot com)
> :     3. Nathan Kurz ( nate@verse.com )
> 
> ...I'm not certain, but it might make sense to mention who on that list is 
> already an apache committer (or has a CLA on file)

Good catch -- we are supposed to do that.

Thanks,

Marvin Humphrey


Re: Draft proposal: Move Lucy to Incubator

Posted by Chris Hostetter <ho...@fucit.org>.
This seems pretty solid to me, a few minor comments...

: BACKGROUND
	...
:     Since the Lucene PMC will not be responsible for Lucy much longer, it is
:     more appropriate for the software grant to take place within the context
:     of the Incubator than the Lucene TLP.  As none of our current members have
:     Apache PMC experience, we also seek to take advantage of the Incubator
:     environment to prepare ourselves for responsible self-governance.

I can't articulate exactly what i'm feeling here, but I feel like this 
background section could benefit from some mention of the shift away from 
"Umbrella projects" being an influencer towards wanting to enter the 
incubator.

: INITIAL GOALS
:     * Make a 1.0 stable release as quickly as possible.
:     * Concentrate on community expansion immediately thereafter.
:     * Expose a public C API.

i would not say "immediately thereafter" ... building up the community 
should be an independent goal, worked on concurrently with other technical 
goals.

that was the hardest thing for me to wrap my head around when Solr 
was incubating -- in many ways i was actively trying to keep Solr a 
"secret" until i felt like it was "ready to be unveiled" but that's not 
what incubation is about, and it's really the antithesis of how to have 
a successful project -- you don't get a lot of contributors all at once by 
saying "here it is, we've got something that's stable and solid and 
'done', who wants to come be a part of it?" .. you get contributors slowly 
and surely by saying "here's what we've got so far, who wants to help us 
make this better?"

: DOCUMENTATION
:     * Subversion repository: [http://www.rectangular.com/svn/kinosearch/]
:     * Perl API documentation: [http://www.rectangular.com/kinosearch/docs/devel/]
:     * Discussion list: [http://www.rectangular.com/mailman/listinfo/kinosearch/]
: 
: INITIAL SOURCE
:     The initial source will be a snapshot from the KinoSearch subversion
:     repository.

...what about the existing ASF Lucy SVN repo / mailing lists? should those 
be mentioned here?

: INITIAL COMMITTERS
:     1. Marvin Humphrey (marvin at rectangular dot com)
:     2. Peter Karman (peter at peknet dot com)
:     3. Nathan Kurz ( nate@verse.com )

...I'm not certain, but it might make sense to mention who on that list is 
already an apache committer (or has a CLA on file)



-Hoss


Re: [Lucene] Re: Draft proposal: Move Lucy to Incubator

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 07/06/2010 06:51 PM:
> On Sun, Jul 04, 2010 at 11:44:48AM -0700, Marvin Humphrey wrote:
>> Our chief challenge is now identifying a Champion and three Mentors.
> 
> I'm pleased to relay that Chris Hostetter has agreed to take on the role of
> Champion and Chris Mattmann has agreed to be a Mentor.
> 
> Here are the changes that the draft proposal has undergone since it was posted
> here on Friday:
> 
>   http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=43&rev1=35
> 

+1 for those changes.

Thanks to Chris H. and Chris M. for stepping forward.

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: Draft proposal: Move Lucy to Incubator

Posted by Chris Hostetter <ho...@fucit.org>.
: This discussion has been open since July 2nd, so i think we've probably 
: already gotten comments from most of the people who are going to post 
: them, but i'll give it another 24 hours just in case anyone has any 
: followups on the latest edits to the draft.

The proposal has now been submitted to the incubator, if you have any 
comments please follow up on the new general@incubator.a.o thread...

   http://markmail.org/thread/rqnrwicgqsjqmnfb


-Hoss


Re: Draft proposal: Move Lucy to Incubator

Posted by Chris Hostetter <ho...@fucit.org>.
: I'm fine with all those mods, and I've gone and applied them for you.
: 
: http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=45&rev1=43

Looks good to me.

Unlike my Lucy wiki account, my incubator wiki account *is* working, so if 
there are no other comments/suggestions I can go ahead and do the wiki 
cloning/linking dance and send the proposal to general@incubator.

This discussion has been open since July 2nd, so i think we've probably 
already gotten comments from most of the people who are going to post 
them, but i'll give it another 24 hours just in case anyone has any 
followups on the latest edits to the draft.



-Hoss


Re: Draft proposal: Move Lucy to Incubator

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Jul 07, 2010 at 02:06:09PM -0700, Chris Hostetter wrote:
> (Note: i would have made all these edits to the lucy wiki myself, but for 
> some reason the account i just created won't let me login, and the 
> password reset email still hasn't arrived)

I'm fine with all those mods, and I've gone and applied them for you.

http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=45&rev1=43

Marvin Humphrey


Re: Draft proposal: Move Lucy to Incubator

Posted by Chris Hostetter <ho...@fucit.org>.
: Here are the changes that the draft proposal has undergone since it was posted
: here on Friday:
: 
:   http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=43&rev1=35

I'm still a little confused about a few things...
--
1) In the preface, I think it needs to be clear that there is more going 
on than just the assimilation of KinoSearch, there is also the migration 
of Lucy from Lucene->Incubator ... something that is atypical, and should 
be mentioned right up front...

Preface

Lucy is a sub-project which is being spun off from the Lucene TLP but is 
not yet ready for graduation.  We propose to address certain needs of the 
project by transitioning to an Incubator Podling, and assimilating the 
KinoSearch codebase. 

--
2) "Documentation" ... shouldn't the existing lucy svn, docs, wiki, and 
mailing list be listed here?  Ditto for "Initial Source" ... "The initial 
source will be a snapshot from the KinoSearch subversion repository. " 
only describes half the picture -- there is already stuff in the Lucy 
repository as well.

(these existing ASF resources are mentioned later on in the proposal in 
"Required Resources", but for people who know nothing about Lucy, reading 
the "Current Status" section won't really give them the full picture)

--
3) As far as this para in the Background section...

>> Since Lucy cannot remain as a sub-project of Lucene under the current 
>> ASF policy of breaking up "umbrella projects", it is not appropriate 
>> for the software grant to take place within the context of the Lucene 
>> TLP. Instead, we advocate that the software grant happen within the 
>> context of the Incubator, and that a Lucy podling and PPMC be 
>> established which will ultimately take responsibility for the codebase.

   ...i think a better way to phrase it may be...

Because Lucene is working to move away from being an "umbrella project", a 
long-term goal of the Lucy project is to graduate to an ASF TLP.  With 
that in mind, it seems more appropriate for the KinoSearch software grant 
to take place within the context of the Incubator, and that a Lucy podling 
and PPMC be established which will ultimately take responsibility for the 
codebase.

(Note: i would have made all these edits to the lucy wiki myself, but for 
some reason the account i just created won't let me login, and the 
password reset email still hasn't arrived)


-Hoss


Re: Draft proposal: Move Lucy to Incubator

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks, Marvin.

> 
> I'm pleased to relay that Chris Hostetter has agreed to take on the role of
> Champion and Chris Mattmann has agreed to be a Mentor.
> 
> Here are the changes that the draft proposal has undergone since it was posted
> here on Friday:
> 
>   
> http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=43&rev1=35
> 
> It looks like the next step is to copy this proposal to
> <http://wiki.apache.org/incubator/LucyProposal> and add a link from the
> <http://wiki.apache.org/incubator/> front page.  Then we submit a snapshot to
> general@incubator under the subject heading "[PROPOSAL] Move Lucy to
> Incubator".

+1. Chris H. should send it to the Incubator too as the Champion. After a
bit of discussion, we can then call a VOTE thread.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



Re: Draft proposal: Move Lucy to Incubator

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sun, Jul 04, 2010 at 11:44:48AM -0700, Marvin Humphrey wrote:
> Our chief challenge is now identifying a Champion and three Mentors.

I'm pleased to relay that Chris Hostetter has agreed to take on the role of
Champion and Chris Mattmann has agreed to be a Mentor.

Here are the changes that the draft proposal has undergone since it was posted
here on Friday:

  http://wiki.apache.org/lucy/LucyIncubatorProposal?action=diff&rev2=43&rev1=35

It looks like the next step is to copy this proposal to
<http://wiki.apache.org/incubator/LucyProposal> and add a link from the
<http://wiki.apache.org/incubator/> front page.  Then we submit a snapshot to
general@incubator under the subject heading "[PROPOSAL] Move Lucy to
Incubator".

Marvin Humphrey


Re: Draft proposal: Move Lucy to Incubator

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Marvin,

Yep I think you're in good shape. +1 to submit to the Incubator. My comment was mainly, once the proposal arrives over there, I expect Incubator folks to jump in and try and help out either via mentorship, or as a committer. I'm already mentoring 2 podlings right now, else I'd jump in, but I don't really have many spare cycles. I'll snoop around on the lists though and if I get time later on, perhaps I can jump in and help.

Best of luck!

Cheers,
Chris



On 7/4/10 11:44 AM, "Marvin Humphrey" <ma...@rectangular.com> wrote:

On Fri, Jul 02, 2010 at 10:29:45AM -0700, Mattmann, Chris A (388J) wrote:
> +1, Marvin. My only comment is that you can hopefully add some additional
> committers when you move into the Incubator to really kick start the
> project.

Thanks, Chris.

Another possibility for us is to delay entry into the Incubator until we have
accumulated a larger number of committers.

To achieve this, we would make our stable release ASAP under the current
KinoSearch namespace and license.  We expect to generate significant interest
within the Perl community with this release, attracting many new users.  If
the past is prologue, these users will soon start contributing, and we'll be
able to identify and develop some candidates who subsequently become
committers.  Then we would be able to put our proposal to the Incubator PMC
with more names on the list.

However, that's not my preferred course of action for a number of reasons.

First, I believe that it's in the best interests of the project to launch the
modern codebase under the Lucy brand.  Making a release under the KinoSearch
brand and then switching to Lucy shortly thereafter will be confusing and
inconvenient for our users.

Second, the more contributors we have, the more complicated the relicensing
and software grant becomes.  We have all the necessary past contributors on
board now.  In a worst case scenario, one or more of those individuals becomes
unable to participate while we delay.

Third, accumulating worthy committers will take time, and it's not clear how
much.

Fourth, I think it would be good for the project if the release happens while
Lucy is in the Incubator under the guidance of Mentors.  We have received some
thoughtful advice on this general@ list lately.  Having Mentors available to
provide us with ongoing feedback during this phase of expansion will help us
to run as efficiently and effectively as possible, establishing healthy
cultural norms within the community that persist for years to come.

Our chief challenge is now identifying a Champion and three Mentors.  If we
can do that and pass an Incubator PMC vote, I believe that Lucy's future is
extremely bright.

We have solved the technical problems that Dave Balmain and I set out to
solve.  We have a product that delivers great features like mmap-powered
near-real-time search, and offers mechanisms for extension which will entice
users to become contributors.  We have the institutions, infrastructure, and
accumulated wisdom of Apache pushing us forward, particularly the traditions
of the Lucene community where Lucy was born.  And we have a collectively
devised proposal which incorporates lessons learned from past mistakes, and
which we are all very excited about.

Marvin Humphrey






Re: Draft proposal: Move Lucy to Incubator

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Jul 02, 2010 at 10:29:45AM -0700, Mattmann, Chris A (388J) wrote:
> +1, Marvin. My only comment is that you can hopefully add some additional
> committers when you move into the Incubator to really kick start the
> project.

Thanks, Chris.

Another possibility for us is to delay entry into the Incubator until we have
accumulated a larger number of committers.  

To achieve this, we would make our stable release ASAP under the current
KinoSearch namespace and license.  We expect to generate significant interest
within the Perl community with this release, attracting many new users.  If
the past is prologue, these users will soon start contributing, and we'll be
able to identify and develop some candidates who subsequently become
committers.  Then we would be able to put our proposal to the Incubator PMC
with more names on the list.

However, that's not my preferred course of action for a number of reasons.  

First, I believe that it's in the best interests of the project to launch the
modern codebase under the Lucy brand.  Making a release under the KinoSearch
brand and then switching to Lucy shortly thereafter will be confusing and
inconvenient for our users.

Second, the more contributors we have, the more complicated the relicensing
and software grant becomes.  We have all the necessary past contributors on
board now.  In a worst case scenario, one or more of those individuals becomes
unable to participate while we delay.

Third, accumulating worthy committers will take time, and it's not clear how
much.

Fourth, I think it would be good for the project if the release happens while
Lucy is in the Incubator under the guidance of Mentors.  We have received some
thoughtful advice on this general@ list lately.  Having Mentors available to
provide us with ongoing feedback during this phase of expansion will help us
to run as efficiently and effectively as possible, establishing healthy
cultural norms within the community that persist for years to come.

Our chief challenge is now identifying a Champion and three Mentors.  If we
can do that and pass an Incubator PMC vote, I believe that Lucy's future is
extremely bright.

We have solved the technical problems that Dave Balmain and I set out to
solve.  We have a product that delivers great features like mmap-powered
near-real-time search, and offers mechanisms for extension which will entice
users to become contributors.  We have the institutions, infrastructure, and
accumulated wisdom of Apache pushing us forward, particularly the traditions
of the Lucene community where Lucy was born.  And we have a collectively
devised proposal which incorporates lessons learned from past mistakes, and
which we are all very excited about.

Marvin Humphrey


Re: Draft proposal: Move Lucy to Incubator

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1, Marvin. My only comment is that you can hopefully add some additional
committers when you move into the Incubator to really kick start the
project.

Cheers,
Chris



On 7/2/10 10:15 AM, "Marvin Humphrey" <ma...@rectangular.com> wrote:



PREFACE
    Lucy is a sub-project which is being spun off from the Lucene TLP but is
    not yet ready for graduation.  We propose to address certain needs of the
    project by assimilating the KinoSearch codebase, and to enter the
    Incubator on a top-level-project track.

ABSTRACT
    Lucy will be a loose port of the Lucene search engine library, written in
    C and targeted at dynamic language users.

PROPOSAL
    Lucy has two aims.  First, it will be a high-performance C search engine
    library.  Second, it will maximize its usability and power when accessed
    via dynamic language bindings.  To that end, it will present highly
    idiomatic, carefully tailored APIs for each of its "host" binding
    languages, including support for subclasses written entirely in the "host"
    language.

BACKGROUND
    Lucy, a "loose C" port of Java Lucene, began as an ambitious, from-scratch
    Lucene sub-project, with David Balmain (author of Ferret, a Ruby/C port of
    Lucene), Doug Cutting, and Marvin Humphrey (founder of KinoSearch, a
    Perl/C port) as committers.  During an initial burst of activity, the
    overall architecture for Lucy was sketched out by Dave and Marvin.
    Unfortunately, Dave became unavailable soon after, and without a working
    codebase to release or any users, it proved difficult to replace him.
    Still, Marvin carried on their work throughout a period of seemingly low
    activity.

    In the last year, that work has come to fruition: major technical
    milestones have been achieved and Lucy's underpinnings have been
    completed.  Additionally, other developers from the KinoSearch community
    have taken an interest in Lucy and have begun to ramp up their
    contributions.  The next steps for Lucy were articulated by the Lucene PMC
    in a recent review: make releases, acquire users, grow community.

    To implement the Lucene PMC's recommendations and get to a release as
    quickly as possible, the Lucy community proposes to assimilate the
    KinoSearch codebase, which has been retrofitted to use Lucy's core.  Lucy
    still lacks a number of important indexing and search classes; we wish to
    flesh these out via IP clearance work rather than software development.

    Since the Lucene PMC will not be responsible for Lucy much longer, it is
    more appropriate for the software grant to take place within the context
    of the Incubator than the Lucene TLP.  As none of our current members have
    Apache PMC experience, we also seek to take advantage of the Incubator
    environment to prepare ourselves for responsible self-governance.

RATIONALE
    There is great hunger for a search engine library in the mode of Lucene
    which is accessible from various dynamic languages, and for one accessible
    from pure C.  Individuals naturally wish to code in their language of
    choice.  Organizations which do not have significant Java expertise may
    not want to support Java strictly for the sake of running a Lucene
    installation.  Developers may want to take advantage of C's
    interoperability and fine-grained control.  Lucy will meet all these
    demands.

    Apache is a natural home for our project given the way it has always
    operated: user-driven innovation, security as a requirement, lively and
    amiable mailing list discussions, strength through diversity, and so on.
    We feel comfortable here, and we believe that we will become exemplary
    Apache citizens.

INITIAL GOALS
    * Make a 1.0 stable release as quickly as possible.
    * Concentrate on community expansion immediately thereafter.
    * Expose a public C API.

CURRENT STATUS
  Meritocracy
    Our initial committer list includes two individuals (Peter Karman and
    Nathan Kurz) who started off as KinoSearch users, demonstrated merit
    through constructive forum participation, adept negotiation, consensus
    building, and submission of high-quality contributions, and were invited
    to become committers.  Peter now rolls most releases.

    We look forward to continuing to operate as a meritocracy under the
    established traditions and rules of the ASF.

  Community
    Lucy's most active participants of late have been drawn from the
    KinoSearch and Lucene communities.  Having been focused on features and
    technical goals for a long time, we are considerably overdue for a stable
    release, and anticipate rapid growth in its wake.

  Core Developers
     * Marvin Humphrey is the project founder of KinoSearch, and co-founded
       the existing Lucy sub-project.  He is presently employed by Eventful,
       Inc.
     * Peter Karman has contributed to several open source projects since
       2001, including being a committer at http://swish-e.org/ (a search
       engine), http://code.google.com/p/rose/ (an ORM) and
       http://catalyst.perl.org/ (web framework).  He is employed by American
       Public Media.
     * Nathan Kurz has participated in numerous open source projects and has
       been a KinoSearch committer since 2007.  He is currently Chief Flavor
       Engineer of Scream Sorbet, and writes software in his copious free
       time.

  Alignment
    One Apache value which is particularly cherished by the Lucy community is
    codebase transparency.  We have developed institutions which enable us to
    measure and maximize usability (see http://wiki.apache.org/lucy/BrainLog),
    and we feel strongly that the bindings for Lucy must present APIs and
    documentation which are idiomatic to the host language culture so that end
    users can consume our work as easily as possible.

    The controlled competition of meritocratic community development is also
    very important to us.  There has been substantial cross-pollination of
    ideas between Lucene and Lucy, yielding considerable benefits for both
    projects.  The Lucy developers envision that our host-language
    sub-communities will approach using and extending the library in distinct
    ways; we hope to harness the creative tension between them to drive
    innovation, building productive relationships akin to the one that Lucene
    and Lucy have today.

    A third priority of ours is to be bound by existing Apache institutions,
    for the protection of all our stakeholders.

KNOWN RISKS
  Orphaned products
    All initial committers have been associated with the project for several
    years across multiple jobs.  However, at this time, the project would
    probably not survive the departure of Marvin Humphrey, so there is a risk
    of being orphaned.  Marvin has no plans to leave, but we have been
    actively working to disperse his knowledge of the code base and
    administrative responsibilities in order to make him dispensable.  Having
    staggered badly after Dave Balmain's departure, we are keenly aware of
    this vulnerability and highly motivated to eliminate it.

  Inexperience with Open Source
    The initial committers all have significant experience with open
    source development, and include one present Apache committer.  We
    recognize that we lack PMC experience and seek to address that deficiency
    by going through the Incubator.  In retrospect, Marvin wishes that Lucy
    had gone through the Incubator during its first inception.

  Homogenous Developers
    Our community is geographically dispersed, with members in San Diego,
    Oakland, and Minneapolis.  We all work for different organizations.

  Reliance on Salaried Developers
    Marvin Humphrey has a great job at Eventful working primarily on this
    project and supporting applications that use it.  Nevertheless, he is
    extremely dedicated to Lucy and is determined to see it through to the
    point where it becomes self-sustaining, regardless of work circumstances.

  Relationships with Other Apache Products
    Lucy's cordial "coopetition" with Lucene has produced benefits for
    Lucene users in terms of indexing speed, near-real-time search support,
    and more.  We expect this dynamic to continue delivering improvements for
    all parties involved.

  An Excessive Fascination with the Apache Brand
    Our desire to maintain Lucy's affiliation with Apache has less to do with
    the brand and more to do with our conviction that developing the project
    The Apache Way under Apache institutions is in Lucy's best interests.
    However, we have to acknowledge that during its time as a Lucene
    subproject, Lucy has not always fulfilled certain key requirements for an
    Apache project.  In particular, it has failed to "release early, release
    often", and it has made minimal progress in expanding its community.

    We attribute some of our difficulties to what may have been excess
    ambition in the original Lucy plan, given the scope of the project and the
    size of the initial committer list:

        [http://www.apache.org/foundation/how-it-works.html#incubator]

        The basic requirements for incubation are:
          * a working codebase -- over the years and after several
            failures, the foundation came to understand that without an
            initial working codebase, it is generally hard to bootstrap a
            community.

    By rebooting the project with a working codebase, we expect to avoid the
    trap that ensnared Lucy's first incarnation: we will release early,
    release often, accumulate users, nurture contributors, and grow our
    community.

DOCUMENTATION
    * Subversion repository: [http://www.rectangular.com/svn/kinosearch/]
    * Perl API documentation: [http://www.rectangular.com/kinosearch/docs/devel/]
    * Discussion list: [http://www.rectangular.com/mailman/listinfo/kinosearch/]

INITIAL SOURCE
    The initial source will be a snapshot from the KinoSearch subversion
    repository.

SOURCE AND INTELLECTUAL PROPERTY SUBMISSION PLAN
    KinoSearch is currently under a GPL/Artistic license.  There are five
    individuals who have made multiple significant contributions to the
    codebase and whose participation is either essential or would be very
    helpful: Marvin Humphrey, Peter Karman, Nathan Kurz, Chris Nandor, and
    Father Chrysostomos.  All have been contacted and are amenable to
    re-licensing their work and contributing it to Apache.  We will contact as
    many other contributors as possible; if there are any that we cannot
    obtain permission from, we will refactor to expunge their work.

EXTERNAL DEPENDENCIES
    The Perl bindings for KinoSearch currently depend on a few CPAN modules
    which do not have Apache-compatible licenses.  It will be possible to
    eliminate all such dependencies if necessary.

REQUIRED RESOURCES
  Mailing lists
    * lucy-dev
    * lucy-private (with moderated subscriptions)
    * lucy-commits
    * lucy-users

  Subversion Directory
    [http://svn.apache.org/repos/asf/incubator/lucy]

  Issue Tracking
    Lucy already has a JIRA tracker: Lucy (LUCY)

  Other Resources
    Lucy already has a MoinMoin wiki at wiki.apache.org/lucy.  It would be
    convenient to keep it, especially since its current location is also where
    it would end up upon TLP graduation, but we will defer to the wishes of
    the Incubator PMC if standard Incubator wiki placement is recommended.

INITIAL COMMITTERS
    1. Marvin Humphrey (marvin at rectangular dot com)
    2. Peter Karman (peter at peknet dot com)
    3. Nathan Kurz ( nate@verse.com )

SPONSORS
  Champion
    TBD

  Nominated Mentors
    TBD

  Sponsoring Entity
    Lucy is currently sponsored by Lucene as a sub-project. This proposal
    advocates changing Lucy's relationship with Apache from developing all new
    code as a Lucene sub-project, to instead assimilating existing code
    (KinoSearch) under the sponsorship of the Incubator.




