Posted to general@lucene.apache.org by Michael Busch <bu...@gmail.com> on 2010/02/24 17:40:48 UTC

Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?

+1! I think that's the way to go. It's also confusing currently that  
some analysers are in Lucene's core jar, and that there is an  
additional contrib analysis jar. Your proposal would solve this  
problem too.

  Michael

On Feb 24, 2010, at 8:32 AM, Michael McCandless <lucene@mikemccandless.com> wrote:

> I think, in order to stop duplicating our analysis code across
> Nutch/Solr/Lucene, we should separate out the analyzers into a
> standalone package, maybe as its own sub-project under the Lucene
> TLP?
>
> The goal would be eventually to have a single source for all our
> analysis needs, and for all Lucene projects to eventually cut over to
> this source (deprecating their current analysis code).
>
> We could also at this time fix some of the known problems in the
> analysis APIs, e.g. that the Analyzer base class confusingly exposes
> both non-reuse and reuse APIs, that not all Analyzers are final, etc.
>
> What do people think...?
>
> Mike

Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?

Posted by Simon Willnauer <si...@googlemail.com>.
Mike, thanks for moving this out of the JIRA issue. For completeness,
here is the link to the issue where this thread started:
https://issues.apache.org/jira/browse/LUCENE-2279

I also think we need a solution for this problem, but it does not seem
to be that easy. Would moving the analysis code be compatible with the
Lucene core having no dependencies? It's not that I don't favor that
solution; I really think we should move all of that out, but I'm not
sure where it should live.

My first impression would be a Lucene contrib module, but that would
raise other issues, e.g. all Solr committers would then need access to
that contrib. A new project would surely make sense, but it is also
quite a lot of overhead, isn't it?

simon
On Wed, Feb 24, 2010 at 5:40 PM, Michael Busch <bu...@gmail.com> wrote:
> +1! I think that's the way to go. It's also confusing currently that some
> analysers are in Lucene's core jar, and that there is an additional contrib
> analysis jar. Your proposal would solve this problem too.
>
>  Michael
>
> On Feb 24, 2010, at 8:32 AM, Michael McCandless <lu...@mikemccandless.com>
> wrote:
>
>> I think, in order to stop duplicating our analysis code across
>> Nutch/Solr/Lucene, we should separate out the analyzers into a
>> standalone package, and maybe as its own sub-project under the Lucene
>> tlp?
>>
>> The goal would be eventually to have a single source for all our
>> analysis needs, and for all Lucene projects to eventually cutover to
>> this source (deprecating their current analysis code).
>>
>> We could also at this time fix some of the known problems in the
>> analysis APIs, eg that the Analyzer base class confusingly exposes
>> both non-reuse and reuse APIs, that not all Analyzers are final, etc.
>>
>> What do people think...?
>>
>> Mike
>

RE: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?

Posted by Steven A Rowe <sa...@syr.edu>.
+1.  We can call the project LuAnn :) - Steve

On 02/24/2010 at 11:40 AM, Michael Busch wrote:
> +1! I think that's the way to go. It's also confusing currently that
> some analysers are in Lucene's core jar, and that there is an
> additional contrib analysis jar. Your proposal would solve this
> problem too.
> 
>   Michael
> 
> On Feb 24, 2010, at 8:32 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
> > 
> > I think, in order to stop duplicating our analysis code across
> > Nutch/Solr/Lucene, we should separate out the analyzers into a
> > standalone package, and maybe as its own sub-project under the Lucene
> > tlp?
> > 
> > The goal would be eventually to have a single source for all our
> > analysis needs, and for all Lucene projects to eventually cutover to
> > this source (deprecating their current analysis code).
> > 
> > We could also at this time fix some of the known problems in the
> > analysis APIs, eg that the Analyzer base class confusingly exposes
> > both non-reuse and reuse APIs, that not all Analyzers are final, etc.
> > 
> > What do people think...?
> > 
> > Mike