Posted to dev@lucene.apache.org by Doug Cutting <cu...@apache.org> on 2006/02/21 17:47:52 UTC

Lucene 1.9 RC1 release available

Release 1.9 RC1 of Lucene is now available from:

http://www.apache.org/dyn/closer.cgi/lucene/java/

This release candidate has many improvements since release 1.4.3, 
including new features, performance improvements, bug fixes, etc.  For 
details, see:

http://svn.apache.org/viewcvs.cgi/*checkout*/lucene/java/branches/lucene_1_9/CHANGES.txt?rev=379190

1.9 will be the last 1.x release. It is both back-compatible with 1.4.3 
and forward-compatible with the upcoming 2.0 release. Many methods and 
classes in 1.4.3 have been deprecated in 1.9 and will be removed in 2.0. 
Applications must compile against 1.9 without deprecation warnings 
before they are compatible with 2.0.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Lucene 1.9 RC1 release available

Posted by Yonik Seeley <ys...@gmail.com>.

Terry,

I think most of the examples you provide are normally handled via stemming.
Using wildcarding for stemming will normally be less accurate.

The current behavior is also consistent with the way file globbing works.
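
For comparison with the globbing remark, Python's standard fnmatch module follows shell-style rules, where '?' consumes exactly one character. The sketch below only illustrates glob semantics and is not Lucene code; the term list comes from examples used in this thread, and glob_matches is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Terms borrowed from the 'riot' example discussed in this thread.
TERMS = ["riot", "riots", "rioting", "rioters"]


def glob_matches(pattern, candidates):
    """Return the candidates that match a shell-style glob pattern."""
    return [t for t in candidates if fnmatchcase(t, pattern)]


# '?' must consume exactly one character, so only 7-letter terms match.
print(glob_matches("riot???", TERMS))  # ['rioting', 'rioters']
# '*' matches any run of characters, including none.
print(glob_matches("riot*", TERMS))    # ['riot', 'riots', 'rioting', 'rioters']
```

Under these rules a pattern with three trailing '?' never matches the bare word 'riot', which is the behavior people coming from a shell would expect.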

-Yonik


On 2/21/06, Terry Steichen <te...@net-frame.com> wrote:
> Yonik,
>
> No, I don't think that the riot* option would work for many queries.
> Let's take a simple case where you want a singular or plural form, like
> either cat or cats (which would be very common).  With 1.4.x, you can
> use cat? to retrieve such matches.  With the new change, you need to use
> (cat cats) or (cat cat?).  If you use cat*, you'll get a million matches
> you don't want (cater, catches, catwoman, category, catatonic,
> cataclysm, catamount, etc.).  Or, take a case where you want to retrieve
> terms like elder, elderly, elders but do not want things like
> elderberry, elderdice.  Or you want gun or guns, but not gunmen,
> gunshots, gunfire, gunpoint, gunston, etc.
>
> In contrast, as you appear to agree, it would actually be a fairly rare
> case where you really need a specific number of characters in the term.
>
> So, I would opt to either leave the behavior as it was in 1.4.x or
> provide a flag (defaulting either way).
>
> Terry
>
> Yonik Seeley wrote:
>
> >On 2/21/06, Terry Steichen <te...@net-frame.com> wrote:
> >
> >
> >>For example, let's say that I'm interested in docs with terms 'riot',
> >>'riots', 'rioting' and 'rioters' (which, I think, is a reasonable kind
> >>of query).  Under the previous versions of QueryParser, I could simply
> >>specify 'riot???' and capture all of those variants.
> >>
> >>
> >
> >Wouldn't the prefix query riot* fit the bill?
> >I would think that wanting 1,2, or 3 additional characters, but no
> >more would be a fairly rare case, yes?  And there might also be a rare
> >case where you want exactly 3 additional characters... the new change
> >makes both possible.
> >
> >-Yonik
> >

Re: Lucene 1.9 RC1 release available

Posted by Terry Steichen <te...@net-frame.com>.

Marvin,

While a stemming analyzer can work well for general purpose queries, if 
you're seeking a decent level of precision/recall, stemming often 
severely limits you.  Moreover, unless the user is very familiar with 
the behavior of the stemmer used, some of the returned results can be 
quite surprising.  While the logic of stemmers can, as you suggest, 
eliminate some false positives, it will at the same time introduce new 
ones, and false negatives as well.

I think the key is that, even if you have imprecise query demands that 
can be met by stemming, why limit Lucene's capability to achieve high 
levels of precision?  Especially when the alternative (in terms of the 
cat? behavior) provides a capability (matching a specific number of 
characters) that very few applications apparently need?

Terry

Marvin Humphrey wrote:

> Terry,
>
> Is there a reason you wouldn't use a stemming analyzer of some kind,  
> which would match cat and cats but not cater, catches, etc?
>
> http://snowball.tartarus.org/demo.php
>
> Marvin Humphrey
> Rectangular Research
> http://www.rectangular.com/
>
> On Feb 21, 2006, at 3:13 PM, Terry Steichen wrote:
>
>> No, I don't think that the riot* option would work for many  
>> queries.  Let's take a simple case where you want a singular or  
>> plural form, like either cat or cats (which would be very common).   
>> With 1.4.x, you can use cat? to retrieve such matches.  With the  new 
>> change, you need to use (cat cats) or (cat cat?).  If you use  cat*, 
>> you'll get a million matches you don't want (cater, catches,  
>> catwoman, category, catatonic, cataclysm, catamount, etc.).  Or,  
>> take a case where you want to retrieve terms like elder, elderly,  
>> elders but do not want things like elderberry, elderdice.  Or you  
>> want gun or guns, but not gunmen, gunshots, gunfire, gunpoint,  
>> gunston, etc.
>
>
>

Re: Lucene 1.9 RC1 release available

Posted by Marvin Humphrey <ma...@rectangular.com>.

Terry,

Is there a reason you wouldn't use a stemming analyzer of some kind,  
which would match cat and cats but not cater, catches, etc?

http://snowball.tartarus.org/demo.php
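
To make the stemming idea concrete, here is a deliberately naive sketch. It is an assumption-laden toy, not Snowball and not a Lucene analyzer: toy_stem only strips one trailing 's', and both function names are hypothetical.

```python
def toy_stem(term):
    """Naive plural stripper; a stand-in for a real stemmer, not Snowball."""
    if term.endswith("s") and not term.endswith("ss") and len(term) > 3:
        return term[:-1]
    return term


def stem_matches(query_term, indexed_terms):
    """Return indexed terms whose stem equals the stem of the query term."""
    target = toy_stem(query_term)
    return [t for t in indexed_terms if toy_stem(t) == target]


vocabulary = ["cat", "cats", "cater", "catches", "category"]
print(stem_matches("cat", vocabulary))  # ['cat', 'cats']
```

The same toy rule also conflates 'new' and 'news', which hints at how a stemmer's rules can return matches a user did not ask for.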

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

On Feb 21, 2006, at 3:13 PM, Terry Steichen wrote:

> No, I don't think that the riot* option would work for many  
> queries.  Let's take a simple case where you want a singular or  
> plural form, like either cat or cats (which would be very common).   
> With 1.4.x, you can use cat? to retrieve such matches.  With the  
> new change, you need to use (cat cats) or (cat cat?).  If you use  
> cat*, you'll get a million matches you don't want (cater, catches,  
> catwoman, category, catatonic, cataclysm, catamount, etc.).  Or,  
> take a case where you want to retrieve terms like elder, elderly,  
> elders but do not want things like elderberry, elderdice.  Or you  
> want gun or guns, but not gunmen, gunshots, gunfire, gunpoint,  
> gunston, etc.



Re: Lucene 1.9 RC1 release available

Posted by Terry Steichen <te...@net-frame.com>.

Yonik,

No, I don't think that the riot* option would work for many queries.  
Let's take a simple case where you want a singular or plural form, like 
either cat or cats (which would be very common).  With 1.4.x, you can 
use cat? to retrieve such matches.  With the new change, you need to use 
(cat cats) or (cat cat?).  If you use cat*, you'll get a million matches 
you don't want (cater, catches, catwoman, category, catatonic, 
cataclysm, catamount, etc.).  Or, take a case where you want to retrieve 
terms like elder, elderly, elders but do not want things like 
elderberry, elderdice.  Or you want gun or guns, but not gunmen, 
gunshots, gunfire, gunpoint, gunston, etc.
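
The precision trade-off in the paragraph above can be sketched against a small vocabulary. This uses Python's fnmatch with strict, glob-style '?' and is not Lucene's QueryParser; the vocabulary and the expand helper are made up for the illustration.

```python
from fnmatch import fnmatchcase

# A tiny stand-in for the terms in an index.
VOCABULARY = ["cat", "cats", "cater", "catches", "category", "catwoman"]


def expand(patterns, vocabulary):
    """Terms matched by ANY of the patterns, i.e. an OR of wildcard clauses."""
    return [t for t in vocabulary if any(fnmatchcase(t, p) for p in patterns)]


print(expand(["cat*"], VOCABULARY))         # every term in VOCABULARY
print(expand(["cat", "cat?"], VOCABULARY))  # ['cat', 'cats']
print(expand(["cat?"], VOCABULARY))         # ['cats']
```

With strict '?', the single pattern cat? no longer covers the singular form, so the OR of two clauses is needed to get both forms without the extra prefix matches.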

In contrast, as you appear to agree, it would actually be a fairly rare 
case where you really need a specific number of characters in the term.

So, I would opt to either leave the behavior as it was in 1.4.x or 
provide a flag (defaulting either way).

Terry

Yonik Seeley wrote:

>On 2/21/06, Terry Steichen <te...@net-frame.com> wrote:
>  
>
>>For example, let's say that I'm interested in docs with terms 'riot',
>>'riots', 'rioting' and 'rioters' (which, I think, is a reasonable kind
>>of query).  Under the previous versions of QueryParser, I could simply
>>specify 'riot???' and capture all of those variants.
>>    
>>
>
>Wouldn't the prefix query riot* fit the bill?
>I would think that wanting 1,2, or 3 additional characters, but no
>more would be a fairly rare case, yes?  And there might also be a rare
>case where you want exactly 3 additional characters... the new change
>makes both possible.
>
>-Yonik
>

Re: Lucene 1.9 RC1 release available

Posted by Yonik Seeley <ys...@gmail.com>.

On 2/21/06, Terry Steichen <te...@net-frame.com> wrote:
> For example, let's say that I'm interested in docs with terms 'riot',
> 'riots', 'rioting' and 'rioters' (which, I think, is a reasonable kind
> of query).  Under the previous versions of QueryParser, I could simply
> specify 'riot???' and capture all of those variants.

Wouldn't the prefix query riot* fit the bill?
I would think that wanting 1,2, or 3 additional characters, but no
more would be a fairly rare case, yes?  And there might also be a rare
case where you want exactly 3 additional characters... the new change
makes both possible.

-Yonik


Re: Lucene 1.9 RC1 release available

Posted by Doug Cutting <cu...@apache.org>.

Arguing about this won't change the code.  A well-constructed patch 
might (but there are no guarantees).

To me, this sounds like an uphill battle.  If we want to add a feature 
to wildcard 0-N characters at the end of a word, then I don't think we'd 
use '?' plus a flag.  Rather I think it would be better to be explicit 
about it, e.g., "foo?=3" or somesuch.  Such a patch would stand a 
greater chance of being accepted.

Doug

Terry Steichen wrote:
> 1) Having a simple way to match singular and plural forms of a term with 
> a single wildcard expression is quite useful.
> 2) The trailing '?' behavior has been present since that wildcard was 
> first introduced.  Why not provide a flag to allow the original behavior 
> to optionally be preserved?
> 3) The fact that virtually no one objected to the original behavior 
> suggests that few if any were confused by it.
> 
> Chris Hostetter wrote:
> 
> >> : In either case, what I'm arguing is that the current behavior makes more
> >> : sense in the real world of query expressions (that is, makes the most
> >> : common query expressions simpler), so why not continue it?
> >>
> >> I disagree with that statement.  People familiar with shell globbing are
>> going to be confused if "riot??????????????????????" matches "riot" and
>> "riotXXX".
>>  
>>
> 

Re: Lucene 1.9 RC1 release available

Posted by Terry Steichen <te...@net-frame.com>.

1) Having a simple way to match singular and plural forms of a term with 
a single wildcard expression is quite useful.
2) The trailing '?' behavior has been present since that wildcard was 
first introduced.  Why not provide a flag to allow the original behavior 
to optionally be preserved?
3) The fact that virtually no one objected to the original behavior 
suggests that few if any were confused by it.

Chris Hostetter wrote:

>: In either case, what I'm arguing is that the current behavior makes more
>: sense in the real world of query expressions (that is, makes the most
>: common query expressions simpler), so why not continue it?
>
>I disagree with that statement.  People familiar with shell globbing are
>going to be confused if "riot??????????????????????" matches "riot" and
>"riotXXX".
>  
>


Re: Lucene 1.9 RC1 release available

Posted by Chris Hostetter <ho...@fucit.org>.

: In either case, what I'm arguing is that the current behavior makes more
: sense in the real world of query expressions (that is, makes the most
: common query expressions simpler), so why not continue it?

I disagree with that statement.  People familiar with shell globbing are
going to be confused if "riot??????????????????????" matches "riot" and
"riotXXX".

Are you expecting "r?ot" to match "rot" as well? (i'm not sure if
the initial bug in LUCENE-306 only affected trailing '?' or inner
instances as well)




-Hoss



Re: Lucene 1.9 RC1 release available

Posted by Terry Steichen <te...@net-frame.com>.

Hoss,

Whether the previous behavior (which I believe has been present in 
Lucene from the outset) was a "bug" or a "feature" is kind of academic.  
My point is that this behavior has value that's not countered by any 
argument that any significant value is added by eliminating it.

As to your riot?{0,3} syntax proposal, IMHO it (a) is too complicated, 
and (b) changes what has previously been the default behavior. 

Perhaps I have been "lucky" with the behavior of Lucene.  Alternatively, 
perhaps Lucene has been "lucky" to 'stumble' on a more useful capability 
than was arguably envisioned by the drafters of the documentation.

In either case, what I'm arguing is that the current behavior makes more 
sense in the real world of query expressions (that is, makes the most 
common query expressions simpler), so why not continue it?

Terry

Chris Hostetter wrote:

>: of query).  Under the previous versions of QueryParser, I could simply
>: specify 'riot???' and capture all of those variants.
>
>I don't have a strong opinion on this issue, but it seems clear to me that
>this was a bug in 1.4.3 not a change in the originally intended behavior.
>queryparsersyntax.html clearly states...
>
>  To perform a single character wildcard search use the "?" symbol.
>  To perform a multiple character wildcard search use the "*" symbol.
>
>...which implies to me that if you were relying on "riot???" to match
>"riots" you weren't using the code as documented, and were just getting
>lucky that what you were doing worked.  Applying LUCENE-306 definitely
>seems like the right thing to do to fix a bug in the documented behavior
>-- especially since the behavior as documented closely matches what people
>used to file globbing would probably consider the "expected" behavior.
>
>adding syntax support for an "n to m" character match (ala "riot?{0,3}")
>would probably be a worthwhile new feature - but it seems like exactly
>that: a new feature, not an issue with the patch as applied.
>
>
>
>-Hoss
>
>

Re: Lucene 1.9 RC1 release available

Posted by Chris Hostetter <ho...@fucit.org>.

: of query).  Under the previous versions of QueryParser, I could simply
: specify 'riot???' and capture all of those variants.

I don't have a strong opinion on this issue, but it seems clear to me that
this was a bug in 1.4.3 not a change in the originally intended behavior.
queryparsersyntax.html clearly states...

  To perform a single character wildcard search use the "?" symbol.
  To perform a multiple character wildcard search use the "*" symbol.

...which implies to me that if you were relying on "riot???" to match
"riots" you weren't using the code as documented, and were just getting
lucky that what you were doing worked.  Applying LUCENE-306 definitely
seems like the right thing to do to fix a bug in the documented behavior
-- especially since the behavior as documented closely matches what people
used to file globbing would probably consider the "expected" behavior.

adding syntax support for an "n to m" character match (ala "riot?{0,3}")
would probably be a worthwhile new feature - but it seems like exactly
that: a new feature, not an issue with the patch as applied.
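
A rough sketch of what an "n to m" wildcard could mean, written as a translation to a regular expression. Everything here is hypothetical: the ?{n,m} token is only the syntax floated in this message, it is not part of QueryParser, and bounded_wildcard_regex is an invented name. Plain '?' keeps the documented single-character meaning.

```python
import re

# Token order matters: try the bounded form '?{n,m}' before a plain '?'.
_TOKEN = re.compile(r"\?\{(\d+),(\d+)\}|\?|\*|.", re.DOTALL)


def bounded_wildcard_regex(pattern):
    """Translate a wildcard pattern (with hypothetical '?{n,m}') to a regex."""
    parts = []
    for m in _TOKEN.finditer(pattern):
        tok = m.group(0)
        if m.group(1) is not None:
            # '?{n,m}': between n and m arbitrary characters.
            parts.append(".{%s,%s}" % (m.group(1), m.group(2)))
        elif tok == "?":
            parts.append(".")   # exactly one character
        elif tok == "*":
            parts.append(".*")  # zero or more characters
        else:
            parts.append(re.escape(tok))
    return re.compile("".join(parts))


rx = bounded_wildcard_regex("riot?{0,3}")
terms = ["riot", "riots", "rioting", "rioters", "riotously"]
print([t for t in terms if rx.fullmatch(t)])
# ['riot', 'riots', 'rioting', 'rioters']
```

Because the bound is explicit, the plain 'r?ot' form stays strict: it matches 'riot' but not 'rot'.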



-Hoss



Re: Lucene 1.9 RC1 release available

Posted by Terry Steichen <te...@net-frame.com>.

In reviewing the latest changes incorporated into release 1.9 RC1, I 
noticed a change responding to JIRA item LUCENE-306.  According to the 
writeup, the new change forces the wildcard pattern 'cat??' to exactly 
match the length of the term (in this case, a five-letter term starting 
with 'cat'). 

If my understanding of this change is correct, I'm concerned that this 
will actually make some fairly common query expressions more difficult 
to construct. 

For example, let's say that I'm interested in docs with terms 'riot', 
'riots', 'rioting' and 'rioters' (which, I think, is a reasonable kind 
of query).  Under the previous versions of QueryParser, I could simply 
specify 'riot???' and capture all of those variants. 

With the new change (if I understand it correctly), I would have to 
either spell out all the individual terms, or specify something like 
'(riot riot? riot???)'.  I can see that there might be some cases where 
you might want to get only term matches of a specific length; however, 
IMHO, those needs would be far less common than the kind of needs I've 
tried to illustrate with 'riot???'.
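
The difference between the two behaviors can be modelled in a few lines. This is only a model built from the description in this thread, not the actual 1.4.x or 1.9 WildcardQuery code: the "lenient" variant assumes each trailing '?' may also match nothing, and all names are hypothetical.

```python
import re

TERMS = ["riot", "riots", "rioting", "rioters", "riotously"]


def strict_regex(pattern):
    """'?' is exactly one character, '*' is zero or more (as documented)."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def lenient_regex(pattern):
    """Like strict_regex, but each TRAILING '?' may also match nothing."""
    body = pattern.rstrip("?")
    trailing = len(pattern) - len(body)
    return re.compile(strict_regex(body).pattern + ".?" * trailing)


def matches(regex, terms):
    """Terms that the compiled regex matches in full."""
    return [t for t in terms if regex.fullmatch(t)]


print(matches(lenient_regex("riot???"), TERMS))
# ['riot', 'riots', 'rioting', 'rioters']
print(matches(strict_regex("riot???"), TERMS))
# ['rioting', 'rioters']
```

Under the strict model, covering all four variants takes the OR of riot, riot? and riot???, which is the rewrite described above.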

I have a personal interest in that this new change will force me to 
change and retest a large number of pre-defined, fairly complex 
queries.  If it is deemed important to have the new capability, could we 
perhaps incorporate a flag that could be set to override the change and 
have QueryParser use the previous logic for this?

Doug Cutting wrote:

> Release 1.9 RC1 of Lucene is now available from:
>
> http://www.apache.org/dyn/closer.cgi/lucene/java/
>
> This release candidate has many improvements since release 1.4.3, 
> including new features, performance improvements, bug fixes, etc.  For 
> details, see:
>
> http://svn.apache.org/viewcvs.cgi/*checkout*/lucene/java/branches/lucene_1_9/CHANGES.txt?rev=379190 
>
>
> 1.9 will be the last 1.x release. It is both back-compatible with 
> 1.4.3 and forward-compatible with the upcoming 2.0 release. Many 
> methods and classes in 1.4.3 have been deprecated in 1.9 and will be 
> removed in 2.0.  Applications must compile against 1.9 without 
> deprecation warnings before they are compatible with 2.0.
>
> Doug
>

Re: Lucene 1.9 RC1 release available

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Feb 23, 2006, at 6:33 PM, Daniel Naber wrote:
> BTW, lucli (the command line lucene searcher) builds fine, but the  
> manifest
> in the jar doesn't specify the Main-Class so you cannot start it with
> java's -jar option. Could someone have a look at this (Erik?)? I don't
> understand how that ant task is supposed to work, the manifest is
> specified and looks okay...

Daniel - I'll look into, and resolve, this issue.
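
One way to confirm the symptom is to read the manifest out of the built jar and look for a Main-Class attribute in its main section. The following is a sketch using only Python's standard zipfile module; main_class_of is an illustrative name, and nothing here reflects how the contrib Ant build actually writes the manifest.

```python
import zipfile


def main_class_of(jar):
    """Return the Main-Class from a jar's manifest, or None if absent.

    `jar` may be a filesystem path or a binary file object.
    """
    with zipfile.ZipFile(jar) as archive:
        try:
            raw = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the jar has no manifest at all
    # Continuation lines start with a single space; unfold them first.
    unfolded = raw.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if not line.strip():
            break  # a blank line ends the main section
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

If this returns None for the lucli jar, that would be consistent with java -jar refusing to start it.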

	Erik



Re: Lucene 1.9 RC1 release available

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Feb 26, 2006, at 7:18 AM, Daniel Naber wrote:
> On Sonntag 26 Februar 2006 02:42, Erik Hatcher wrote:
>
>> I personally don't think we should be distributing any external
>> dependencies.  Whoever builds the releases needs to have the
>> dependencies locally, but 3rd party JARs, even Apache ones, should
>> not go along for the .tar/zip ride IMO.
>
> I think they should be included to make life easier for our users,  
> unless
> it makes Lucene too big (which is not the case for now as the files  
> are
> quite small).

For some it will be a benefit to having these additional dependencies  
immediately handy, but others may already be using different versions  
of those libraries and will need to carefully manage which versions  
they use.

I'm not in a position to devote any time to the changes needed to the  
contrib build area to have it be smart about this sort of thing  
though.  There may be 3rd party libraries that cannot be distributed,  
so the build system cannot bulk copy everything in each lib/  
directory, I don't think.

	Erik


Re: Lucene 1.9 RC1 release available

Posted by Daniel Naber <lu...@danielnaber.de>.

On Sonntag 26 Februar 2006 02:42, Erik Hatcher wrote:

> I personally don't think we should be distributing any external  
> dependencies.  Whoever builds the releases needs to have the  
> dependencies locally, but 3rd party JARs, even Apache ones, should  
> not go along for the .tar/zip ride IMO.  

I think they should be included to make life easier for our users, unless 
it makes Lucene too big (which is not the case for now as the files are 
quite small).

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Lucene 1.9 RC1 release available

Posted by DM Smith <dm...@gmail.com>.


Erik Hatcher wrote:
>
> On Feb 25, 2006, at 3:24 PM, Daniel Naber wrote:
>> On Freitag 24 Februar 2006 00:50, Doug Cutting wrote:
>>
>>>> Are these all modules that don't need external libs?
>>>
>>> So far as I know!
>>
>> I found another module that requires external libraries: regex. These 
>> are
>> even defined in the additional.dependencies property in the 
>> build.xml, but
>> it seems it's not used (at least not for copying the libs to the
>> distribution).
>
> I personally don't think we should be distributing any external 
> dependencies.  Whoever builds the releases needs to have the 
> dependencies locally, but 3rd party JARs, even Apache ones, should not 
> go along for the .tar/zip ride IMO.  In the same manner that Ant 
> doesn't ship with junit.jar or any other 3rd party dependencies, it 
> still was compiled with them.
>
> I'm happy to go with the flow of the consensus though, and if folks 
> want the other JARs to go along then that's fine also.  There should 
> definitely be some docs that explain these 3rd party dependencies, and 
> I'll add that to the regex docs that I'm going to work on tomorrow.

My opinion as a user of Lucene:

If I understand correctly, there are no dependencies for Lucene itself, 
but only for contrib? If so, please don't package jars. If not, document 
them and let us get them if we use the classes that require them.

On Linux I use jpackage for installs, so I expect the dependencies to 
be broken out as separate installs.
As far as Windows goes, I don't have any problem getting jars as I need 
them.



Re: Lucene 1.9 RC1 release available

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Feb 25, 2006, at 3:24 PM, Daniel Naber wrote:
> On Freitag 24 Februar 2006 00:50, Doug Cutting wrote:
>
>>> Are these all modules that don't need external libs?
>>
>> So far as I know!
>
> I found another module that requires external libraries: regex.  
> These are
> even defined in the additional.dependencies property in the  
> build.xml, but
> it seems it's not used (at least not for copying the libs to the
> distribution).

I personally don't think we should be distributing any external  
dependencies.  Whoever builds the releases needs to have the  
dependencies locally, but 3rd party JARs, even Apache ones, should  
not go along for the .tar/zip ride IMO.  In the same manner that Ant  
doesn't ship with junit.jar or any other 3rd party dependencies, it  
still was compiled with them.

I'm happy to go with the flow of the consensus though, and if folks  
want the other JARs to go along then that's fine also.  There should  
definitely be some docs that explain these 3rd party dependencies,  
and I'll add that to the regex docs that I'm going to work on tomorrow.

	Erik


Re: Lucene 1.9 RC1 release available

Posted by Daniel Naber <lu...@danielnaber.de>.

On Freitag 24 Februar 2006 00:50, Doug Cutting wrote:

> > Are these all modules that don't need external libs?
>
> So far as I know!

I found another module that requires external libraries: regex. These are 
even defined in the additional.dependencies property in the build.xml, but 
it seems it's not used (at least not for copying the libs to the 
distribution).

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Lucene 1.9 RC1 release available

Posted by Daniel Naber <lu...@danielnaber.de>.

On Freitag 24 Februar 2006 00:50, Doug Cutting wrote:

> > Are these all modules that don't need external libs?
>
> So far as I know!  They built without me downloading anything extra.

Lucli requires jline.jar which is also in SVN and can be distributed thanks 
to its very liberal license (jline.LICENSE). Of course the license file 
needs to be included then, too.
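
A small bookkeeping check along these lines could list which jars in a lib/ directory have a license file sitting next to them. This is a sketch under a loud assumption: the '<name>.LICENSE' naming is inferred from the jline.LICENSE file mentioned above and may not hold for other libraries, and distributable_jars is a hypothetical helper, not part of the Lucene build.

```python
from pathlib import Path


def distributable_jars(lib_dir):
    """Pair each jar in lib_dir with a sibling '<name>.LICENSE' file.

    Jars without such a file are skipped. Finding a license file says
    nothing about whether its terms actually allow redistribution.
    """
    pairs = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        license_file = jar.with_suffix(".LICENSE")
        if license_file.is_file():
            pairs.append((jar.name, license_file.name))
    return pairs
```

The result only narrows down which jars could be copied together with their license text; a person still has to read each license.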

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Lucene 1.9 RC1 release available: surround package.html files

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Thanks Paul, it's in.
Otis

----- Original Message ----
From: Paul Elschot <pa...@xs4all.nl>
To: java-dev@lucene.apache.org
Sent: Saturday, February 25, 2006 7:09:25 AM
Subject: Re: Lucene 1.9 RC1 release available: surround package.html files

On Saturday 25 February 2006 01:23, Chris Hostetter wrote:
> 
...
> 
> ...Which means this weekend would be a good time for contrib module owners
> to commit a quick one sentence "package.html" for each package in their
> module.  Now that the contrib classes are built/bundled/distributed along
> with lucene-core, documenting what these modules do will be really handy.
> 
> contrib packages that are currently lacking a package.html ...
> 
..
At the moment issues.apache.org is not responding,
so I'll inline them here, all APL 2:

>   surround       org.apache.lucene.queryParser.surround.parser

<html>
<head>
<title>Surround parser package</title>
</head>
<body>
This package contains the QueryParser.jj source file for the Surround parser.
<br>
Parsing the text of a query results in a SrndQuery in the
org.apache.lucene.queryParser.surround.query package.
</body>
</html>

>   surround       org.apache.lucene.queryParser.surround.query

<html>
<head>
<title>Surround query package</title>
</head>
<body>
This package contains SrndQuery and its subclasses.
<br>
The parser in the org.apache.lucene.queryParser.surround.parser package
normally generates a SrndQuery.
<br>
For searching an org.apache.lucene.search.Query is provided by
the SrndQuery.makeLuceneQueryField method.
For this, TermQuery, BooleanQuery and SpanQuery are used from Lucene.
</body>
</html>

Regards,
Paul Elschot


Re: Lucene 1.9 RC1 release available: surround package.html files

Posted by Paul Elschot <pa...@xs4all.nl>.

On Saturday 25 February 2006 01:23, Chris Hostetter wrote:
> 
...
> 
> ...Which means this weekend would be a good time for contrib module owners
> to commit a quick one sentence "package.html" for each package in their
> module.  Now that the contrib classes are built/bundled/distributed along
> with lucene-core, documenting what these modules do will be really handy.
> 
> contrib packages that are currently lacking a package.html ...
> 
..
At the moment issues.apache.org is not responding,
so I'll inline them here, all APL 2:

>   surround       org.apache.lucene.queryParser.surround.parser

<html>
<head>
<title>Surround parser package</title>
</head>
<body>
This package contains the QueryParser.jj source file for the Surround parser.
<br>
Parsing the text of a query results in a SrndQuery in the
org.apache.lucene.queryParser.surround.query package.
</body>
</html>

>   surround       org.apache.lucene.queryParser.surround.query

<html>
<head>
<title>Surround query package</title>
</head>
<body>
This package contains SrndQuery and its subclasses.
<br>
The parser in the org.apache.lucene.queryParser.surround.parser package
normally generates a SrndQuery.
<br>
For searching an org.apache.lucene.search.Query is provided by
the SrndQuery.makeLuceneQueryField method.
For this, TermQuery, BooleanQuery and SpanQuery are used from Lucene.
</body>
</html>

Regards,
Paul Elschot


Re: Lucene 1.9 RC1 release available

Posted by Chris Hostetter <ho...@fucit.org>.

: FYI, I think all of the commits to trunk since the RC1 release are safe
: to merge to the 1.9 branch.  They're mostly documentation improvements.
:   So my plan is currently, on Monday, to merge these changes to the 1.9
: branch, then make a 1.9-final release.  I'll again announce it to the

...Which means this weekend would be a good time for contrib module owners
to commit a quick one sentence "package.html" for each package in their
module.  Now that the contrib classes are built/bundled/distributed along
with lucene-core, documenting what these modules do will be really handy.

contrib packages that are currently lacking a package.html ...

  ant            org.apache.lucene.ant
  lucli          lucli
  miscellaneous  org.apache.lucene.misc
  miscellaneous  org.apache.lucene.queryParser.analyzing
  miscellaneous  org.apache.lucene.queryParser.precedence
  similarity     org.apache.lucene.search.similar
  regex          org.apache.lucene.search.regex
  regex          org.apache.regexp
  surround       org.apache.lucene.queryParser.surround.parser
  surround       org.apache.lucene.queryParser.surround.query

The Demo code base is also missing package level documentation...

  org.apache.lucene.demo
  org.apache.lucene.demo.html


-Hoss


Re: Lucene 1.9 RC1 release available

Posted by Doug Cutting <cu...@apache.org>.

Daniel Naber wrote:
> Are these all modules that don't need external libs?

So far as I know!  They built without me downloading anything extra.

FYI, I think all of the commits to trunk since the RC1 release are safe 
to merge to the 1.9 branch.  They're mostly documentation improvements. 
  So my plan is currently, on Monday, to merge these changes to the 1.9 
branch, then make a 1.9-final release.  I'll again announce it to the 
dev list on Tuesday, after it's been mirrored, and to the user list on 
Wednesday.

Doug

Re: Lucene 1.9 RC1 release available

Posted by Daniel Naber <lu...@danielnaber.de>.

On Friday 24 February 2006 00:33, Daniel Naber wrote:

> Shouldn't we include at least some packages from contrib, like analyzers
> and highlighter?

Sorry, I totally missed the "contrib" sub directory that contains 
everything I'm asking for... Are these all modules that don't need 
external libs?

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: Lucene 1.9 RC1 release available

Posted by Daniel Naber <lu...@danielnaber.de>.

On Tuesday 21 February 2006 18:50, Doug Cutting wrote:

> I will send this announcement to the user list tomorrow if no major issues
> are identified.  If things still look good next week, I will promote
> this release to 1.9-final.

Shouldn't we include at least some packages from contrib, like analyzers and 
highlighter? "ant package" builds fine on my system, but it downloads 
libraries from the web, so we cannot distribute everything. But analyzers 
and highlighter should be okay. Actually analyzers contains classes that 
were in core before, so we really need to include it.

BTW, lucli (the command-line Lucene searcher) builds fine, but the manifest 
in the jar doesn't specify the Main-Class, so you cannot start it with 
java's -jar option. Could someone have a look at this (Erik?)? I don't 
understand how that ant task is supposed to work; the manifest is 
specified and looks okay...
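
One way to confirm whether a manifest actually carries a Main-Class entry 
is to read it with java.util.jar. The following is only a minimal sketch, 
not lucli's build code: the class name ManifestCheck, its helper methods 
and the value example.Main are made up for illustration, and the manifests 
are parsed from in-memory text rather than taken from a real jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Lucene or lucli.
public class ManifestCheck {

    /** Returns the Main-Class main attribute, or null if the manifest has none. */
    public static String mainClassOf(Manifest manifest) {
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    /** Parses manifest text; every line, including the last, must end with a newline. */
    public static Manifest parse(String text) {
        try {
            return new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A manifest like the one described above: no Main-Class entry.
        Manifest without = parse("Manifest-Version: 1.0\n");
        // The same manifest with an (assumed) entry point added.
        Manifest with = parse("Manifest-Version: 1.0\nMain-Class: example.Main\n");

        System.out.println("without: " + mainClassOf(without));
        System.out.println("with:    " + mainClassOf(with));
    }
}
```

For a jar on disk, the Manifest would come from 
java.util.jar.JarFile.getManifest() instead of parse(). java's -jar option 
needs this entry to find the class to start.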

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: Lucene 1.9 RC1 release available

Posted by Doug Cutting <cu...@apache.org>.

Doug Cutting wrote:
> Release 1.9 RC1 of Lucene is now available from:
> 
> http://www.apache.org/dyn/closer.cgi/lucene/java/

I will send this announcement to the user list tomorrow if no major issues 
are identified.  If things still look good next week, I will promote 
this release to 1.9-final.  Once that's out, we can start removing all 
of the deprecated stuff from trunk, preparing for the 2.0 release.

Doug

Re: Lucene 1.9 RC1 release available

Posted by Doug Cutting <cu...@apache.org>.

Grant Ingersoll wrote:
> I am wondering what the motivation is for being forward compatible to 
> 2.0.  Is the only change from 1.9 to 2.0 going to be the removal of 
> deprecated items? 

Pretty much, yes.

> Are we going to be preventing ourselves from making 
> broader structural changes?  My understanding of a major release is that 
> it allows you to make large scale changes, if needed, that may break 
> existing dependencies.

We're using the major version number to denote API compatibility.  Index 
formats can change (back-compatibly) and features can be added within a 
major release, but the API should not change incompatibly.  When we do 
need to change the API compatibly we try to do it by introducing new 
methods and classes and deprecating old methods and classes.
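
The pattern of adding a new method while deprecating the old one can be 
sketched as follows. This is an illustration only: the Greeter class and 
its methods are hypothetical and are not Lucene APIs. The old method stays 
in place and delegates to the new one, so existing callers still compile 
(with a deprecation warning) until the old method is removed in the next 
major release.

```java
import java.util.Locale;

public class DeprecationSketch {

    /** Hypothetical class used only to illustrate the pattern. */
    public static class Greeter {

        /**
         * Old API, kept so existing callers keep compiling.
         *
         * @deprecated use {@link #greet(String, boolean)} instead.
         */
        @Deprecated
        public String greet(String name) {
            // Delegate to the new method with the old default behaviour.
            return greet(name, false);
        }

        /** New API, added alongside the old one rather than replacing it. */
        public String greet(String name, boolean shout) {
            String s = "hello " + name;
            return shout ? s.toUpperCase(Locale.ROOT) : s;
        }
    }

    public static void main(String[] args) {
        Greeter g = new Greeter();
        System.out.println(g.greet("lucene"));       // old call still works
        System.out.println(g.greet("lucene", true)); // new call
    }
}
```

A caller that compiles without deprecation warnings uses only the new 
method, so it is unaffected when the deprecated one is deleted.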

Lucene has a large install base.  A little effort towards 
back-compatibility on our part saves folks a lot of effort.

> For instance, I am working on a lazy field 
> loader patch (so that large fields aren't loaded just b/c the document 
> is loaded) and also am looking into the possibility of updating single 
> fields on a document.  The first change takes the Field class and makes 
> it an interface which has two implementations, one that is lazy and the 
> current one.  Granted, I haven't submitted the patch yet, but if this is 
> something that people are interested in, then it would make 2.0 not be 
> compatible with 1.9.

This sounds like it could be done with back-compatible APIs.  The new 
interface could be named Fieldable or something rather than Field.
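
A rough sketch of that idea, under stated assumptions: the types below are 
simplified stand-ins, not the real Lucene classes or the pending patch. The 
existing concrete class keeps its name and methods and simply implements 
the new interface, while a lazy implementation (here called LazyField, a 
made-up name) defers loading its value until it is first asked for.

```java
import java.util.function.Supplier;

public class FieldableSketch {

    /** Hypothetical new interface; the name follows the suggestion above. */
    public interface Fieldable {
        String name();
        String stringValue();
    }

    /** Stand-in for the existing class: same name, now implementing the interface. */
    public static class Field implements Fieldable {
        private final String name;
        private final String value;

        public Field(String name, String value) {
            this.name = name;
            this.value = value;
        }

        public String name() { return name; }
        public String stringValue() { return value; }
    }

    /** Lazy variant: the value is loaded on first access, then cached. */
    public static class LazyField implements Fieldable {
        private final String name;
        private final Supplier<String> loader;
        private String value;
        private boolean loaded;

        public LazyField(String name, Supplier<String> loader) {
            this.name = name;
            this.loader = loader;
        }

        public String name() { return name; }

        public String stringValue() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Fieldable eager = new Field("title", "Lucene 1.9");
        Fieldable lazy = new LazyField("body", () -> "a large stored value");
        System.out.println(eager.name() + " = " + eager.stringValue());
        System.out.println(lazy.name() + " = " + lazy.stringValue());
    }
}
```

Code written against Field keeps compiling unchanged; only code that wants 
to accept lazy fields needs to switch to the interface type.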

> From http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard, item #11 
> is almost certainly going to break things, as well, if someone takes it on.

I hope this can be done back-compatibly both for the API and the index 
format.  If it cannot be reasonably done that way then we'll deal with 
that then.

Doug

Re: Lucene 1.9 RC1 release available

Posted by Grant Ingersoll <gs...@syr.edu>.

Doug Cutting wrote:
>
> 1.9 will be the last 1.x release. It is both back-compatible with 
> 1.4.3 and forward-compatible with the upcoming 2.0 release. Many 
> methods and classes in 1.4.3 have been deprecated in 1.9 and will be 
> removed in 2.0.  Applications must compile against 1.9 without 
> deprecation warnings before they are compatible with 2.0.
>
I am wondering what the motivation is for being forward compatible to 
2.0.  Is the only change from 1.9 to 2.0 going to be the removal of 
deprecated items?  Are we going to be preventing ourselves from making 
broader structural changes?  My understanding of a major release is that 
it allows you to make large scale changes, if needed, that may break 
existing dependencies.  For instance, I am working on a lazy field 
loader patch (so that large fields aren't loaded just b/c the document 
is loaded) and also am looking into the possibility of updating single 
fields on a document.  The first change takes the Field class and makes 
it an interface which has two implementations, one that is lazy and the 
current one.  Granted, I haven't submitted the patch yet, but if this is 
something that people are interested in, then it would make 2.0 not be 
compatible with 1.9. 

From http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard, item #11 
is almost certainly going to break things, as well, if someone takes it on.

Is this kind of thing ruled out by the forward-compatibility issue or 
should I just submit my patch when it is ready and let the chips fall 
where they may?

-Grant
