Posted to dev@lucene.apache.org by Jason Rutherglen <ja...@gmail.com> on 2011/08/16 04:23:32 UTC

Lucene 4.x release

We should release Lucene 4.x soon.  What else is hyper critical for
the initial release?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Lucene 4.x release

Posted by Robert Muir <rc...@gmail.com>.
All of the GSoC projects.

On Mon, Aug 15, 2011 at 10:23 PM, Jason Rutherglen
<ja...@gmail.com> wrote:
> We should release Lucene 4.x soon.  What else is hyper critical for
> the initial release?



-- 
lucidimagination.com



Re: Lucene 4.x release

Posted by Jason Rutherglen <ja...@gmail.com>.
Hmm... That one sounds like the most important hindrance to the 4.x
release.  Given there's so much that's new in the release, it seems
like releasing sooner is better than waiting for all of the details to
be completed.  E.g., if we wait, the release will only be delayed
further, because without general deployment there will still be all
kinds of new stuff for people to digest and break.

On Tue, Aug 16, 2011 at 8:54 PM, Robert Muir <rc...@gmail.com> wrote:
> the postings api.
>
> On Tue, Aug 16, 2011 at 8:24 PM, Jason Rutherglen
> <ja...@gmail.com> wrote:
>> I didn't know the bulk API was so important.  Which bulk API (eg the
>> postings one or the terms dict)?



Re: Lucene 4.x release

Posted by Robert Muir <rc...@gmail.com>.
The postings API.

On Tue, Aug 16, 2011 at 8:24 PM, Jason Rutherglen
<ja...@gmail.com> wrote:
> I didn't know the bulk API was so important.  Which bulk API (eg the
> postings one or the terms dict)?



-- 
lucidimagination.com



Re: Lucene 4.x release

Posted by Jason Rutherglen <ja...@gmail.com>.
I didn't know the bulk API was so important.  Which bulk API (e.g., the
postings one or the terms dict one)?

On Mon, Aug 15, 2011 at 11:17 PM, Robert Muir <rc...@gmail.com> wrote:
> On Mon, Aug 15, 2011 at 10:49 PM, Mark Miller <ma...@gmail.com> wrote:
>> Just throwing this out there, but:
>>
>> I think it would be really cool if we could get 4.0 out by the end of the year.
>>
>> With such a large release, I think it would also make a lot of sense if we tried a more formal beta release, just to increase the amount of usage before we officially sign off on a final 4.0.
>>
>
> I agree with the beta idea, I think its really necessary actually: we
> are just being honest at that point that its a real point-zero
> release.
>
> on the other hand, besides the GSOC stuff, I think we should
> accomplish a few things first to ensure we can actually make the 4.x
> release useful and issue minor releases off of it:
> * fix the bulk API: otherwise we only have "flexible indexing, as long
> as you don't mind flexible == slower". This is really important, I
> dont think we have to implement a bunch of new compression algorithms
> but the whole postings APIs are suboptimal, and biased towards
> lucene's current format: the bulk APIs arent low level enough to give
> good performance, the payloads APIs assume you can ask for a payload
> at any time (they assume basically that you are going to 'steal bits'
> from the positions like we do today), etc etc.
> * round out docvalues, especially merging with different docvalues
> types and things like that. arguably these are nocommits... I think
> you will get an exception during merge? I also think its bad we still
> don't use docvalues for norms nor the faceting module, fixing these
> kinds of real world uses is probably a great way to round this out.
> * figure out the packaging system for modules such that things like
> clover/hudson/javadocs etc all work across them (not quite today). We
> also need to look at all the minor things like CHANGES.txt and such...
> there are too many of these. Furthermore at least I wanted the
> analyzers modularization to move forward to a point where we can
> remove the Version crap and you just use the old jar file, I don't
> feel like we are even close to that.
> * fix codec naming: i think its silly to name a codec "Standard" and
> use the codec header for backwards compatibility, easier to name the
> codec "Standard40" and just package this codec in the next release for
> backwards compatibility, e.g. if we want to introduce a new index
> format we make it "Standard42". This is just my opinion though, its
> not the only way to solve the index backwards compat here but I think
> its easiest.
>
> I have a ton more pet peeves, but I think these are the biggest. It
> probably sounds like a lot but I think its totally stupid to release
> 4.0 if we cannot "grow" the 4.x branch with 4.1, 4.2, etc while we
> work on 5.x. Otherwise we are just jumping from 4.0 to 5.0 and thats a
> sign we just shouldnt have released at all.
>
> --
> lucidimagination.com



Re: Lucene 4.x release

Posted by Robert Muir <rc...@gmail.com>.
On Mon, Aug 15, 2011 at 10:49 PM, Mark Miller <ma...@gmail.com> wrote:
> Just throwing this out there, but:
>
> I think it would be really cool if we could get 4.0 out by the end of the year.
>
> With such a large release, I think it would also make a lot of sense if we tried a more formal beta release, just to increase the amount of usage before we officially sign off on a final 4.0.
>

I agree with the beta idea; I think it's really necessary, actually: at
that point we are just being honest that it's a real point-zero
release.

On the other hand, besides the GSoC stuff, I think we should
accomplish a few things first to ensure we can actually make the 4.x
release useful and issue minor releases off of it:
* fix the bulk API: otherwise we only have "flexible indexing, as long
as you don't mind flexible == slower". This is really important. I
don't think we have to implement a bunch of new compression algorithms,
but the postings APIs as a whole are suboptimal and biased towards
Lucene's current format: the bulk APIs aren't low-level enough to give
good performance, the payloads APIs assume you can ask for a payload
at any time (basically, they assume that you are going to 'steal bits'
from the positions like we do today), etc.
* round out docvalues, especially merging with different docvalues
types and things like that. Arguably these are nocommits... I think
you will get an exception during merge? I also think it's bad that we
still don't use docvalues for norms or in the faceting module; fixing
these kinds of real-world uses is probably a great way to round this out.
* figure out the packaging system for modules such that things like
clover/hudson/javadocs etc. all work across them (they don't quite,
today). We also need to look at all the minor things like CHANGES.txt
and such... there are too many of these. Furthermore, I at least wanted
the analyzers modularization to move forward to a point where we can
remove the Version crap and you just use the old jar file; I don't
feel like we are even close to that.
* fix codec naming: I think it's silly to name a codec "Standard" and
use the codec header for backwards compatibility. It's easier to name
the codec "Standard40" and just package this codec in the next release
for backwards compatibility, e.g. if we want to introduce a new index
format we make it "Standard42". This is just my opinion though; it's
not the only way to solve index backwards compat here, but I think
it's the easiest.
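
To make the codec naming idea concrete, here is a rough sketch of what
"one version-stamped name per index format" could look like. Everything
in it (CodecRegistry, Codec, forRead, forWrite, named) is made up for
illustration and is not Lucene's actual codec API; the only things taken
from the proposal above are the names "Standard40" and "Standard42".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only: codecs registered under version-stamped names. */
final class CodecRegistry {

    /** Minimal stand-in for an index format implementation (hypothetical). */
    interface Codec {
        String name();
    }

    private final Map<String, Codec> byName = new LinkedHashMap<>();
    private final String defaultForWrites;

    /** @param defaultForWrites name of the codec used for newly written segments */
    CodecRegistry(String defaultForWrites) {
        this.defaultForWrites = defaultForWrites;
    }

    /** Old codecs stay registered so that old segments remain readable. */
    void register(Codec codec) {
        byName.put(codec.name(), codec);
    }

    /** Resolve the codec from the name recorded with a segment. */
    Codec forRead(String nameInSegment) {
        Codec codec = byName.get(nameInSegment);
        if (codec == null) {
            throw new IllegalArgumentException(
                "no codec registered for '" + nameInSegment + "'");
        }
        return codec;
    }

    /** New segments always use the current default format. */
    Codec forWrite() {
        return forRead(defaultForWrites);
    }

    /** Helper that builds a codec carrying nothing but its name. */
    static Codec named(String name) {
        return () -> name;
    }

    public static void main(String[] args) {
        CodecRegistry registry = new CodecRegistry("Standard42");
        registry.register(named("Standard40")); // packaged only for back compat
        registry.register(named("Standard42")); // the new index format
        System.out.println(registry.forRead("Standard40").name());
        System.out.println(registry.forWrite().name());
    }
}
```

The point of the sketch is that the name alone selects the format, so a
reader never has to branch on a version number stored in a header.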

I have a ton more pet peeves, but I think these are the biggest. It
probably sounds like a lot, but I think it's totally stupid to release
4.0 if we cannot "grow" the 4.x branch with 4.1, 4.2, etc. while we
work on 5.x. Otherwise we are just jumping from 4.0 to 5.0, and that's a
sign we just shouldn't have released at all.
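
On the bulk API point in the list above, a toy contrast between
per-document and block-at-a-time access may help readers who have not
looked at the postings code. This is only a sketch of the two access
patterns; the names (BulkPostingsSketch, DocIterator, BulkDocReader,
ArrayPostings) are invented for illustration and are neither the
current Lucene API nor a proposal for the fixed one.

```java
/** Hypothetical sketch only: per-document vs. block-at-a-time postings access. */
final class BulkPostingsSketch {

    /** Sentinel returned when the per-document iterator is exhausted. */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** One method call per document. */
    interface DocIterator {
        int nextDoc();
    }

    /** One method call per block: fills the buffer, returns how many were read. */
    interface BulkDocReader {
        int read(int[] buffer);
    }

    /** In-memory postings list exposing both access styles. */
    static final class ArrayPostings implements DocIterator, BulkDocReader {
        private final int[] docs;
        private int pos;

        ArrayPostings(int[] docs) {
            this.docs = docs;
        }

        @Override
        public int nextDoc() {
            return pos < docs.length ? docs[pos++] : NO_MORE_DOCS;
        }

        @Override
        public int read(int[] buffer) {
            int n = Math.min(buffer.length, docs.length - pos);
            System.arraycopy(docs, pos, buffer, 0, n);
            pos += n;
            return n; // 0 once the list is exhausted
        }
    }

    /** Consumer that pays one interface call per document. */
    static long sumPerDoc(DocIterator it) {
        long sum = 0;
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            sum += doc;
        }
        return sum;
    }

    /** Consumer that pays one interface call per block and loops over a raw int[]. */
    static long sumBulk(BulkDocReader reader, int blockSize) {
        int[] buffer = new int[blockSize];
        long sum = 0;
        int n;
        while ((n = reader.read(buffer)) > 0) {
            for (int i = 0; i < n; i++) {
                sum += buffer[i];
            }
        }
        return sum;
    }
}
```

Both consumers compute the same result; the bulk one simply exposes the
raw buffer so the inner loop runs without a call per document.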

-- 
lucidimagination.com



Re: Lucene 4.x release

Posted by Mark Miller <ma...@gmail.com>.
Just throwing this out there, but:

I think it would be really cool if we could get 4.0 out by the end of the year.

With such a large release, I think it would also make a lot of sense if we tried a more formal beta release, just to increase the amount of usage before we officially sign off on a final 4.0.

- Mark Miller
lucidimagination.com











Re: Lucene 4.x release

Posted by Sujit Pal <su...@comcast.net>.
I would like LUCENE-3236 to be addressed for version 4 if possible. I
have included a patch, and it retains backward compatibility, so it
should not be too much work.
https://issues.apache.org/jira/browse/LUCENE-3236 

Thanks very much,
Sujit

On Mon, 2011-08-15 at 22:23 -0400, Jason Rutherglen wrote:
> We should release Lucene 4.x soon.  What else is hyper critical for
> the initial release?

