Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2010/04/13 17:27:21 UTC

Proposal about Version API "relaxation"

Hi

I'd like to propose a relaxation on the Version API. Uwe, please read the
entire email before you reply :).

I was thinking, following a question on the user list, that the
Version-based API may not be very intuitive to users, especially those who
don't care about versioning, and is also very inconvenient. So there are two
issues here:
1) How should one use Version smartly so as to keep backwards
compatibility? I think we all know the answer, but a Wiki page with some
"best practices" tips would really help users use it.
2) How can one write sane code, which doesn't pass versions all over the
place, if: (1) he doesn't care about versions, or (2) he cares, and sets the
Version to the same value in his app, in all places?

Also, I think that today we offer users the flexibility to set different
Versions on different objects in the life span of their application - which
is good flexibility but can also lead people to shoot themselves in the
foot if they're not careful -- e.g. upgrading Version across their app, but
failing to do so in one or two places ...

So the change I'd like to propose is to mostly alleviate (2) and better
protect users - I DO NOT PROPOSE TO GET RID OF Version :).

I was thinking that we can add on Version a DEFAULT version, which the
caller can set. So Version.setDefault and Version.getDefault will be added,
as static members (more on the static-ness of it later). We then change the
API which requires Version to also expose an API which doesn't require it,
and that API will call Version.getDefault(). People can use it if they want
to ...

A few points:
1) As a default DEFAULT Version is controversial, I don't want to propose
it, even though I think Lucene can define the DEFAULT to be the latest.
Instead, I propose that Version.getDefault throw a
DefaultVersionNotSetException if no default was set by the time an API which
relies on the default Version is called (I don't want to return null, not
sure how safe it is).
2) That DEFAULT Version is static, which means it will affect all indexing
code running inside the JVM. Which is fine:
2.1) Perhaps all the indexing code should use the same Version
2.2) If you know that's not the case, then pass Version to the API which
requires it - you cannot use the 'default Version' API -- nothing changes
for you.
One case is missing -- you might not know if your code is the only indexing
code which runs in the JVM ... I don't have a solution to that, but I think
it'll be revealed pretty quickly, and you can change your code then ...
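
To make the points above concrete, here is a minimal sketch of how the
static default could look. It is only an illustration under assumptions:
SketchVersion and SketchAnalyzer are hypothetical stand-ins (not existing
Lucene classes), and only the names setDefault, getDefault and
DefaultVersionNotSetException come from the proposal itself.

```java
// Hypothetical sketch of the proposal; none of these names are existing Lucene API.

/** Thrown when a default-version API is used before any default was set. */
class DefaultVersionNotSetException extends IllegalStateException {
    DefaultVersionNotSetException() {
        super("No default version set; call SketchVersion.setDefault(...) first");
    }
}

/** Stand-in for Version, extended with the proposed static default. */
enum SketchVersion {
    LUCENE_29, LUCENE_30, LUCENE_31;

    // One value shared by all code that sees this class; null means "not set".
    private static volatile SketchVersion defaultVersion;

    static void setDefault(SketchVersion version) {
        if (version == null) {
            throw new NullPointerException("version must not be null");
        }
        defaultVersion = version;
    }

    static SketchVersion getDefault() {
        SketchVersion version = defaultVersion;
        if (version == null) {
            // Fail loudly instead of returning null.
            throw new DefaultVersionNotSetException();
        }
        return version;
    }
}

/** Stand-in for any class whose constructor requires a Version today. */
class SketchAnalyzer {
    final SketchVersion matchVersion;

    // Existing style: the caller passes the version explicitly.
    SketchAnalyzer(SketchVersion matchVersion) {
        this.matchVersion = matchVersion;
    }

    // Proposed convenience overload: falls back to the static default.
    SketchAnalyzer() {
        this(SketchVersion.getDefault());
    }
}
```

With this shape, code that calls new SketchAnalyzer() before anyone called
SketchVersion.setDefault(...) gets the exception, while code that passes a
version explicitly is unaffected by the default.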

So to summarize - the current Version API will remain and people can still
use it. The DEFAULT Version API is meant for convenience for those who don't
want to pass Version everywhere, for the reasons I outlined above. This will
also clean our test code significantly, as the tests will set the DEFAULT
version to TEST_VERSION_CURRENT at start ...

The changes to the Version class will be very simple.

If people think that's acceptable, I can open an issue and work on it.

Shai

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I wholeheartedly support this anti-version riot :)



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Tim Williams <wi...@gmail.com>.

On Tue, Apr 13, 2010 at 12:41 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Shai,
>
>
>
> one of the problems I have is: That is a static default! We want to get rid
> of them (and mostly did; only some relics remain), so there are no plans
> to reimplement such a thing again. The worst one is
> BooleanQuery.maxClauseCount. The same applies to all types of sysprops. As
> Lucene and Solr are mostly running in servlet containers, this type of thing
> makes web applications no longer isolated. This is also a general contract
> for libraries: never ever rely on sysprops or statics.

Do classpath resources fall in the same undesirable category?  (e.g.
put lucene.properties in your classpath with version=X.X).  Not great
but it'd get around the container challenge, right?
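
A rough sketch of what such a lookup might look like, purely as an
illustration: ClasspathVersionConfig and the "version" key handling are
hypothetical, and only the file name lucene.properties comes from the
suggestion above. It uses java.util.Properties and
ClassLoader.getResourceAsStream, so the file that is found depends on the
class loader passed in. Whether that is enough to keep web applications
isolated is exactly the open question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper illustrating the suggestion; not an existing Lucene class.
final class ClasspathVersionConfig {
    static final String RESOURCE = "lucene.properties"; // file name from the suggestion
    static final String KEY = "version";

    private ClasspathVersionConfig() {}

    /** Reads the version string from an already-open properties stream. */
    static String readVersion(InputStream in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(KEY);
        if (value == null) {
            throw new IllegalStateException("no '" + KEY + "' entry found");
        }
        return value.trim();
    }

    /** Resolves the resource through the given class loader. */
    static String readVersion(ClassLoader loader) {
        try (InputStream in = loader.getResourceAsStream(RESOURCE)) {
            if (in == null) {
                throw new IllegalStateException(RESOURCE + " not found on the classpath");
            }
            return readVersion(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```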

--tim


Re: Proposal about Version API "relaxation"

Posted by Michael Busch <bu...@gmail.com>.

I agree with Uwe.  We shouldn't use non-final public statics.

Thinking out loud:  Could IndexWriter/IndexReader propagate the Version 
to the downstream classes (e.g. IndexWriter to Analyzers, IndexReader to 
queries) if not previously explicitly set?

E.g. an IndexWriter calls setVersion on an analyzer before it uses it, 
which only has an effect if it wasn't set before by the Analyzer's 
constructor that takes a version?
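
A minimal sketch of that idea, under assumptions: PropVersion, PropAnalyzer
and PropWriter are hypothetical stand-ins and are not meant to mirror actual
Lucene signatures. The only behavior shown is "set if not already set".

```java
// Hypothetical sketch of version propagation; all names are illustrative.
enum PropVersion { LUCENE_30, LUCENE_31 }

class PropAnalyzer {
    private PropVersion version; // null until set explicitly or propagated

    PropAnalyzer() {}

    PropAnalyzer(PropVersion version) {
        this.version = version;
    }

    /** Takes effect only if no version was set before; otherwise a no-op. */
    void setVersion(PropVersion propagated) {
        if (version == null) {
            version = propagated;
        }
    }

    PropVersion getVersion() {
        return version;
    }
}

class PropWriter {
    private final PropAnalyzer analyzer;

    PropWriter(PropVersion version, PropAnalyzer analyzer) {
        this.analyzer = analyzer;
        // Propagate downstream before the analyzer is first used.
        analyzer.setVersion(version);
    }

    PropAnalyzer getAnalyzer() {
        return analyzer;
    }
}
```

An analyzer built with the no-arg constructor picks up the writer's version,
while one built with an explicit version keeps its own.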

  Michael

On 4/13/10 9:41 AM, Uwe Schindler wrote:
>
> Hi Shai,
>
> one of the problems I have is: That is a static default! We want to get
> rid of them (and mostly did; only some relics remain), so there
> are no plans to reimplement such a thing again. The worst one is
> BooleanQuery.maxClauseCount. The same applies to all types of
> sysprops. As Lucene and Solr are mostly running in servlet containers,
> this type of thing makes web applications no longer isolated. This is
> also a general contract for libraries: never ever rely on sysprops or
> statics.
>
> Uwe
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Wed, Apr 14, 2010 at 12:29 PM, Marvin Humphrey <ma...@rectangular.com> wrote:

> > I also am not sure whether it in the past we just missed/ignored more back
> > compatibility issues or whether now we are creating more back compat. issues
> > due to more rapid change.
>
> It would be hard to search the archives to confirm my recollection, but I seem
> to remember back compat for Analyzers coming up every once in a while -- say,
> in the context of modifying StandardAnalyzer's stoplist -- and changes not
> being made because they would change search results.

I think even things considered bugs were not actually "fixed" by default
because of this, until Version?

-- 
Robert Muir
rcmuir@gmail.com

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

+1, Thanks for this detailed explanation! In my apps I have no problem defining a static default myself. And passing this to every ctor is easy, so where is the problem? Look at Solr: since we introduced the version param to solrconfig, you have exactly that behavior, but it's limited to the Solr installation using that config. And you can still override.

Lucene is a library, not an application, so it's not Lucene's responsibility to handle such things. Configuration, and passing configuration objects around, is an application responsibility.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 04/14/2010 12:29 PM, Marvin Humphrey wrote:
> On Wed, Apr 14, 2010 at 08:30:14AM -0400, Grant Ingersoll wrote:
>    
>> The thing I keep going back to is that somehow Lucene has managed for years
>> (and I mean lots of years) w/o stuff like Version and all this massive back
>> compatibility checking.
>>      
> Non-constant global variables are an anti-pattern.
>    

I think clinging to such rules in the face of all situations is an 
anti-pattern :) I take it as a rule of thumb.

In regards to this discussion:

I agree that the Version stuff is a bit of a mess. I also agree that 
many users will want to just use one version across their app that is 
easy to change.

I disagree that we should allow that behavior by just using a 
constructor without the Version param - or that you would be forced to 
set the static Version setting by trying to run your app and seeing an 
exception happen. That is all a bit ugly.

Too many users will not understand Version or care to if they see they 
can skip passing it. IMO, you should have to specify that you are 
looking for this behavior. In which case, why not just specify it using 
the version param itself :) E.g. if a user wants to get this kind of 
static behavior, they can just choose to do it on their own, and pass 
their *own* static Version constant to all the constructors.

I don't think we need to go through this hassle and introduce a less 
than ideal solution just so that users can pass one less param - 
especially when I think you should explicitly choose this behavior 
rather than get it by ignoring the Version param.
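
As a rough illustration of the "pass your *own* static constant" approach,
here is a minimal sketch. AppVersion, AppAnalyzer and AppQueryParser are
hypothetical stand-ins for Lucene types so that the snippet compiles on its
own; AppSearchConfig is an application class, not something Lucene provides.

```java
// Stand-ins for Lucene types, so the sketch compiles on its own (hypothetical).
enum AppVersion { LUCENE_29, LUCENE_30 }

class AppAnalyzer {
    final AppVersion matchVersion;

    AppAnalyzer(AppVersion matchVersion) {
        this.matchVersion = matchVersion;
    }
}

class AppQueryParser {
    final AppVersion matchVersion;
    final AppAnalyzer analyzer;

    AppQueryParser(AppVersion matchVersion, AppAnalyzer analyzer) {
        this.matchVersion = matchVersion;
        this.analyzer = analyzer;
    }
}

/** Application-owned configuration: the only place the version is named. */
final class AppSearchConfig {
    // A constant chosen by the application, not a mutable library-wide default.
    static final AppVersion MATCH_VERSION = AppVersion.LUCENE_30;

    private AppSearchConfig() {}

    static AppAnalyzer newAnalyzer() {
        return new AppAnalyzer(MATCH_VERSION);
    }

    static AppQueryParser newQueryParser() {
        return new AppQueryParser(MATCH_VERSION, newAnalyzer());
    }
}
```

Upgrading then means changing one constant in the application, and every
constructor call still states the version explicitly.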

-- 
- Mark

http://www.lucidimagination.com


Re: Proposal about Version API "relaxation"

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Wed, Apr 14, 2010 at 08:30:14AM -0400, Grant Ingersoll wrote:
> The thing I keep going back to is that somehow Lucene has managed for years
> (and I mean lots of years) w/o stuff like Version and all this massive back
> compatibility checking.

Non-constant global variables are an anti-pattern.  Having a non-constant
global determine library behavior which results in silent failure (search
results degrade subtly, as opposed to e.g. an exception being thrown) is a
particularly insidious anti-pattern. 

In the Perl world, where modules are very heavily used thanks to CPAN, you're
more likely to come across the action-at-a-distance bugs spawned by this
anti-pattern.  I have direct experience debugging such usage of global vars.
It is extremely costly and frustrating.

For instance, there was one time when some module set the global variable
$YAML::Syck::ImplicitUnicode to a true value.  Whether or not that module was
loaded affected how YAML::Syck's Load() function would interpret character
data in completely unrelated portions of the code.  As with subtly degraded
search results, the result was silent failure (incorrect text stored in a
database).  It took many hours to hunt down what was going wrong because the
code that was causing the problem was nowhere near the code where the problem
manifested.  The authors of the affected code had done nothing wrong, aside
from using a poorly designed module like YAML::Syck.
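
The same failure mode can be shown in a few lines of Java. This is only an
analogy with hypothetical class names (TextLoader, UnrelatedModule,
InnocentCaller); it does not reproduce what YAML::Syck actually does.

```java
/** A library whose behavior depends on a mutable global (hypothetical). */
final class TextLoader {
    static boolean foldCase = false; // non-constant global

    private TextLoader() {}

    static String load(String raw) {
        return foldCase ? raw.toLowerCase(java.util.Locale.ROOT) : raw;
    }
}

/** Some unrelated module that flips the global as a side effect. */
final class UnrelatedModule {
    private UnrelatedModule() {}

    static void init() {
        TextLoader.foldCase = true;
    }
}

/** Code that did nothing wrong, yet now stores different data. */
final class InnocentCaller {
    private InnocentCaller() {}

    static String store(String raw) {
        return TextLoader.load(raw);
    }
}
```

Nothing in InnocentCaller changes, no exception is thrown, and yet its
result depends on whether UnrelatedModule.init() happened to run first.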

I am strongly opposed to using a global variable for versioning because I do
not wish to impose such maddening debugging sessions on a handful of unlucky
duckies who have done nothing wrong other than to choose Lucene as their
search engine library.  

This shouldn't be controversial.  The temptations of global variables are
obvious, but their flaws are well understood:

    http://www.google.com/search?q=global+variables+evil

It is to be expected that the global would work most of the time.  This design
flaw, by nature, disproportionately afflicts a small number of users with
action-at-a-distance bugs.  Knowingly choosing to impose such costs on a
random few is deeply unfair.

> I also am not sure whether it in the past we just missed/ignored more back
> compatibility issues or whether now we are creating more back compat. issues
> due to more rapid change.  

It would be hard to search the archives to confirm my recollection, but I seem
to remember back compat for Analyzers coming up every once in a while -- say,
in the context of modifying StandardAnalyzer's stoplist -- and changes not
being made because they would change search results.

Marvin Humphrey


Re: Proposal about Version API "relaxation"

Posted by Andi Vajda <va...@osafoundation.org>.

On Apr 14, 2010, at 7:45, Yonik Seeley <yo...@lucidimagination.com>  
wrote:

> On Wed, Apr 14, 2010 at 10:39 AM, DM Smith <dm...@gmail.com>  
> wrote:
>> Maybe have the index store the version(s) and use that when  
>> constructing a
>> reader or writer?
>
> That would cause a reindex to change behavior (among other problems).

If the index contained this information it could prevent mistakes  
where one adds documents or queries them with a different analyzer  
version setting than used when the index was created, leading to  
subtle bugs...

It seems to me, then, that the only time an analyzer version would be  
required is at index (re)creation time.
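
A minimal sketch of that idea, with everything in it hypothetical:
VersionedIndexSketch and the "analysis.version" key are made up for
illustration and say nothing about how a Lucene index actually stores
metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the metadata key and class are illustrative only.
final class VersionedIndexSketch {
    private static final String KEY = "analysis.version";
    private final Map<String, String> metadata = new HashMap<>();

    private VersionedIndexSketch() {}

    /** The version is required once, when the index is (re)created. */
    static VersionedIndexSketch create(String version) {
        VersionedIndexSketch index = new VersionedIndexSketch();
        index.metadata.put(KEY, version);
        return index;
    }

    /** Opening without a version picks up the recorded one. */
    String versionForOpen() {
        return metadata.get(KEY);
    }

    /** Opening with an explicit version rejects a mismatch up front. */
    String versionForOpen(String requested) {
        String stored = metadata.get(KEY);
        if (!stored.equals(requested)) {
            throw new IllegalStateException(
                "index was created with " + stored + " but " + requested + " was requested");
        }
        return stored;
    }
}
```

The check turns a silent mismatch into an immediate error; it does not
answer the objection that a reindex would change behavior.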

Andi..


Re: Proposal about Version API "relaxation"

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Wed, Apr 14, 2010 at 10:39 AM, DM Smith <dm...@gmail.com> wrote:
> Maybe have the index store the version(s) and use that when constructing a
> reader or writer?

That would cause a reindex to change behavior (among other problems).

-Yonik
Apache Lucene Eurocon 2010
18-21 May 2010 | Prague


Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/14/2010 09:13 AM, Robert Muir wrote:
> It's not sidetracked at all. There seem to be more compelling
> alternatives to achieve the same thing, so we should consider
> alternative solutions, too.
Maybe have the index store the version(s) and use that when constructing 
a reader or writer?
Given enough minor releases, it is likely that different analyzers would 
use different versions. So each feature would need to be represented.

>
> On Wed, Apr 14, 2010 at 8:54 AM, Earwin Burrfoot <earwin@gmail.com> wrote:
>
>     The thread somehow got sidetracked. So, let's get this carriage back
>     on its rails?
>
>     Let me remind - we have an API on hands that is mandatory and tends to
>     be cumbersome.
>     Proposed solution does indeed have ultrascary word "static" in it. But
>     if you brace yourself and look closer - the use of said static is
>     opt-in and heavily guarded.
>     So even a long-standing hater of everything static like me is tempted.
>
>     On Wed, Apr 14, 2010 at 16:30, Grant Ingersoll <gsingers@apache.org> wrote:
>     >
>     > On Apr 14, 2010, at 12:49 AM, Robert Muir wrote:
>     >
>     >> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>     >> <marvin@rectangular.com> wrote:
>     >> New class names would work, too.
>     >>
>     >> I only mention that for the sake of completeness, though -- it's not a
>     >> suggestion.
>     >>
>     >> Right, to me this is just as bad.
>     >> In my eyes, the Version thing really shows the problem with the
>     >> analysis stuff:
>     >> * Used by QueryParsers, etc at search and index time, with no real
>     >>   clean way to do back-compat
>     >> * Concepts like Version and class-naming push some of the burden to
>     >>   the user: users decide the back-compat level, but it still leaves
>     >>   devs with back-compat management hassle.
>     >>
>     >> The idea of having a real versioned-module is the same as Version and
>     >> class-naming, except it both pushes the burden to the user in a more
>     >> natural way (people are used to versioned jar files and things like
>     >> that... not Version constants), and it relieves devs of the back compat
>     >>
>     >> In all honesty with the current scheme, release schedules of Lucene,
>     >> and Lucene's policy, the analysis stuff will soon deadlock into being
>     >> nearly unmaintainable, and to many users, the API is already
>     >> unconsumable: its difficult to write reusable analyzers due to
>     >> historical relics in the API, methods are named inappropriately, e.g.
>     >> Tokenizer.reset(Reader) and TokenStream.reset(), they don't understand
>     >> Version, and probably a few other things I am forgetting that are
>     >> basically impossible to fix right now with the current state of affairs.
>     >
>     > The thing I keep going back to is that somehow Lucene has managed for
>     > years (and I mean lots of years) w/o stuff like Version and all this
>     > massive back compatibility checking.  I'm still undecided as to whether
>     > that is a good thing or not.  I also am not sure whether it in the past
>     > we just missed/ignored more back compatibility issues or whether now we
>     > are creating more back compat. issues due to more rapid change.  I
>     > agree, though, that all of this stuff is making it harder and harder to
>     > develop (and I don't mean for us committers, I mean for end consumers.)
>     >
>     > I also agree about Robert's point about the incorrectness of naming
>     > something 3.0 versus 3.1 when 3.1 is the thing that has all the new
>     > features and is really the "major" release.
>     >
>     > -Grant

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

It's not sidetracked at all. There seem to be more compelling alternatives to
achieve the same thing, so we should consider alternative solutions, too.


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

The thread somehow got sidetracked. So, let's get this carriage back
on its rails?

Let me remind you - we have an API on our hands that is mandatory and tends
to be cumbersome.
The proposed solution does indeed have the ultra-scary word "static" in it.
But if you brace yourself and look closer - the use of said static is
opt-in and heavily guarded.
So even a long-standing hater of everything static like me is tempted.
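
To make "opt-in and heavily guarded" concrete, here is a minimal sketch of
how such a static could look. It is not the actual Lucene Version class: the
enum below is a stand-in, SketchAnalyzer is a hypothetical consumer, and both
the exception type and the set-only-once guard are my assumptions rather than
part of the proposal.

```java
import java.util.concurrent.atomic.AtomicReference;

// Stand-in for Lucene's Version (hypothetical; constants chosen for illustration).
enum Version {
    LUCENE_29, LUCENE_30, LUCENE_31;

    // Starts empty on purpose: there is no built-in "default default".
    private static final AtomicReference<Version> DEFAULT = new AtomicReference<>();

    // Assumed guard: the default may be set exactly once per JVM.
    static void setDefault(Version v) {
        if (v == null) {
            throw new NullPointerException("version must not be null");
        }
        if (!DEFAULT.compareAndSet(null, v)) {
            throw new IllegalStateException("default Version already set to " + DEFAULT.get());
        }
    }

    // Fails loudly if the application never opted in (exception type is an assumption).
    static Version getDefault() {
        Version v = DEFAULT.get();
        if (v == null) {
            throw new IllegalStateException("no default Version; call Version.setDefault first");
        }
        return v;
    }
}

// Hypothetical consumer exposing both flavours of the API.
final class SketchAnalyzer {
    final Version matchVersion;

    // Existing style: the caller passes Version explicitly; the static is never touched.
    SketchAnalyzer(Version matchVersion) {
        this.matchVersion = matchVersion;
    }

    // Relaxed style: falls back to the application-wide default.
    SketchAnalyzer() {
        this(Version.getDefault());
    }
}
```

An application that doesn't want to pass Version around would call
Version.setDefault(...) once at startup and use the no-arg constructors;
an application that never calls setDefault keeps passing Version explicitly
and is unaffected by the static.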


On Wed, Apr 14, 2010 at 16:30, Grant Ingersoll <gs...@apache.org> wrote:
>
> On Apr 14, 2010, at 12:49 AM, Robert Muir wrote:
>
>>
>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
>> New class names would work, too.
>>
>> I only mention that for the sake of completeness, though -- it's not a
>> suggestion.
>>
>> Right, to me this is just as bad.
>> In my eyes, the Version thing really shows the problem with the analysis stuff:
>> * Used by QueryParsers, etc at search and index time, with no real clean way to do back-compat
>> * Concepts like Version and class-naming push some of the burden to the user: users decide the back-compat level, but it still leaves devs with back-compat management hassle.
>>
>> The idea of having a real versioned-module is the same as Version and class-naming, except it both pushes the burden to the user in a more natural way (people are used to versioned jar files and things like that... not Version constants), and it relieves devs of the back compat nightmare.
>>
>> In all honesty with the current scheme, release schedules of Lucene, and Lucene's policy, the analysis stuff will soon deadlock into being nearly unmaintainable, and to many users, the API is already unconsumable: its difficult to write reusable analyzers due to historical relics in the API, methods are named inappropriately, e.g. Tokenizer.reset(Reader) and TokenStream.reset(), they don't understand Version, and probably a few other things I am forgetting that are basically impossible to fix right now with the current state of affairs.
>
>
> The thing I keep going back to is that somehow Lucene has managed for years (and I mean lots of years) w/o stuff like Version and all this massive back compatibility checking.  I'm still undecided as to whether that is a good thing or not.  I also am not sure whether it in the past we just missed/ignored more back compatibility issues or whether now we are creating more back compat. issues due to more rapid change.  I agree, though, that all of this stuff is making it harder and harder to develop (and I don't mean for us committers, I mean for end consumers.)
>
> I also agree about Robert's point about the incorrectness of naming something 3.0 versus 3.1 when 3.1 is the thing that has all the new features and is really the "major" release.
>
> -Grant
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

On Apr 14, 2010, at 12:49 AM, Robert Muir wrote:

> 
> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey <ma...@rectangular.com> wrote:
> New class names would work, too.
> 
> I only mention that for the sake of completeness, though -- it's not a
> suggestion.
> 
> Right, to me this is just as bad. 
> In my eyes, the Version thing really shows the problem with the analysis stuff:
> * Used by QueryParsers, etc at search and index time, with no real clean way to do back-compat
> * Concepts like Version and class-naming push some of the burden to the user: users decide the back-compat level, but it still leaves devs with back-compat management hassle.
> 
> The idea of having a real versioned-module is the same as Version and class-naming, except it both pushes the burden to the user in a more natural way (people are used to versioned jar files and things like that... not Version constants), and it relieves devs of the back compat nightmare.
> 
> In all honesty with the current scheme, release schedules of Lucene, and Lucene's policy, the analysis stuff will soon deadlock into being nearly unmaintainable, and to many users, the API is already unconsumable: its difficult to write reusable analyzers due to historical relics in the API, methods are named inappropriately, e.g. Tokenizer.reset(Reader) and TokenStream.reset(), they don't understand Version, and probably a few other things I am forgetting that are basically impossible to fix right now with the current state of affairs.

The thing I keep going back to is that somehow Lucene has managed for years (and I mean lots of years) w/o stuff like Version and all this massive back compatibility checking.  I'm still undecided as to whether that is a good thing or not.  I also am not sure whether it in the past we just missed/ignored more back compatibility issues or whether now we are creating more back compat. issues due to more rapid change.  I agree, though, that all of this stuff is making it harder and harder to develop (and I don't mean for us committers, I mean for end consumers.)

I also agree about Robert's point about the incorrectness of naming something 3.0 versus 3.1 when 3.1 is the thing that has all the new features and is really the "major" release.

-Grant

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Wed, Apr 14, 2010 at 2:49 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> > And 2.9's backwards compatibility layer in
> > TokenStream
> > was significantly slower.
>
> I protest! No, it was not slower, only at the beginning because of missing
> reflection caching! But this also affected the *new* API. With 2.9.x and old
> TokenStreams there is no speed difference, really.
>

But it wasn't like this initially. It only got there after you put even more
work into the backwards compatibility layer, after discovering performance
issues with Solr, all of it happening in a minor release that carried major
changes.

I guess Marvin is hinting that perhaps major changes could be associated
with major versions. For that example, perhaps more time could instead have
been spent upgrading Solr tokenstreams so it could move to 3.0 (rather
than almost a year later).

And I do think it's a good example: you put a ton of work into this, but not
all the backwards compatibility can be done like this, and what if somehow
this one had slipped through without this caching? I think most users would
consider it strange to experience a performance degradation, caused by major
changes, in a minor release...

-- 
Robert Muir
rcmuir@gmail.com

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

> And 2.9's backwards compatibility layer in
> TokenStream
> was significantly slower.

I protest! No, it was not slower, only at the beginning because of missing reflection caching! But this also affected the *new* API. With 2.9.x and old TokenStreams there is no speed difference, really.

Uwe



Re: Proposal about Version API "relaxation"

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Wed, Apr 14, 2010 at 12:49:52AM -0400, Robert Muir wrote:

> its very unnatural for release 3.0 to be almost a no-op and for release 3.1
> to provide a new default index format and support for customizing how the
> index is stored. And now we are looking at providing flexibility in scoring
> that will hopefully redefine lucene from being a vector-space search engine
> library to something much more flexible?  This is a minor release?!

I agree, but what really bothers me are the X.9 releases.  

2.9 changed performance characteristics dramatically enough that it was a
backwards-break in all but name for many users -- most prominently, Solr[1].
Solr's FieldCache RAM requirements doubled because of the transition to
per-segment search.  And 2.9's backwards compatibility layer in TokenStream
was significantly slower.

In my opinion, the transition to per-segment search and new-style TokenStreams
should have triggered a major version break.  Had that been the case, less
effort could have been spent on backwards compatibility shims and fewer API
design compromises would have been necessary.

To avoid such costs in the future, and to communicate disruptions in the
library to users via version numbers more accurately...

  * There should not be a Lucene 3.9.  
  * Lucene 4.0 should do more than remove deprecations.

Marvin Humphrey

[1] Thanks to Robert and Mark Miller for reminding me just what the
    Solr/Lucene-2.9 problems were via IRC.


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey <ma...@rectangular.com> wrote:

> New class names would work, too.
>
> I only mention that for the sake of completeness, though -- it's not a
> suggestion.
>

Right, to me this is just as bad.
In my eyes, the Version thing really shows the problem with the analysis
stuff:
* Used by QueryParsers, etc at search and index time, with no real clean way
to do back-compat
* Concepts like Version and class-naming push some of the burden to the
user: users decide the back-compat level, but it still leaves devs with
back-compat management hassle.

The idea of having a real versioned-module is the same as Version and
class-naming, except it both pushes the burden to the user in a more natural
way (people are used to versioned jar files and things like that... not
Version constants), and it relieves devs of the back compat nightmare.

In all honesty, with the current scheme, the release schedules of Lucene, and
Lucene's policy, the analysis stuff will soon deadlock into being nearly
unmaintainable, and to many users the API is already unconsumable: it's
difficult to write reusable analyzers due to historical relics in the API,
methods are named inappropriately (e.g. Tokenizer.reset(Reader) and
TokenStream.reset()), users don't understand Version, and there are probably
a few other things I am forgetting that are basically impossible to fix right
now with the current state of affairs.

> I'm a little concerned about the issue DM Smith brought up: what happens
> when
> you have separate applications within the same JVM which have built indexes
> using separate versions of an Analyzer?

> That use case is supported under the current regime, but I'm not sure
> whether
> it would be with aggressively versioned Analyzer packages.  If it's not,
> under
> what circumstances does that matter?
>

I think this is an advanced use case. No offense to DM, but for every
advanced use case on java-dev like his, there are 100 people on java-user
who don't have to juggle independently versioned indexes with different
Analyzer versions within the same JVM. I think we should look at back-compat
reasonably, and at the end of the day, it's an open source project, so if
there's some extreme advanced use case someone can do a few Eclipse renames
themselves.

> Well, for Lucy, I think we may have addressed this problem with the new
> back
> compat policy we're auditioning with KS:
>
>    KinoSearch spins off stable forks into new namespaces periodically. As
> of
>    this release, the latest is "KinoSearch1", forked from version 0.165.
>    Users who require strong backwards compatibility should use a stable
> fork.
>
>    The main namespace, "KinoSearch", is an unstable development branch (as
>    hinted at by its version number). Superficial API changes are frequent.
>    Hard file format compatibility breaks which require reindexing are rare,
>    as we generally try to provide continuity across multiple releases, but
>    they happen every once in a while.
>

This is a whole lot larger issue (the concept of stable or release forks and
having a trunk that allows for quicker development) and it's definitely
interesting. We spend a lot of time on backwards compatibility, but to take
advantage of many new features (for example, faster speed with release 3.1
rather than using flex-emulation APIs) you need to reindex anyway. I just
think analysis is really the worst case, with not many other mechanisms for
back-compat, so it's especially nasty.

> Hmm, I suppose that doesn't work with the convention that the only
> difference
> between Lucene X.9 and Lucene Y.0 is the removal of deprecations.  But if
> anything is crying out for a rethink in the Lucene back compat policy, IMO
> that's it: make major version breaks act like major version breaks and
> change
> stuff that needs changin'.
>

This brings up a great point: it's very unnatural for release 3.0 to be
almost a no-op and for release 3.1 to provide a new default index format and
support for customizing how the index is stored. And now we are looking at
providing flexibility in scoring that will hopefully redefine Lucene from
being a vector-space search engine library to something much more flexible?
This is a minor release?!

I definitely think we should rethink things.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Chris Male <ge...@gmail.com>.

On Wed, Apr 14, 2010 at 11:22 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> <ma...@rectangular.com> wrote:
>
> > Essentially, we're free to break back compat within "Lucy" at any time,
> but
> > we're not able to break back compat within a stable fork like "Lucy1",
> > "Lucy2", etc.  So what we'll probably do during normal development with
> > Analyzers is just change them and note the break in the Changes file.
>
> So... what if we change up how we develop and release Lucene:
>
>  * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
>
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>
> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
> if any devs have the itch, they could freely back-port improvements
> from trunk as long as they kept back-compat within the branch.
>
> I think in such a future world, we could:
>
>  * Remove Version entirely!
>
>  * Not worry at all about back-compat when developing on trunk
>
>  * Give proper names to new improved classes instead of
>    StandardAnalzyer2, or SmartStandardAnalyzer, that we end up doing
>    today; rename existing classes.
>
>  * Let analyzers freely, incrementally improve
>
>  * Use interfaces without fear
>
>  * Stop spending the truly substantial time (look @ Uwe's awesome
>    back-compat layer for analyzers!) that we now must spend when
>    adding new features, for back-compat
>
>  * Be more free to introduce very new not-fully-baked features/APIs,
>    marked as experimental, on the expectation that once they are used
>    (in trunk) they will iterate/change/improve vs trying so hard to
>    get things right on the first go for fear of future back compat
>    horrors.
>
> Thoughts...?
>

+1
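
A minimal sketch of that rule as I read it: full back compat within a
branch, none across major releases. ReleaseVersion and canRead are
hypothetical names, not Lucene classes, and treating "back compat within a
branch" as "a newer minor reads what an older minor of the same major wrote"
is my assumption.

```java
// Hypothetical value type for a release number such as 3.1 (not a Lucene class).
final class ReleaseVersion {
    final int major;
    final int minor;

    ReleaseVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // True if this release can read an index written by 'writer':
    // same major branch, and this release is not older than the writer.
    boolean canRead(ReleaseVersion writer) {
        return major == writer.major && minor >= writer.minor;
    }
}
```

Under this rule 3.1 reads a 3.0 index, but neither 3.x release reads a
2.9 index without some conversion step.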


>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


-- 
Chris Male | Software Developer | JTeam BV.| www.jteam.nl

Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

I realize that just transforming an old index won't give me anything new.

Applications usually evolve.

Let's take 2.9 as an example (relatively few changes in index structure, but
Trie was a nice addition, and per-segment search and reload were a blessing):
- There are 4 billion documents which don't have numeric ranges (but those
still got faster reopen)
- But for the next 1 billion documents in another index I do have numeric
ranges.

The whole application works in ONE environment from the same codebase.

Splitting it into several environments based on whatever version of Lucene
happened to be current at index creation date,
and maintaining branches of code would be quite a PITA for a developer (and
very error prone).

So yeah, I won't get new features for old indexes if I transform them to the
new format, but new indexes will be able to use them.
And my application as a whole will be much cleaner and easier to maintain
(I'm a lazy developer who thinks that he is already overworked).

I just want my system as a whole to evolve together with Lucene without
dropping the indexes I already have,
and without keeping tens of branches of code and remembering how things worked
back in 2005 to slightly modify the analyzer because data in 2010 changed a bit.

Danil.

On Thu, Apr 15, 2010 at 15:56, Robert Muir <rc...@gmail.com> wrote:

> I think you guys miss the entire point.
>
> The idea that you can keep getting "all the new features" without
> reindexing is merely an illusion
>
> Instead, features simply aren't being added at all, because the policy
> makes it too cumbersome.
>
> Why is it problematic to have a different SVN branch/release, with lots of
> new features, but requires you to reindex and change your app?
>
> If its too difficult to reindex, it doesnt break your app that features
> exist elsewhere that you cannot access.
> Its the same as it is today, there are features you cannot access, except
> they do not even exist in apache SVN at all, even trunk, because of these
> problems.
>
> On Thu, Apr 15, 2010 at 8:42 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>
>> I like the idea of index conversion tool over silent online upgrade
>> because it is
>> 1. controllable - with online upgrade you never know for sure when
>> your index is completely upgraded, even optimize() won't help here, as
>> it is a noop for already-optimized indexes
>> 2. way easier to write - as flex shows, index format changes are
>> accompanied by API changes. Here you don't have to emulate new APIs
>> over old structures (can be impossible for some cases?), you only have
>> to, well, convert.
>>
>> On Thu, Apr 15, 2010 at 16:32, Danil ŢORIN <to...@gmail.com> wrote:
>> > All I ask is a way to migrate existing indexes to newer format.
>> >
>> >
>> > On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:
>> >>
>> >> its open source, if you feel this way, you can put the work to add
>> >> features to some version branch from trunk in a backwards compatible
>> way.
>> >> Then this branch can have a backwards-compatible minor release with new
>> >> features, but nothing ground-breaking.
>> >> but this kinda stuff shouldnt hinder development on trunk.
>> >>
>> >> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com>
>> wrote:
>> >>>
>> >>> Sometimes it's REALLY impossible to reindex, or has absolutely
>> >>> prohibitive cost to do in a running production system (i can't shut it
>> down
>> >>> for maintainance, so i need a lot of hardware to reindex ~5 billion
>> >>> documents, i have no idea what are the costs to retrieve that data all
>> over
>> >>> again, but i estimate it to be quite a lot)
>> >>> And providing a way to migrate existing indexes to new lucene is
>> crucial
>> >>> from my point of view.
>> >>> I don't care what this way is: calling optimize() with newer lucene or
>> >>> running some tool that takes 5 days, it's ok with me.
>> >>> Just don't put me through full reindexing as I really don't have all
>> that
>> >>> data anymore.
>> >>> It's not my data, i just receive it from clients, and provide a search
>> >>> interface.
>> >>> It took years to build those indexes, rebuilding is not an option, and
>> >>> staying with old lucene forever just sucks.
>> >>>
>> >>> Danil.
>> >>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>> >>>>
>> >>>>
>> >>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> Well ... I must say that I completely disagree w/ dropping index
>> >>>>> structure back-support. Our customers will simply not hear of
>> reindexing 10s
>> >>>>> of TBs of content because of version upgrades. Such a decision is
>> key to
>> >>>>> Lucene adoption in large-scale projects. It's entirely not about
>> whether
>> >>>>> Lucene is a content store or not - content is stored on other
>> systems, I
>> >>>>> agree. But that doesn't mean reindexing it is tolerable.
>> >>>>>
>> >>>>
>> >>>> I don't understand how its helpful to do a MAJOR version upgrade
>> without
>> >>>> reindexing... what in the world do you stand to gain from that?
>> >>>> The idea here, is that development can be free of such hassles.
>> >>>> Development should be this way.
>> >>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want
>> to
>> >>>> reindex, and are willing to do the work to port it back to Version 3
>> in a
>> >>>> completely backwards compatible way, then under this new scheme it
>> can
>> >>>> happen.
>> >>>>
>> >>>> --
>> >>>> Robert Muir
>> >>>> rcmuir@gmail.com
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Robert Muir
>> >> rcmuir@gmail.com
>> >
>> >
>>
>>
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Well ... I could argue that it's you who miss the point :).

I completely don't buy the "all the new features" comment --> how many new
features are in a major release which force you to consider reindexing? Yet
there are many of them that change the API. How will I know whether a
release supports my index or not? Why do I need to work hard to back-port
all the newly developed issues onto a branch I use? How many of those branches
will exist? Will they all run nightly unit tests? Can I cut a release of
such a branch myself? Or will I need the PMC or a VOTE? This will get
complicated pretty fast ...

Lucene is not a "do it yourself" kit - we try so hard to have the best
defaults, best out of the box experience ... best everything for our users.
Even w/ Analyzers we try so damn hard, while we could have simply
componentized everything and told the users "you can use those filters,
tokenizers, segment mergers, policies etc. to make up your indexing
application" ...

And I don't think there are features out there that exist and are not
contributed because people are afraid of the index format changes ...
obviously if they have done it, they're past the fear of handling the index
format ... I'd like to hear of one such feature. I'd bet there are such out
there that are not contributed for IP, Business and Laziness reasons.

Shai

On Thu, Apr 15, 2010 at 3:56 PM, Robert Muir <rc...@gmail.com> wrote:

> I think you guys miss the entire point.
>
> The idea that you can keep getting "all the new features" without
> reindexing is merely an illusion
>
> Instead, features simply aren't being added at all, because the policy
> makes it too cumbersome.
>
> Why is it problematic to have a different SVN branch/release, with lots of
> new features, but requires you to reindex and change your app?
>
> If its too difficult to reindex, it doesnt break your app that features
> exist elsewhere that you cannot access.
> Its the same as it is today, there are features you cannot access, except
> they do not even exist in apache SVN at all, even trunk, because of these
> problems.
>
> On Thu, Apr 15, 2010 at 8:42 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>
>> I like the idea of index conversion tool over silent online upgrade
>> because it is
>> 1. controllable - with online upgrade you never know for sure when
>> your index is completely upgraded, even optimize() won't help here, as
>> it is a noop for already-optimized indexes
>> 2. way easier to write - as flex shows, index format changes are
>> accompanied by API changes. Here you don't have to emulate new APIs
>> over old structures (can be impossible for some cases?), you only have
>> to, well, convert.
>>
>> On Thu, Apr 15, 2010 at 16:32, Danil ŢORIN <to...@gmail.com> wrote:
>> > All I ask is a way to migrate existing indexes to newer format.
>> >
>> >
>> > On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:
>> >>
>> >> its open source, if you feel this way, you can put the work to add
>> >> features to some version branch from trunk in a backwards compatible
>> way.
>> >> Then this branch can have a backwards-compatible minor release with new
>> >> features, but nothing ground-breaking.
>> >> but this kinda stuff shouldnt hinder development on trunk.
>> >>
>> >> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com>
>> wrote:
>> >>>
>> >>> Sometimes it's REALLY impossible to reindex, or has absolutely
>> >>> prohibitive cost to do in a running production system (i can't shut it
>> down
>> >>> for maintainance, so i need a lot of hardware to reindex ~5 billion
>> >>> documents, i have no idea what are the costs to retrieve that data all
>> over
>> >>> again, but i estimate it to be quite a lot)
>> >>> And providing a way to migrate existing indexes to new lucene is
>> crucial
>> >>> from my point of view.
>> >>> I don't care what this way is: calling optimize() with newer lucene or
>> >>> running some tool that takes 5 days, it's ok with me.
>> >>> Just don't put me through full reindexing as I really don't have all
>> that
>> >>> data anymore.
>> >>> It's not my data, i just receive it from clients, and provide a search
>> >>> interface.
>> >>> It took years to build those indexes, rebuilding is not an option, and
>> >>> staying with old lucene forever just sucks.
>> >>>
>> >>> Danil.
>> >>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>> >>>>
>> >>>>
>> >>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> Well ... I must say that I completely disagree w/ dropping index
>> >>>>> structure back-support. Our customers will simply not hear of
>> reindexing 10s
>> >>>>> of TBs of content because of version upgrades. Such a decision is
>> key to
>> >>>>> Lucene adoption in large-scale projects. It's entirely not about
>> whether
>> >>>>> Lucene is a content store or not - content is stored on other
>> systems, I
>> >>>>> agree. But that doesn't mean reindexing it is tolerable.
>> >>>>>
>> >>>>
>> >>>> I don't understand how its helpful to do a MAJOR version upgrade
>> without
>> >>>> reindexing... what in the world do you stand to gain from that?
>> >>>> The idea here, is that development can be free of such hassles.
>> >>>> Development should be this way.
>> >>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want
>> to
>> >>>> reindex, and are willing to do the work to port it back to Version 3
>> in a
>> >>>> completely backwards compatible way, then under this new scheme it
>> can
>> >>>> happen.
>> >>>>
>> >>>> --
>> >>>> Robert Muir
>> >>>> rcmuir@gmail.com
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Robert Muir
>> >> rcmuir@gmail.com
>> >
>> >
>>
>>
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

I think you guys miss the entire point.

The idea that you can keep getting "all the new features" without reindexing
is merely an illusion.

Instead, features simply aren't being added at all, because the policy makes
it too cumbersome.

Why is it problematic to have a different SVN branch/release, with lots of
new features, that requires you to reindex and change your app?

If it's too difficult to reindex, it doesn't break your app that features
exist elsewhere that you cannot access.
It's the same as it is today: there are features you cannot access, except
that today they do not even exist in Apache SVN at all, even trunk, because
of these problems.

On Thu, Apr 15, 2010 at 8:42 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> I like the idea of index conversion tool over silent online upgrade
> because it is
> 1. controllable - with online upgrade you never know for sure when
> your index is completely upgraded, even optimize() won't help here, as
> it is a noop for already-optimized indexes
> 2. way easier to write - as flex shows, index format changes are
> accompanied by API changes. Here you don't have to emulate new APIs
> over old structures (can be impossible for some cases?), you only have
> to, well, convert.
>
> On Thu, Apr 15, 2010 at 16:32, Danil ŢORIN <to...@gmail.com> wrote:
> > All I ask is a way to migrate existing indexes to newer format.
> >
> >
> > On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:
> >>
> >> its open source, if you feel this way, you can put the work to add
> >> features to some version branch from trunk in a backwards compatible
> way.
> >> Then this branch can have a backwards-compatible minor release with new
> >> features, but nothing ground-breaking.
> >> but this kinda stuff shouldnt hinder development on trunk.
> >>
> >> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com>
> wrote:
> >>>
> >>> Sometimes it's REALLY impossible to reindex, or has absolutely
> >>> prohibitive cost to do in a running production system (i can't shut it
> down
> >>> for maintainance, so i need a lot of hardware to reindex ~5 billion
> >>> documents, i have no idea what are the costs to retrieve that data all
> over
> >>> again, but i estimate it to be quite a lot)
> >>> And providing a way to migrate existing indexes to new lucene is
> crucial
> >>> from my point of view.
> >>> I don't care what this way is: calling optimize() with newer lucene or
> >>> running some tool that takes 5 days, it's ok with me.
> >>> Just don't put me through full reindexing as I really don't have all
> that
> >>> data anymore.
> >>> It's not my data, i just receive it from clients, and provide a search
> >>> interface.
> >>> It took years to build those indexes, rebuilding is not an option, and
> >>> staying with old lucene forever just sucks.
> >>>
> >>> Danil.
> >>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
> >>>>
> >>>>
> >>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
> >>>>>
> >>>>> Well ... I must say that I completely disagree w/ dropping index
> >>>>> structure back-support. Our customers will simply not hear of
> reindexing 10s
> >>>>> of TBs of content because of version upgrades. Such a decision is key
> to
> >>>>> Lucene adoption in large-scale projects. It's entirely not about
> whether
> >>>>> Lucene is a content store or not - content is stored on other
> systems, I
> >>>>> agree. But that doesn't mean reindexing it is tolerable.
> >>>>>
> >>>>
> >>>> I don't understand how it's helpful to do a MAJOR version upgrade
> >>>> without reindexing... what in the world do you stand to gain from that?
> >>>> The idea here is that development can be free of such hassles.
> >>>> Development should be this way.
> >>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
> >>>> reindex, and are willing to do the work to port it back to Version 3 in
> >>>> a completely backwards compatible way, then under this new scheme it
> >>>> can happen.
> >>>>
> >>>> --
> >>>> Robert Muir
> >>>> rcmuir@gmail.com
> >>>
> >>
> >>
> >>
> >> --
> >> Robert Muir
> >> rcmuir@gmail.com
> >
> >
>
>
>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Erick Erickson <er...@gmail.com>.

'Cause some exec finally noticed the product was losing market share.
Or got a wild hair strategically placed. My point is only that
we should be clear that some number of Lucene users *will* be in such
a position.

I'm actually fine with a decision that we're not going to support such
a scenario, but let's be clear that that's the decision we're making.

And corporate competence aside, there's still licensing that may prevent
me archiving the raw data....

Erick

On Thu, Apr 15, 2010 at 10:20 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> I think the need to upgrade to latest and greatest lucene for poor
> corporate users that lost all their data is somewhat overblown.
> Why the heck do you need to upgrade if your app rotted in neglect for
> years??

Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

The app is not rotted, it's alive and kicking, and gets a lot of TLC.

There are some older indexes that use some features and there are
newer indexes that will benefit greatly from newer features.
All running in one freaking big distributed application.

Leveraging the Lucene Version by moving to a newer Lucene for new indexes,
and changing the analyzer chain of old indexes in a way that doesn't affect
(too much) the search results they used to get,
is the logical way forward from my point of view.

I only ask for a tool to convert from old lucene format to new one.
I don't expect magic to happen, but give me the possibility to go
forward and let me worry about backward compatibility of search
results.

On Thu, Apr 15, 2010 at 17:20, Earwin Burrfoot <ea...@gmail.com> wrote:
> I think the need to upgrade to latest and greatest lucene for poor
> corporate users that lost all their data is somewhat overblown.
> Why the heck do you need to upgrade if your app rotted in neglect for years??

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I think the need to upgrade to latest and greatest lucene for poor
corporate users that lost all their data is somewhat overblown.
Why the heck do you need to upgrade if your app rotted in neglect for years??




-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Erick Erickson <er...@gmail.com>.

I never said finding oneself in this position was the result of careful
planning and flawless execution <G>. But that's the reality some of
our users will find themselves in.

Even worse... *I* may find myself in that position because of a decision
someone
*else* made before they were fired.....

Erick

On Thu, Apr 15, 2010 at 10:18 AM, Mark Miller <ma...@gmail.com> wrote:

> If you absolutely cannot re-index, and you have *no* access to the data
> again - you are one ballsy mofo to upgrade to a new version of Lucene for
> "features". It means you likely BASE jump in your free time?

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

If you absolutely cannot re-index, and you have *no* access to the data 
again - you are one ballsy mofo to upgrade to a new version of Lucene 
for "features". It means you likely BASE jump in your free time?



-- 
- Mark

http://www.lucidimagination.com





Re: Proposal about Version API "relaxation"

Posted by Erick Erickson <er...@gmail.com>.

Coming in late to the discussion, and without really understanding the
underlying Lucene issues, but...

The size of the problem of reindexing is under-appreciated, I think.
Somewhere in my company is the original data I indexed. But the effort it
would take to resurrect it is O(unknown). An unfortunate reality of
commercial products is that they often receive very little love for extended
periods of time until, all of a sudden, more work is required. There ensues
an extended period of re-orientation, even if the people who originally
worked on the project are still around.

*Assuming* the data is available to reindex (and there are many reasons
besides poor practice on the part of the company that it may not be),
remembering/finding out exactly which of the various backups you made
of the original data is the one that's actually in your product can be
highly non-trivial. This is compounded by the fact that the product manager
will be adamant about "Do NOT surprise our customers".

So being in the spot of saying "I *think* I have the original data set, and
I *think* I have the original code used to index it, and if I get a new
version of Lucene I *think* I can recreate the index and I *think* that the
user will see the expected change. After all that effort is completed, I
*think* we'll see the expected changes, but we won't know until we try it"
puts me in a very precarious position.

This assumes that I have a reasonable chance of getting the original data.
But say I've been indexing data from a live feed. Sure as hell hope I stored
the data somewhere, because going back to the source and saying "please
resend me 10 years' worth of data that I have in my index" is...er...hard.
Or say that the original provider has gone out of business, or the licensing
arrangement specifies a one-time transmission of data that may not be
retained in its original form or.....

The point of this long diatribe is that there are many reasons why
reindexing is impossible and/or impractical. Making any decision that
requires reindexing for a new version is locking a user into a version
potentially forever. We should not underestimate how painful that can be and
should never think that "just reindex" is acceptable in all situations.
It's not. Period.

Be very clear that some number of Lucene users will absolutely not be able
to reindex. We may still make a decision that requires this, but let's make
it without deluding ourselves that it's a possible solution for everyone.

So an upgrade tool seems like a reasonable compromise. I agree that being
hampered in what we can develop in Lucene by having to accommodate
reading old indexes slows down new features etc. It's always nice to be
able to work without dealing with pesky legacy issues <G>. Perhaps
splitting out the indexing upgrades into a separate program lets us
accommodate both concerns.

FWIW
Erick

On Thu, Apr 15, 2010 at 9:42 AM, Danil ŢORIN <to...@gmail.com> wrote:

> True. Just need the tool.
>
> On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot <ea...@gmail.com> wrote:
> >
> > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
> > > Seamless online upgrades have their place too... say you are upgrading
> > > one server at a time in a cluster.
> >
> > Nothing here that can't be solved with an upgrade tool. Down one
> > server, upgrade index, upgrade software, up.
>

Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

True. Just need the tool.

On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot <ea...@gmail.com> wrote:
>
> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
> > Seamless online upgrades have their place too... say you are upgrading
> > one server at a time in a cluster.
>
> Nothing here that can't be solved with an upgrade tool. Down one
> server, upgrade index, upgrade software, up.
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

but seriously... are you moving across major lucene releases every single
day?

if you are using 3.x, how does it hurt you if there is a version 4.x that
you can't use without re-indexing?

why wouldn't you just stay on your stable branch (say 3.x)?

2010/4/15 jm <jm...@gmail.com>

> Not sure if plain users are allowed/encouraged to post in this list,
> but wanted to mention (just an opinion from a happy user), as other
> users have, that not all of us can reindex just like that. It would
> not be 10 min for one of our installations for sure...
>
> First, i would need to implement some code to reindex, cause my source
> data is postprocessed/compressed/encrypted/moved after it arrives to
> the application, so I would need to retrieve all etc. And then
> reindexing it would take days.
> javier
>
> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
> >> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
> >> manual migration on the segments that are still on old versions.
> >> That's not the point about whether optimize() is good or not. It is
> >> the difference between telling the customer to run a 5-day migration
> >> process, or a couple of hours. At the end of the day, the same
> >> migration code will need to be written whether for the manual or
> >> automatic case. And probably by the same developer which changed the
> >> index format. It's the difference of when does it happen.
> >
> > Converting stuff is easier than emulating, that's exactly why I want a
> > separate tool.
> > There's no need to support cross-version merging, nor to emulate old APIs.
> >
> > I also don't understand why offline migration is going to take days
> > instead of hours for online migration??
> > WTF, it's gonna be even faster, as it doesn't have to merge things.
> >
> > --
> > Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> > Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> > ICQ: 104465785
> >


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> Not sure if plain users are allowed/encouraged to post in this list,
> but wanted to mention (just an opinion from a happy user), as other
> users have, that not all of us can reindex just like that. It would
> not be 10 min for one of our installations for sure...
>
> First, i would need to implement some code to reindex, cause my source
> data is postprocessed/compressed/encrypted/moved after it arrives to
> the application, so I would need to retrieve all etc. And then
> reindexing it would take days.
> javier

There's absolutely no way (zero, nada) to use a modified/fixed analyzer
stack without reindexing.
If you want it, reindex; if you don't, stick with the stable branch.

If your stack is unchanged but the index format changes, upgrade it
with the proposed tool and be happy.
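
As a rough sketch, the rule of thumb above can be written as a small Python
function. The function name and the returned strings are made up for this
illustration and do not correspond to anything in Lucene.

```python
def upgrade_action(analyzer_changed: bool, index_format_changed: bool) -> str:
    """Pick the upgrade path suggested in the message above."""
    if analyzer_changed:
        # A changed analysis chain can produce different tokens, so the
        # old postings no longer match what queries will look for.
        return "reindex"
    if index_format_changed:
        # Same tokens, different on-disk layout: a conversion is enough.
        return "run upgrade tool"
    return "nothing to do"


print(upgrade_action(analyzer_changed=False, index_format_changed=True))
```

If the analyzer changed and reindexing is not an option, the remaining
choice described above is simply to stay on the stable branch.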

Speaking as a happy plain user, whose indexes take two days to be
fully rebuilt and who does it (though not always full) at least once a
month.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

I wish we could have a face to face talk like in the evenings at ApacheCon :(

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Grant Ingersoll [mailto:gsiasf@gmail.com] On Behalf Of Grant
> Ingersoll
> Sent: Thursday, April 15, 2010 9:46 PM
> To: java-dev@lucene.apache.org
> Subject: Re: Proposal about Version API "relaxation"
> 
> From IRC:
> "why do I get the feeling that everyone is in "heated agreement" on the
> Version thread?
> there are some cases that mean people will have to reindex
> in those cases, we should tell people they will have to reindex
> then they can decide to upgrade or not
> all other cases, just do the sensible thing and test first
> I have yet to meet anyone who simply drops a new version into
> production and says go"
> 
> So, as I said earlier, why don't we just move forward with it, strive
> to support reading X-1 index format in X and let the user know the
> cases in which they will have to re-index. If a migration tool is
> necessary, then someone can write it at the appropriate time.  Just as
> was said w/ the Solr merge, it's software.  If it doesn't work, we can
> change it.  Thank goodness we don't have a back compatibility policy
> for our policies!
> 
> -Grant
> 
> 
> 
> 
> On Apr 15, 2010, at 3:35 PM, Michael McCandless wrote:
> 
> > Unfortunately, live searching against an old index can get very hairy.
> > EG look at what I had to do for the "flex API on pre-flex index" flex
> > emulation layer.
> >
> > It's also not great because it gives the illusion that all is good,
> > yet, you've taken a silent hit (up to ~10% or so) in your search
> > perf.
> >
> > Whereas building & maintaining a one-time index migration tool, in
> > contrast, is much less work.
> >
> > I realize the migration tool has issues -- it fixes the hard changes
> > but silently allows the soft changes to break (ie, your analyzers may
> > not produce the same tokens, until we move all core analyzers outside
> > of core, so they are separately versioned), but it seems like a good
> > compromise here?
> >
> > Mike
> >
> > 2010/4/15 Shai Erera <se...@gmail.com>:
> >> The reason, Earwin, why online migration is faster is that when you
> >> finally need to *fully* migrate your index, chances are that most
> >> of the segments are already on the newer format. Offline migration
> >> will just keep the application idle for some amount of time until ALL
> >> segments are migrated.
> >>
> >> During the lifecycle of the index, segments are merged anyway, so
> >> migrating them on the fly costs virtually nothing. At the end, when you
> >> upgrade to a Lucene version which doesn't support the previous index
> >> format, you'll in the worst case need to migrate a few large segments
> >> which were never merged. I don't know how many of those there will be
> >> as it really depends on the application, but I'd bet this process will
> >> touch just a few segments. And hence, throughput-wise it will be a lot
> >> faster.
> >>
> >> We should create a migrate() API on IW which will touch just those
> >> segments and not incur a full optimize. That API can also be used for
> >> an offline migration tool, if we decide that's what we want.
> >>
> >> Shai
> >>
> >> On Thursday, April 15, 2010, jm <jm...@gmail.com> wrote:
> >>> Not sure if plain users are allowed/encouraged to post in this list,
> >>> but wanted to mention (just an opinion from a happy user), as other
> >>> users have, that not all of us can reindex just like that. It would
> >>> not be 10 min for one of our installations for sure...
> >>>
> >>> First, i would need to implement some code to reindex, cause my source
> >>> data is postprocessed/compressed/encrypted/moved after it arrives to
> >>> the application, so I would need to retrieve all etc. And then
> >>> reindexing it would take days.
> >>> javier
> >>>
> >>> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
> >>>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
> >>>>> manual migration on the segments that are still on old versions.
> >>>>> That's not the point about whether optimize() is good or not. It is
> >>>>> the difference between telling the customer to run a 5-day migration
> >>>>> process, or a couple of hours. At the end of the day, the same
> >>>>> migration code will need to be written whether for the manual or
> >>>>> automatic case. And probably by the same developer which changed the
> >>>>> index format. It's the difference of when does it happen.
> >>>>
> >>>> Converting stuff is easier than emulating, that's exactly why I want a
> >>>> separate tool.
> >>>> There's no need to support cross-version merging, nor to emulate old APIs.
> >>>>
> >>>> I also don't understand why offline migration is going to take days
> >>>> instead of hours for online migration??
> >>>> WTF, it's gonna be even faster, as it doesn't have to merge things.
> >>>>
> >>>> --
> >>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> >>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> >>>> ICQ: 104465785
> >>>>
> 
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
> 
> 

Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

From IRC:
"why do I get the feeling that everyone is in "heated agreement" on the Version thread?
there are some cases that mean people will have to reindex
in those cases, we should tell people they will have to reindex
then they can decide to upgrade or not
all other cases, just do the sensible thing and test first
I have yet to meet anyone who simply drops a new version into production and says go"

So, as I said earlier, why don't we just move forward with it, strive to support reading X-1 index format in X and let the user know the cases in which they will have to re-index. If a migration tool is necessary, then someone can write it at the appropriate time.  Just as was said w/ the Solr merge, it's software.  If it doesn't work, we can change it.  Thank goodness we don't have a back compatibility policy for our policies!
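
To make the "read X-1 in X" idea concrete, here is a tiny Python sketch of
the check it implies. The function name is hypothetical, and the upper bound
(a release cannot read an index written by a newer major version) is an
assumption added for this sketch rather than something stated above.

```python
def can_read(index_major: int, release_major: int) -> bool:
    """True if a release with major version release_major is expected to
    read an index written by major version index_major."""
    # Assumption: only the current and the immediately preceding major
    # format are readable; anything older needs a reindex or a migration.
    return release_major - 1 <= index_major <= release_major


for written_by in (2, 3, 4):
    print(written_by, can_read(written_by, release_major=4))
```

Under this rule an index last written by 2.x would have to pass through 3.x
(or be reindexed) before a 4.x release could open it.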

-Grant




On Apr 15, 2010, at 3:35 PM, Michael McCandless wrote:

> Unfortunately, live searching against an old index can get very hairy.
> EG look at what I had to do for the "flex API on pre-flex index" flex
> emulation layer.
> 
> It's also not great because it gives the illusion that all is good,
> yet, you've taken a silent hit (up to ~10% or so) in your search
> perf.
> 
> Whereas building & maintaining a one-time index migration tool, in
> contrast, is much less work.
> 
> I realize the migration tool has issues -- it fixes the hard changes
> but silently allows the soft changes to break (ie, your analyzers may
> not produce the same tokens, until we move all core analyzers outside
> of core, so they are separately versioned), but it seems like a good
> compromise here?
> 
> Mike
> 
> 2010/4/15 Shai Erera <se...@gmail.com>:
>> The reason, Earwin, why online migration is faster is that when you
>> finally need to *fully* migrate your index, chances are that most
>> of the segments are already on the newer format. Offline migration
>> will just keep the application idle for some amount of time until ALL
>> segments are migrated.
>> 
>> During the lifecycle of the index, segments are merged anyway, so
>> migrating them on the fly costs virtually nothing. At the end, when you
>> upgrade to a Lucene version which doesn't support the previous index
>> format, you'll in the worst case need to migrate a few large segments
>> which were never merged. I don't know how many of those there will be
>> as it really depends on the application, but I'd bet this process will
>> touch just a few segments. And hence, throughput-wise it will be a lot
>> faster.
>> 
>> We should create a migrate() API on IW which will touch just those
>> segments and not incur a full optimize. That API can also be used for
>> an offline migration tool, if we decide that's what we want.
>> 
>> Shai
>> 
>> On Thursday, April 15, 2010, jm <jm...@gmail.com> wrote:
>>> Not sure if plain users are allowed/encouraged to post in this list,
>>> but wanted to mention (just an opinion from a happy user), as other
>>> users have, that not all of us can reindex just like that. It would
>>> not be 10 min for one of our installations for sure...
>>> 
>>> First, i would need to implement some code to reindex, cause my source
>>> data is postprocessed/compressed/encrypted/moved after it arrives to
>>> the application, so I would need to retrieve all etc. And then
>>> reindexing it would take days.
>>> javier
>>> 
>>> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>>>> manual migration on the segments that are still on old versions.
>>>>> That's not the point about whether optimize() is good or not. It is
>>>>> the difference between telling the customer to run a 5-day migration
>>>>> process, or a couple of hours. At the end of the day, the same
>>>>> migration code will need to be written whether for the manual or
>>>>> automatic case. And probably by the same developer which changed the
>>>>> index format. It's the difference of when does it happen.
>>>> 
>>>> Converting stuff is easier than emulating, that's exactly why I want a
>>>> separate tool.
>>>> There's no need to support cross-version merging, nor to emulate old APIs.
>>>> 
>>>> I also don't understand why offline migration is going to take days
>>>> instead of hours for online migration??
>>>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>>> 
>>>> --
>>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>>> ICQ: 104465785
>>>> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search



Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

As soon as I have removed version, then we can fix StandardTokenizer too!

On Thu, Apr 15, 2010 at 5:13 PM, Shai Erera <se...@gmail.com> wrote:

> By all means Robert ... by all means :). Remember who started that thread,
> and for what reason :D.
>
> Shai
>
>
> On Fri, Apr 16, 2010 at 12:01 AM, Robert Muir <rc...@gmail.com> wrote:
>
>> If you really believe this, then you have no problem if I remove all
>> Version from all core and contrib analyzers right now.
>>
>> On Thu, Apr 15, 2010 at 4:50 PM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Robert ... I'm sorry but changes to Analyzers don't *force* people to
>>> reindex. They can simply choose not to use the latest version. They can
>>> choose not to upgrade a Unicode version. They can copy the entire Analyzer
>>> code to match their needs. Index format changes are what I'm worried about
>>> because those *force* people to reindex.
>>>
>>> Analyzers, believe it or not, are just a tool, an out of the box tool
>>> even, we're giving users to analyze their stuff. Probably a tool used by
>>> most of our users, but not all. Some have their own tools, that are
>>> currently wrapped as a Lucene Analyzer just because the API mandates. But we
>>> were talking about that too recently no? Ripping Analyzer off IndexWriter?
>>>
>>> Just to be clear - I think your work on Analyzers is fantastic ! Really !
>>> Seriously !
>>> But it's a choice someone can make ... whereas index format is a given -
>>> you have to live with it, or never upgrade Lucene.
>>>
>>> But I think we've chewed that way too much. I am all for removing bw on
>>> Analyzers, and 2396 is a great step towards it (or maybe it is IT?). Even
>>> index format - I don't see when it will change next (but I think I have an
>>> idea ...), so we can tackle it then.
>>>
>>> Shai
>>>
>>>
>>> On Thu, Apr 15, 2010 at 11:33 PM, Robert Muir <rc...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Apr 15, 2010 at 4:21 PM, Shai Erera <se...@gmail.com> wrote:
>>>>
>>>>> Actually, I'd like to know if people like Robert (basically those who
>>>>> have no problem to reindex and don't understand the fuss around it) will
>>>>> want to change the index format - can I count on them to be asked to provide
>>>>> such tool? That's to me a policy we should decide on ... whatever the
>>>>> consequences.
>>>>>
>>>>
>>>> just look at the 1.8MB of backwards compat code in contrib/analyzers i
>>>> want to remove in LUCENE-2396?
>>>> are you serious? I wrote most of that cruft to prevent reindexing and
>>>> you are trying to say I "don't understand the fuss about it"?
>>>>
>>>> We shouldn't make people reindex, but we should have the chance, even if
>>>> we only do it ONE TIME, to reset Lucene to a new "Major Version" that has a
>>>> bunch of stuff fixed we couldn't fix before, and more flexibility.
>>>>
>>>> because with the current policy, it's like we are in 1.x forever.... our
>>>> version numbers are a joke!
>>>> --
>>>> Robert Muir
>>>> rcmuir@gmail.com
>>>>
>>>
>>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

By all means Robert ... by all means :). Remember who started that thread,
and for what reason :D.

Shai

On Fri, Apr 16, 2010 at 12:01 AM, Robert Muir <rc...@gmail.com> wrote:

> If you really believe this, then you have no problem if I remove all
> Version from all core and contrib analyzers right now.
>
> On Thu, Apr 15, 2010 at 4:50 PM, Shai Erera <se...@gmail.com> wrote:
>
>> Robert ... I'm sorry but changes to Analyzers don't *force* people to
>> reindex. They can simply choose not to use the latest version. They can
>> choose not to upgrade a Unicode version. They can copy the entire Analyzer
>> code to match their needs. Index format changes are what I'm worried about
>> because those *force* people to reindex.
>>
>> Analyzers, believe it or not, are just a tool, an out of the box tool
>> even, we're giving users to analyze their stuff. Probably a tool used by
>> most of our users, but not all. Some have their own tools, that are
>> currently wrapped as a Lucene Analyzer just because the API mandates. But we
>> were talking about that too recently no? Ripping Analyzer off IndexWriter?
>>
>> Just to be clear - I think your work on Analyzers is fantastic ! Really !
>> Seriously !
>> But it's a choice someone can make ... whereas index format is a given -
>> you have to live with it, or never upgrade Lucene.
>>
>> But I think we've chewed that way too much. I am all for removing bw on
>> Analyzers, and 2396 is a great step towards it (or maybe it is IT?). Even
>> index format - I don't see when it will change next (but I think I have an
>> idea ...), so we can tackle it then.
>>
>> Shai
>>
>>
>> On Thu, Apr 15, 2010 at 11:33 PM, Robert Muir <rc...@gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Apr 15, 2010 at 4:21 PM, Shai Erera <se...@gmail.com> wrote:
>>>
>>>> Actually, I'd like to know if people like Robert (basically those who
>>>> have no problem to reindex and don't understand the fuss around it) will
>>>> want to change the index format - can I count on them to be asked to provide
>>>> such tool? That's to me a policy we should decide on ... whatever the
>>>> consequences.
>>>>
>>>
>>> just look at the 1.8MB of backwards compat code in contrib/analyzers i
>>> want to remove in LUCENE-2396?
>>> are you serious? I wrote most of that cruft to prevent reindexing and you
>>> are trying to say I "don't understand the fuss about it"?
>>>
>>> We shouldn't make people reindex, but we should have the chance, even if
>>> we only do it ONE TIME, to reset Lucene to a new "Major Version" that has a
>>> bunch of stuff fixed we couldn't fix before, and more flexibility.
>>>
>>> because with the current policy, it's like we are in 1.x forever.... our
>>> version numbers are a joke!
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>>
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 9:52 AM, Shai Erera <se...@gmail.com> wrote:

> So instead of forcing all development to go through stable + trunk, I
> propose to go through trunk, and back port to stable only if requested. In
> the end we'll be in the same position (trunk having all features) except for
> stable which will include just those features of interest to other people.
>
>
I think I sorta disagree with this. I think we should set the Version: in
JIRA appropriately, and if it's not risky and won't introduce breaks, try to
apply as many issues as possible to both trunk and stable.

The reason is: with stable actually being stable, perhaps we could actually
get faster releases out this way with solid improvements, while still giving
the more complex features the adequate time they need to really bake in
trunk.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

Shai. People are free to bash their brains out against back-compat on
a stable branch. IF they want.
If they don't want, they work on trunk. When stuff is ported from
stable to trunk, cruft is removed. When (if) stuff is ported from
trunk to stable, cruft is added.

The only point where Mike's offer differs from yours/mine is that
development goes on in both places, instead of one.

If it takes that for them to agree on a free-for-all trunk, I can live
with the mess :)

On Thu, Apr 22, 2010 at 18:16, Shai Erera <se...@gmail.com> wrote:
> But I thought that was the whole point - get rid of Version and loosen on
> the bw policy to not be so restrictive on API. We can finally move to use
> interfaces, stop that API refactoring and deprecation (as one said on a blog
> - "orgy"). If we adopt Mike's proposal, where does it leave us - 99% of the
> development double the efforts, and that tiny percentage like flex (even
> though it's a huge feature in and on itself) having easier life?
>
> Perhaps I'm missing something, but if that's what is proposed and meant, I
> think that not changing anything will (surprisingly and confusingly !) make
> our life easier ...
>
> So Mark, I have to agree w/ you: "If we take that route, I am vehemently
> against changing our policy." +1 !
>
> Shai
>
> On Thu, Apr 22, 2010 at 5:04 PM, Mark Miller <ma...@gmail.com> wrote:
>>
>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>>
>> I don't think features should be backported to stable on request. If we go
>> this route, I think it should be a matter of course unless the feature is
>> hairy enough to warrant unstable.
>>
>> Saying we should do all dev on unstable, and only back port on request
>> (who will police that? everyone will accept all requests?) and that we
>> should just release trunk more often to accommodate, is like saying, let's
>> just throw back compat out the window, every release will be free to break
>> back compat, we will just release more often...
>>
>> Working on two branches won't be 100% joy, but loosening the existing much
>> larger annoyance of back compat is not going to be free IMO. To me, Shai's
>> proposal is essentially - let's keep everything the same, but release more
>> often (we have decided to do that 100 times) and lose back compat requirements.
>> Then if a dev takes pity on a user, perhaps one of the unstable releases
>> will get a backport of a feature.
>>
>> If we take that route, I am vehemently against changing our policy.
>>
>> On 4/22/10 9:52 AM, Shai Erera wrote:
>>
>> I was advocating that we always develop on trunk w/ no back-compat
>> support, API-wise ... you could have developed flex w/ no bw support.
>>
>> Currently what you're proposing would cause most features to be developed
>> on stable w/ bw support and trunk w/o. I propose to leave 'stable', develop
>> on trunk w/ no bw support (except for index format) and back port features
>> "on demand" to stable w/ bw support.
>>
>> So instead of forcing all development to go through stable + trunk, I
>> propose to go through trunk, and back port to stable only if requested. In
>> the end we'll be in the same position (trunk having all features) except for
>> stable which will include just those features of interest to other people.
>>
>> Shai
>>
>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
>> <lu...@mikemccandless.com> wrote:
>>>
>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>>>
>>> > The only downside is that we will need to do everything twice: once on
>>> > stable and once on trunk. I still think that most of the issues and
>>> > development don't affect bw at all and thus we'll always say "this
>>> > needs to go to stable and trunk" which will just be an annoyance and
>>> > complicate the life of the developers even more because not only will
>>> > we need to keep bw compat, we'll need to write the code for trunk as
>>> > well.
>>>
>>> Well, most things.  Some features (eg flex would've been such a
>>> feature) will only happen in trunk.
>>>
>>> But, yes, this is a downside -- stable changes will have to be merged
>>> up to trunk.
>>>
>>> > What if we always develop on trunk, release it more often, and if
>>> > requested or a committer needs it, we backport a certain feature to
>>> > stable?
>>>
>>> This is what we do today, and I think what's broken about it is we are
>>> unable to make a big change that has major breaks from the start.
>>> Every big change is required to land on trunk with back compat intact.
>>>
>>> This is terribly costly for changes like the new analyzer API (Token
>>> -> AttrSource migration), and flex.
>>>
>>> So with the new model, a big change like flex could land on trunk with
>>> no back compat, and age for a long time, along with other such
>>> changes, before being included in a major release.
>>>
>>> I'm not sure we'll release trunk (major releases) more often.  I think
>>> it could go both ways...
>>>
>>> For small changes, I think whether a given dev works on trunk and
>>> merges back to stable, or stable and merges forwards to trunk, is an
>>> individual choice...
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>
>>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Merging a change is far less than double the effort...

The first time around you had to type each character yourself, create
tests from scratch, fix the failures, post patches, respond to
feedback, iterate like crazy, maybe throw it all out and start over,
etc.

The second time around "svn merge" does all of this in one go.

Yeah you can get conflicts on merging, but (as much as we like to
complain about them), resolving them is usually fast.

It took ~6 hours for us (Uwe, Robert, Mark, myself) to merge flex back
to trunk...

If the unstable line (trunk) diverges too much from stable then
merging does get more costly... but that's probably a good sign that
we should do a major release.

Mike

On Thu, Apr 22, 2010 at 10:16 AM, Shai Erera <se...@gmail.com> wrote:
> But I thought that was the whole point - get rid of Version and loosen on
> the bw policy to not be so restrictive on API. We can finally move to use
> interfaces, stop that API refactoring and deprecation (as one said on a blog
> - "orgy"). If we adopt Mike's proposal, where does it leave us - 99% of the
> development double the efforts, and that tiny percentage like flex (even
> though it's a huge feature in and on itself) having easier life?
>
> Perhaps I'm missing something, but if that's what is proposed and meant, I
> think that not changing anything will (surprisingly and confusingly !) make
> our life easier ...
>
> So Mark, I have to agree w/ you: "If we take that route, I am vehemently
> against changing our policy." +1 !
>
> Shai
>
> On Thu, Apr 22, 2010 at 5:04 PM, Mark Miller <ma...@gmail.com> wrote:
>>
>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>>
>> I don't think features should be backported to stable on request. If we go
>> this route, I think it should be a matter of course unless the feature is
>> hairy enough to warrant unstable.
>>
>> Saying we should do all dev on unstable, and only back port on request
>> (who will police that? everyone will accept all requests?) and that we
>> should just release trunk more often to accommodate, is like saying, let's
>> just throw back compat out the window, every release will be free to break
>> back compat, we will just release more often...
>>
>> Working on two branches won't be 100% joy, but loosening the existing much
>> larger annoyance of back compat is not going to be free IMO. To me, Shai's
>> proposal is essentially - let's keep everything the same, but release more
>> often (we have decided to do that 100 times) and lose back compat requirements.
>> Then if a dev takes pity on a user, perhaps one of the unstable releases
>> will get a backport of a feature.
>>
>> If we take that route, I am vehemently against changing our policy.
>>
>> On 4/22/10 9:52 AM, Shai Erera wrote:
>>
>> I was advocating that we always develop on trunk w/ no back-compat
>> support, API-wise ... you could have developed flex w/ no bw support.
>>
>> Currently what you're proposing would cause most features to be developed
>> on stable w/ bw support and trunk w/o. I propose to leave 'stable', develop
>> on trunk w/ no bw support (except for index format) and back port features
>> "on demand" to stable w/ bw support.
>>
>> So instead of forcing all development to go through stable + trunk, I
>> propose to go through trunk, and back port to stable only if requested. In
>> the end we'll be in the same position (trunk having all features) except for
>> stable which will include just those features of interest to other people.
>>
>> Shai
>>
>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
>> <lu...@mikemccandless.com> wrote:
>>>
>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>>>
>>> > The only downside is that we will need to do everything twice: once on
>>> > stable and once on trunk. I still think that most of the issues and
>>> > development don't affect bw at all and thus we'll always say "this
>>> > needs to go to stable and trunk" which will just be an annoyance and
>>> > complicate the life of the developers even more because not only will
>>> > we need to keep bw compat, we'll need to write the code for trunk as
>>> > well.
>>>
>>> Well, most things.  Some features (eg flex would've been such a
>>> feature) will only happen in trunk.
>>>
>>> But, yes, this is a downside -- stable changes will have to be merged
>>> up to trunk.
>>>
>>> > What if we always develop on trunk, release it more often, and if
>>> > requested or a committer needs it, we backport a certain feature to
>>> > stable?
>>>
>>> This is what we do today, and I think what's broken about it is we are
>>> unable to make a big change that has major breaks from the start.
>>> Every big change is required to land on trunk with back compat intact.
>>>
>>> This is terribly costly for changes like the new analyzer API (Token
>>> -> AttrSource migration), and flex.
>>>
>>> So with the new model, a big change like flex could land on trunk with
>>> no back compat, and age for a long time, along with other such
>>> changes, before being included in a major release.
>>>
>>> I'm not sure we'll release trunk (major releases) more often.  I think
>>> it could go both ways...
>>>
>>> For small changes, I think whether a given dev works on trunk and
>>> merges back to stable, or stable and merges forwards to trunk, is an
>>> individual choice...
>>>
>>> Mike
>>>
>>
>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/22/10 10:23 AM, Shai Erera wrote:
> I don't remember a release w/ an empty BW section in CHANGES ... and I
> think it's healthy. Otherwise, you'll need to wait endlessly for a
> major version to be released before you can use some features that you,
> yourself, developed (if you need to use a released Lucene and cannot
> make do w/ trunk).
>
> Shai
>
Actually, the BW break section is pretty new. Pretty sure it didn't 
exist before 2.4, and 2.4 had one entry.

I remember most releases not having it...

-- 
- Mark

http://www.lucidimagination.com



Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I don't remember a release w/ an empty BW section in CHANGES ... and I think
it's healthy. Otherwise, you'll need to wait endlessly for a major version
to be released before you can use some features that you, yourself, developed
(if you need to use a released Lucene and cannot make do w/ trunk).

Shai

On Thu, Apr 22, 2010 at 5:20 PM, Robert Muir <rc...@gmail.com> wrote:

>
>
> On Thu, Apr 22, 2010 at 10:16 AM, Shai Erera <se...@gmail.com> wrote:
>
>> But I thought that was the whole point - get rid of Version and loosen the
>> bw policy so it is less restrictive on API. We can finally move to use
>> interfaces, stop that API refactoring and deprecation (as one said on a blog
>> - "orgy"). If we adopt Mike's proposal, where does it leave us - 99% of the
>> development takes double the effort, and that tiny percentage like flex (even
>> though it's a huge feature in and of itself) has an easier life?
>>
>> Perhaps I'm missing something, but if that's what is proposed and meant, I
>> think that not changing anything will (surprisingly and confusingly !) make
>> our life easier ...
>>
>> So Mark, I have to agree w/ you: "If we take that route, I am vehemently
>> against changing our policy." +1 !
>>
>> Shai
>>
>>
> I think it's less than 1% (flex, etc) that should be excluded from stable,
> but that's my opinion.
>
> Ideally, stable would have no backwards-break section at all in CHANGES...
> and it seems this is a pretty significant portion of patches these days.
>
> And I don't think merging is "double effort" especially if we aren't doing
> risky crazy merges with hairy back compat.
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 10:20 AM, Robert Muir <rc...@gmail.com> wrote:

>
> I think it's *more* (sorry, not less) than 1% (flex, etc) that should be
> excluded from stable, but that's my opinion.
>
>

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 10:16 AM, Shai Erera <se...@gmail.com> wrote:

> But I thought that was the whole point - get rid of Version and loosen the
> bw policy so it is less restrictive on API. We can finally move to use
> interfaces, stop that API refactoring and deprecation (as one said on a blog
> - "orgy"). If we adopt Mike's proposal, where does it leave us - 99% of the
> development takes double the effort, and that tiny percentage like flex (even
> though it's a huge feature in and of itself) has an easier life?
>
> Perhaps I'm missing something, but if that's what is proposed and meant, I
> think that not changing anything will (surprisingly and confusingly !) make
> our life easier ...
>
> So Mark, I have to agree w/ you: "If we take that route, I am vehemently
> against changing our policy." +1 !
>
> Shai
>
>
I think it's less than 1% (flex, etc) that should be excluded from stable,
but that's my opinion.

Ideally, stable would have no backwards-break section at all in CHANGES...
and it seems this is a pretty significant portion of patches these days.

And I don't think merging is "double effort" especially if we aren't doing
risky crazy merges with hairy back compat.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

But I thought that was the whole point - get rid of Version and loosen the
bw policy so it is less restrictive on API. We can finally move to use
interfaces, stop that API refactoring and deprecation (as one said on a blog
- "orgy"). If we adopt Mike's proposal, where does it leave us - 99% of the
development takes double the effort, and that tiny percentage like flex (even
though it's a huge feature in and of itself) has an easier life?

Perhaps I'm missing something, but if that's what is proposed and meant, I
think that not changing anything will (surprisingly and confusingly !) make
our life easier ...

So Mark, I have to agree w/ you: "If we take that route, I am vehemently
against changing our policy." +1 !

Shai

On Thu, Apr 22, 2010 at 5:04 PM, Mark Miller <ma...@gmail.com> wrote:

>  I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>
> I don't think features should be backported to stable on request. If we go
> this route, I think it should be a matter of course unless the feature is
> hairy enough to warrant unstable.
>
> Saying we should do all dev on unstable, and only back port on request (who
> will police that? everyone will accept all requests?) and that we should
> just release trunk more often to accommodate, is like saying, let's just
> throw back compat out the window, every release will be free to break back
> compat, we will just release more often...
>
> Working on two branches won't be 100% joy, but loosening the existing much
> larger annoyance of back compat is not going to be free IMO. To me, Shai's
> proposal is essentially - let's keep everything the same, but release more
> often (we have decided to do that 100 times) and lose back compat requirements.
> Then if a dev takes pity on a user, perhaps one of the unstable releases
> will get a backport of a feature.
>
> If we take that route, I am vehemently against changing our policy.
>
>
> On 4/22/10 9:52 AM, Shai Erera wrote:
>
> I was advocating that we always develop on trunk w/ no back-compat support,
> API-wise ... you could have developed flex w/ no bw support.
>
> Currently what you're proposing would cause most features to be developed
> on stable w/ bw support and trunk w/o. I propose to leave 'stable', develop
> on trunk w/ no bw support (except for index format) and back port features
> "on demand" to stable w/ bw support.
>
> So instead of forcing all development to go through stable + trunk, I
> propose to go through trunk, and back port to stable only if requested. In
> the end we'll be in the same position (trunk having all features) except for
> stable which will include just those features of interest to other people.
>
> Shai
>
> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>>
>> > The only downside is that we will need to do everything twice: once on
>> > stable and once on trunk. I still think that most of the issues and
>> > development don't affect bw at all and thus we'll always say "this
>> > needs to go to stable and trunk" which will just be an annoyance and
>> > complicate the life of the developers even more because not only will
>> > we need to keep bw compat, we'll need to write the code for trunk as
>> > well.
>>
>>  Well, most things.  Some features (eg flex would've been such a
>> feature) will only happen in trunk.
>>
>> But, yes, this is a downside -- stable changes will have to be merged
>> up to trunk.
>>
>> > What if we always develop on trunk, release it more often, and if
>> > requested or a committer needs it, we backport a certain feature to
>> > stable?
>>
>>  This is what we do today, and I think what's broken about it is we are
>> unable to make a big change that has major breaks from the start.
>> Every big change is required to land on trunk with back compat intact.
>>
>> This is terribly costly for changes like the new analyzer API (Token
>> -> AttrSource migration), and flex.
>>
>> So with the new model, a big change like flex could land on trunk with
>> no back compat, and age for a long time, along with other such
>> changes, before being included in a major release.
>>
>> I'm not sure we'll release trunk (major releases) more often.  I think
>> it could go both ways...
>>
>> For small changes, I think whether a given dev works on trunk and
>> merges back to stable, or stable and merges forwards to trunk, is an
>> individual choice...
>>
>> Mike
>>
>>
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Correction Mike - "ongoing changes go onto the stable AND trunk branches"
... let's make it clear.

Shai

On Thu, Apr 22, 2010 at 5:15 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Right, I think the default should be that ongoing changes go onto the
> stable branch.
>
> And the exception is if back-compat is too hard/risky to accomplish,
> we argue for doing it only on trunk.
>
> This discussion can take place up front -- the issue's Version will be
> set accordingly -- and revisited as the issue is developed (eg if back
> compat turned out to be trickier than we first thought).
>
> Mike
>
> On Thu, Apr 22, 2010 at 10:04 AM, Mark Miller <ma...@gmail.com> wrote:
> > I'd vote -1 on Shai's variation and +1 on Mike's proposal.
> >
> > I don't think features should be backported to stable on request. If we go
> > this route, I think it should be a matter of course unless the feature is
> > hairy enough to warrant unstable.
> >
> > Saying we should do all dev on unstable, and only back port on request (who
> > will police that? everyone will accept all requests?) and that we should
> > just release trunk more often to accommodate, is like saying, let's just
> > throw back compat out the window, every release will be free to break back
> > compat, we will just release more often...
> >
> > Working on two branches won't be 100% joy, but loosening the existing much
> > larger annoyance of back compat is not going to be free IMO. To me, Shai's
> > proposal is essentially - let's keep everything the same, but release more
> > often (we have decided to do that 100 times) and lose back compat requirements.
> > Then if a dev takes pity on a user, perhaps one of the unstable releases
> > will get a backport of a feature.
> >
> > If we take that route, I am vehemently against changing our policy.
> >
> > On 4/22/10 9:52 AM, Shai Erera wrote:
> >
> > I was advocating that we always develop on trunk w/ no back-compat support,
> > API-wise ... you could have developed flex w/ no bw support.
> >
> > Currently what you're proposing would cause most features to be developed on
> > stable w/ bw support and trunk w/o. I propose to leave 'stable', develop on
> > trunk w/ no bw support (except for index format) and back port features "on
> > demand" to stable w/ bw support.
> >
> > So instead of forcing all development to go through stable + trunk, I
> > propose to go through trunk, and back port to stable only if requested. In
> > the end we'll be in the same position (trunk having all features) except for
> > stable which will include just those features of interest to other people.
> >
> > Shai
> >
> > On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
> > <lu...@mikemccandless.com> wrote:
> >>
> >> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
> >>
> >> > The only downside is that we will need to do everything twice: once on
> >> > stable and once on trunk. I still think that most of the issues and
> >> > development don't affect bw at all and thus we'll always say "this
> >> > needs to go to stable and trunk" which will just be an annoyance and
> >> > complicate the life of the developers even more because not only will
> >> > we need to keep bw compat, we'll need to write the code for trunk as
> >> > well.
> >>
> >> Well, most things.  Some features (eg flex would've been such a
> >> feature) will only happen in trunk.
> >>
> >> But, yes, this is a downside -- stable changes will have to be merged
> >> up to trunk.
> >>
> >> > What if we always develop on trunk, release it more often, and if
> >> > requested or a committer needs it, we backport a certain feature to
> >> > stable?
> >>
> >> This is what we do today, and I think what's broken about it is we are
> >> unable to make a big change that has major breaks from the start.
> >> Every big change is required to land on trunk with back compat intact.
> >>
> >> This is terribly costly for changes like the new analyzer API (Token
> >> -> AttrSource migration), and flex.
> >>
> >> So with the new model, a big change like flex could land on trunk with
> >> no back compat, and age for a long time, along with other such
> >> changes, before being included in a major release.
> >>
> >> I'm not sure we'll release trunk (major releases) more often.  I think
> >> it could go both ways...
> >>
> >> For small changes, I think whether a given dev works on trunk and
> >> merges back to stable, or stable and merges forwards to trunk, is an
> >> individual choice...
> >>
> >> Mike
> >>
> >
> >
> >
>
>
>

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Right, I think the default should be that ongoing changes go onto the
stable branch.

And the exception is if back-compat is too hard/risky to accomplish,
we argue for doing it only on trunk.

This discussion can take place up front -- the issue's Version will be
set accordingly -- and revisited as the issue is developed (eg if back
compat turned out to be trickier than we first thought).

Mike

On Thu, Apr 22, 2010 at 10:04 AM, Mark Miller <ma...@gmail.com> wrote:
> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>
> I don't think features should be backported to stable on request. If we go
> this route, I think it should be a matter of course unless the feature is
> hairy enough to warrant unstable.
>
> Saying we should do all dev on unstable, and only back port on request (who
> will police that? everyone will accept all requests?) and that we should
> just release trunk more often to accommodate, is like saying, let's just
> throw back compat out the window, every release will be free to break back
> compat, we will just release more often...
>
> Working on two branches won't be 100% joy, but loosening the existing much
> larger annoyance of back compat is not going to be free IMO. To me, Shai's
> proposal is essentially - let's keep everything the same, but release more
> often (we have decided to do that 100 times) and lose back compat requirements.
> Then if a dev takes pity on a user, perhaps one of the unstable releases
> will get a backport of a feature.
>
> If we take that route, I am vehemently against changing our policy.
>
> On 4/22/10 9:52 AM, Shai Erera wrote:
>
> I was advocating that we always develop on trunk w/ no back-compat support,
> API-wise ... you could have developed flex w/ no bw support.
>
> Currently what you're proposing would cause most features to be developed on
> stable w/ bw support and trunk w/o. I propose to leave 'stable', develop on
> trunk w/ no bw support (except for index format) and back port features "on
> demand" to stable w/ bw support.
>
> So instead of forcing all development to go through stable + trunk, I
> propose to go through trunk, and back port to stable only if requested. In
> the end we'll be in the same position (trunk having all features) except for
> stable which will include just those features of interest to other people.
>
> Shai
>
> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>>
>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>>
>> > The only downside is that we will need to do everything twice: once on
>> > stable and once on trunk. I still think that most of the issues and
>> > development don't affect bw at all and thus we'll always say "this
>> > needs to go to stable and trunk" which will just be an annoyance and
>> > complicate the life of the developers even more because not only will
>> > we need to keep bw compat, we'll need to write the code for trunk as
>> > well.
>>
>> Well, most things.  Some features (eg flex would've been such a
>> feature) will only happen in trunk.
>>
>> But, yes, this is a downside -- stable changes will have to be merged
>> up to trunk.
>>
>> > What if we always develop on trunk, release it more often, and if
>> > requested or a committer needs it, we backport a certain feature to
>> > stable?
>>
>> This is what we do today, and I think what's broken about it is we are
>> unable to make a big change that has major breaks from the start.
>> Every big change is required to land on trunk with back compat intact.
>>
>> This is terribly costly for changes like the new analyzer API (Token
>> -> AttrSource migration), and flex.
>>
>> So with the new model, a big change like flex could land on trunk with
>> no back compat, and age for a long time, along with other such
>> changes, before being included in a major release.
>>
>> I'm not sure we'll release trunk (major releases) more often.  I think
>> it could go both ways...
>>
>> For small changes, I think whether a given dev works on trunk and
>> merges back to stable, or stable and merges forwards to trunk, is an
>> individual choice...
>>
>> Mike
>>
>
>
>


Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Thu, Apr 22, 2010 at 10:10 AM, Robert Muir <rc...@gmail.com> wrote:
>
>
> On Thu, Apr 22, 2010 at 10:08 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>
>> Okay, let's live with parallel development, but make sure we 'always'
>> port things from stable to trunk, and 'always' remove possible
>> back-compat layers when doing such a port?
>>
>
> Why wouldn't you commit to trunk, then merge to the stable branch? This could
> be nice for some patches, as you could first introduce the patch without
> back compat shims and make for easier review.

+1, for those features that have a back-compat layer/shim.

Mike


Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Whether such features (that require some amount of back compat
"stuff") are done on stable then ported to unstable (trunk), or
vice-versa, is something we each can work out / iterate.

It'll come down to individual preference, and that's perfectly fine.
Some of us still use emacs and some of us swear by IntelliJ ;)  We don't
really need to agree on these mechanics, in order to vote on opening
up an "unstable" (trunk) line for development.

Mike

On Thu, Apr 22, 2010 at 11:00 AM, Robert Muir <rc...@gmail.com> wrote:
>
>
> On Thu, Apr 22, 2010 at 10:57 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>
>> But you need one if you want to backport some feature. My point was
>> that if you are planning to backport something, it's better to start
>> developing on stable, as otherwise you can run into problems when you only
>> have a clean API without any idea how to implement a backwards layer.
>
> I don't think we should backport any features that require sophisticated
> backwards layers.
>>
>>
>>
>> So for features that should be backported, start to plan with backwards in
>> mind from the beginning.
>
> I disagree. I think because Lucene is a library, features should be
> developed with the best possible API from a user's perspective in mind. I
> think this doesn't quite get the priority it should today, as "back-compat
> trumps all" despite the fact it's broken in every release anyway.
>
> Then separately, if there is a way to backport them easily, then we should
> certainly do it, but not if it requires sophistication. Instead it should
> just wait for the next major release.
> --
> Robert Muir
> rcmuir@gmail.com
>


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 10:57 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

>  But you need one if you want to backport some feature. My point was
> that if you are planning to backport something, it's better to start
> developing on stable, as otherwise you can run into problems when you only
> have a clean API without any idea how to implement a backwards layer.
>

I don't think we should backport any features that require sophisticated
backwards layers.

>
>
> So for features that should be backported, start to plan with backwards in
> mind from the beginning.
>

I disagree. I think because Lucene is a library, features should be
developed with the best possible API from a user's perspective in mind. I
think this doesn't quite get the priority it should today, as "back-compat
trumps all" despite the fact it's broken in every release anyway.

Then separately, if there is a way to backport them easily, then we should
certainly do it, but not if it requires sophistication. Instead it should
just wait for the next major release.

-- 
Robert Muir
rcmuir@gmail.com

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

But you need one if you want to backport some feature. My point was that if you are planning to backport something, it's better to start developing on stable, as otherwise you can run into problems when you only have a clean API without any idea how to implement a backwards layer.

So for features that should be backported, start to plan with backwards in mind from the beginning.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Thursday, April 22, 2010 4:53 PM
To: dev@lucene.apache.org
Subject: Re: Proposal about Version API "relaxation"

On Thu, Apr 22, 2010 at 10:48 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

Hi Robert,

My main problem with developing new features on trunk first and then porting by adding backwards cruft is that you first don’t care about backwards and then suddenly have to think about it. This may change the API on trunk again, to get nearer to backwards or maybe because a backwards layer is not possible. E.g. at the beginning of the AttributeSource-TokenStream API, when Michael and I discussed the sophisticated® backwards layer, we also made some changes to the new TokenStream API to support backwards better.


I think with this proposal things like sophisticated(R) backwards layers would generally become a thing of the past... 



-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 10:48 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

>  Hi Robert,
>
>
>
> My main problem with developing new features on trunk first and then
> porting by adding backwards cruft is that you first don’t care about
> backwards and then suddenly have to think about it. This may change the API
> on trunk again, to get nearer to backwards or maybe because a backwards
> layer is not possible. E.g. at the beginning of the AttributeSource-TokenStream
> API, when Michael and I discussed the sophisticated® backwards layer,
> we also made some changes to the new TokenStream API to support backwards
> better.
>

I think with this proposal things like sophisticated(R) backwards layers
would generally become a thing of the past...

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

On Apr 22, 2010, at 5:58 PM, Mark Miller wrote:

> Well yes - throwing out stable releases and back compat is going to be much easier to maintain, but I think that's beside the point...
> 
> Handling our current back compat policy is not something most have wanted to do for long either - that's never been a reason for tossing it.
> 
> I agree with back porting as necessary, as long as necessary means every change that doesn't make sense to only go into unstable.
> 
>>> Besides, it is essentially what we do now, minus back compat. maintenance on trunk.
> 
> Exactly - it essentially amounts to just throwing out back compat to a large degree. Stable will be a joke - it's just the last trunk release - no different than what we do now, but we abandon back compat. That's a huge mistake IMO.

Yeah, I don't intend to throw out back compat.  I guess I wasn't understanding.  Must have been from reading 1000 replies on this since yesterday.

-Grant

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

Well yes - throwing out stable releases and back compat is going to be
much easier to maintain, but I think that's beside the point...

Handling our current back compat policy is not something most have 
wanted to do for long either - that's never been a reason for tossing it.

I agree with back porting as necessary, as long as necessary means every 
change that doesn't make sense to only go into unstable.

>>  Besides, it is essentially what we do now, minus back compat. maintenance on trunk.

Exactly - it essentially amounts to just throwing out back compat to a large degree. Stable will be a joke - it's just the last trunk release - no different than what we do now, but we abandon back compat. That's a huge mistake IMO.



On 4/22/10 5:49 PM, Grant Ingersoll wrote:
> Jumping in late, but I have a hard time believing that committing to both trunk and stable is something people are going to want to do in practice for very long.  The other proposal (backporting when necessary) seems much more viable and easy to maintain and allows trunk to move ahead.  Besides, it is essentially what we do now, minus back compat. maintenance on trunk.  The tricky part is how to develop a back compat layer on the branches each time that works effectively.
>
> -Grant
>
> On Apr 22, 2010, at 3:26 PM, Earwin Burrfoot wrote:
>
>    
>>> My main problem with developing new features on trunk first and then porting
>>> by adding backwards cruft is that you first don’t care about backwards and
>>> then suddenly have to think about it. This may change the API on trunk
>>> again, to get nearer to backwards or maybe because a backwards layer is not
>>> possible. E.g. at the beginning of the AttributeSource-TokenStream API, when
>>> Michael and I discussed the sophisticated® backwards layer, we also
>>> made some changes to the new TokenStream API to support backwards better.
>>>        
>> I agree with Robert here. The whole damn point of unstable trunk is to
>> allow developers to NOT think about backwards-compatibility, and think
>> about best possible API instead.
>>
>> Backwards-compatibility is a sin, a necessary sin, but a sin
>> nonetheless. Each time you have such impure thoughts, you should
>> cleanse your soul by confessing at your local JUG.
>>
>> -- 
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>


-- 
- Mark

http://www.lucidimagination.com


Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

Jumping in late, but I have a hard time believing that committing to both trunk and stable is something people are going to want to do in practice for very long.  The other proposal (backporting when necessary) seems much more viable and easy to maintain and allows trunk to move ahead.  Besides, it is essentially what we do now, minus back compat. maintenance on trunk.  The tricky part is how to develop a back compat layer on the branches each time that works effectively.

-Grant

On Apr 22, 2010, at 3:26 PM, Earwin Burrfoot wrote:

>> My main problem with developing new features on trunk first and then porting
>> by adding backwards cruft is, that you first don’t care with backwards and
>> then suddenly have to think about it. This may change the API on trunk
>> again, to get nearer to backwards or maybe because a backwards layer is not
>> possible. E.g. at the beginning of AttributeSource-TokenStream API, when
>> Michael and me discussed about the sophisticated® backwards layer, we also
>> did some changes to the new TokenStream API, to support backwards better.
> 
> I agree with Robert here. The whole damn point of unstable trunk is to
> allow developers to NOT think about backwards-compatibility, and think
> about best possible API instead.
> 
> Backwards-compatibility is a sin, a necessary sin, but a sin
> nonetheless. Each time you have such impure thoughts, you should
> cleanse your soul by confessing at your local JUG.
> 
> -- 
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
> 



Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> My main problem with developing new features on trunk first and then porting
> by adding backwards cruft is, that you first don’t care with backwards and
> then suddenly have to think about it. This may change the API on trunk
> again, to get nearer to backwards or maybe because a backwards layer is not
> possible. E.g. at the beginning of AttributeSource-TokenStream API, when
> Michael and me discussed about the sophisticated® backwards layer, we also
> did some changes to the new TokenStream API, to support backwards better.

I agree with Robert here. The whole damn point of unstable trunk is to
allow developers to NOT think about backwards-compatibility, and think
about best possible API instead.

Backwards-compatibility is a sin, a necessary sin, but a sin
nonetheless. Each time you have such impure thoughts, you should
cleanse your soul by confessing at your local JUG.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi Robert,

 

My main problem with developing new features on trunk first and then porting by adding backwards cruft is that you first don’t care about backwards compatibility and then suddenly have to think about it. This may change the API on trunk again, to get nearer to backwards compatibility or maybe because a backwards layer is not possible. E.g. at the beginning of the AttributeSource-TokenStream API, when Michael and I discussed the sophisticated® backwards layer, we also made some changes to the new TokenStream API to support backwards compatibility better.

 

I personally always think about possible problems for implementing a backwards layer first. But of course for features like flex, which will never be backported, thinking about a backwards layer is not needed; just hack and reinvent everything!

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Thursday, April 22, 2010 4:11 PM
To: dev@lucene.apache.org
Subject: Re: Proposal about Version API "relaxation"

 

 

On Thu, Apr 22, 2010 at 10:08 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

Okay, let's live with parallel development, but make sure we 'always'
port things from stable to trunk, and 'always' remove possible
back-compat layers when doing such a port?

 


Why wouldn't you commit to trunk, then merge to the stable branch? This could be nice for some patches, as you could first introduce the patch without back compat shims and make for easier review.


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 22, 2010 at 10:08 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> Okay, let's live with parallel development, but make sure we 'always'
> port things from stable to trunk, and 'always' remove possible
> back-compat layers when doing such a port?
>
>
Why wouldn't you commit to trunk, then merge to the stable branch? This could
be nice for some patches, as you could first introduce the patch without
back compat shims and make for easier review.


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Maybe we should not make a decision right now? We've talked over the
proposals, and there are pros and cons to each, and clearly two sides. I
think at this point we just have to "try it", whatever "it" is and see if
that works and how it plays. It's like software design - we've played w/
some ideas, now we need to do some coding and get back to the drawing board
and see what played well and what needs revisiting.

So maybe, as an exercise, we should not *really* change anything but do what
was raised on another thread --> branch off pre-flex. Allow some time for
people to backport contributions they've made post-flex to that branch.
Especially contributions which do not depend on flex. Then we call that
branch 3.1. The exercise will tell us (to some degree) how much interest
there is in backporting. If only a few issues are backported (I plan to do
some), then the result is inconclusive. However, if we see that many of the
contributions are backported, we can at least assume it will be so in the
future as well, because there is interest. Then we can come back to this
thread and decide what to do w/ flex (which will still live in trunk) --
should it be the next 4.0, or maybe the decision will change and it will be
the next 3.2.

Since flex is already backported, we don't lose anything. If the decision
is to make it 4.0, then some work will be done to get rid of the
back-compat layers, perhaps clean up the API etc. Otherwise, it will live in
trunk for some more time so that people can consume it more easily.

If another such great feature comes along (which is expected to change a lot
of code), I think it'd be safe to start developing it w/ no back-compat in
mind anyway, regardless of the policy ...

What do you think?

Shai

On Sun, Apr 25, 2010 at 11:31 PM, Mark Miller <ma...@gmail.com> wrote:

> On 4/25/10 4:10 PM, Shai Erera wrote:
>
>> I think that we agree in principle about the policy change. We seem to
>> disagree only on where the default dev should be: trunk or
>> branch.
>>
>
> Right.
>
>
>
>> So why not put it up to the test? Let's declare that all dev happens
>> on trunk. If few features are backported - then it means (probably)
>> that there is not much interest in backporting. If many features are
>> backported, it means not only people want to backport, but also
>> committers are willing to help do that. Feels to me like a win-win
>> situation for both arguments.
>>
>
> I still don't like "just seeing what happens". What we all agree should be
> best for users as well as devs is not always going to be in alignment with
> letting the chips fall where they may. I don't think seeing whether
> committers just do something individually/naturally is a good test for
> whether we have the right goals for the project.
>
> Deciding if we should have the back compat police is not something we would
> do by saying "Okay, no more back compat policy - let's just see if devs do
> back compat on their own or not - if they do, we bring back back compat".
> Natural inclinations and unofficial 'policy' are two very different things -
> and I think both are important. Natural inclination doesn't sound exactly
> like the right test for official/unofficial/loose policies that are meant to
> guide natural inclinations. Despite Robert's complaining, we don't have many,
> but we do have some - without our back compat policy, upgrading code that
> extensively uses Lucene would have been extremely difficult. When you are
> avoiding deprecation, and deprecation releases and increasing the number and
> size of complicated changes, you make upgrading more and more difficult. 2.9
> to 3.0 was already no cakewalk, but if you did it right, Lucene walks you
> right through all the changes you need to grapple with, and points you along
> the path. This will all essentially go away. The least we can do is provide
> solid stable releases, and be informally dedicated to doing such - by
> putting all dev that fits into stable.
>
>
>
>> So instead of being strict right from the start, we try to be more
>> agile and responsive. We don't unnecessarily burden the committers
>> with the tedious job of merging svn every time, but instead verify
>> first that what we think is requested by the people, actually is
>> requested.
>>
>
> I'm totally against this whole "request" idea. Things that make sense to go
> to stable should go to both. Things that make sense to go to unstable should
> go there. Some people I think put dev annoyances over a good user experience
> - personally I think there should be more of a balance. What's easy for devs
> is not always easy for users - otherwise this would have been a free for all
> from the beginning. We would have done away with all of the 'burdens',
> 'annoyances', and 'tediousness'. And just shoved them off to users that
> were trying to upgrade their code to use the latest Lucene. The two need to
> be balanced.
>
>
>
>> As Mark indicated, the current back-compat policy was never voted on.
>> Instead it made sense and thus was applied. Maybe we're wrong :)?
>>
>
> No, we were not wrong. That back compat policy has worked out for a long
> time. Evolving policy is going to be natural as the situation changes. That
> doesn't mean it should evolve to nothing or anything just because it's
> changing. And the current back compat policy was not officially voted on
> (which only the PMC can do), but it was voted on with action over time. Same
> as it's been loosened over time.
>
>
>
>> Let's be bold, do some change and then reflect back on it and decide
>> if it was smart or not. It's not like the process is irreversible - on
>> the contrary, what I'm suggesting is essentially what's done today
>> (svn-wise) and thus can easily be changed in the future. While if we
>> start w/ Mike's proposal, changing it would be more confusing to the
>> people, and will probably also generate such a huge thread …
>>
>
> But I don't agree that the path you want is the right path. So why be
> "bold" about it?
>
> What would be "confusing to the people"?
>
> How is this the process we currently use? The current process is that bugs
> that should be back ported are back ported. Sometimes there are requests for
> further back ports, but the general policy is to back port what makes sense,
> not what is requested. And that's been being done for some time.
>
>
>
>> Shai
>>
>> On Sunday, April 25, 2010, Mark Miller<ma...@gmail.com>  wrote:
>>
>>> On 4/25/10 1:43 PM, Michael McCandless wrote:
>>>
>>>
>>> Changes that go into stable need to be merged to unstable, maybe
>>> periodically swept or maybe merged up by the original committer or
>>> likely some combination (like flex).
>>>
>>> (And, yes we'll still use other branches for big new features that are
>>> in active development).
>>>
>>> Mike
>>>
>>>
>>>
>>> I'm still +1 on all the proposals you have made. And still -1 to most of
>>> the attempted tweaks on them that have been proposed :)
>>>
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On Apr 25, 2010, at 4:10 PM, Shai Erera wrote:

> I think that we agree in principle about the policy change. We seem to
> disagree only on where the default dev should be: trunk or
> branch.

I don't think it matters. Just document the decision in the wiki in a Development Roadmap. Maybe also have it in a root level README.txt. Anyone who is smart enough to check out code directly from SVN should have enough brains to figure out what they get. There might be some pain at first, but they'll get over it.

But please, don't refer to the two locations as "stable" and "unstable". The (un-voted upon??) policy that every piece of code has adequate, passed test cases should continue. Both should be stable for creating and subsequently using new indexes.

-- DM

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

OK I think a pretty clear proposal is taking shape -- I'll call a
vote.  We've kinda discussed it to death now...

Mike

On Mon, Apr 26, 2010 at 8:25 AM, Mark Miller <ma...@gmail.com> wrote:
> On 4/26/10 8:23 AM, Robert Muir wrote:
>>
>>
>> On Mon, Apr 26, 2010 at 8:15 AM, Mark Miller <markrmiller@gmail.com
>> <ma...@gmail.com>> wrote:
>>
>>    It's not that simple. If you want to commit a patch without having
>>    it reverted, you *do* have to do certain things - currently, you
>>    have to attempt back compat. Or don't commit. You guys seem to think
>>    it's a free for all. It's obviously not. There are general guidelines
>>    we all follow, formed by consensus. Committers are not just doing
>>    whatever they want.
>>
>>
>> And we should make it easier for people to contribute. Perhaps back
>> compat is a feature that isn't everyone's itch to scratch, you know,
>> just like any other feature. This would make it easier for people to
>> contribute/commit improvements to lucene (into the unstable only), and
>> perhaps there would be more committers and contributors after a while.
>>
>> Right now it's a pretty high bar for someone to contribute an
>> improvement, and I think the back compat requirement is a big part of
>> that.
>> --
>> Robert Muir
>> rcmuir@gmail.com <ma...@gmail.com>
>
> That's why we are talking about a proposal that would change our back compat
> commitments.
>
> --
> - Mark
>
> http://www.lucidimagination.com
>

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/26/10 8:23 AM, Robert Muir wrote:
>
>
> On Mon, Apr 26, 2010 at 8:15 AM, Mark Miller <markrmiller@gmail.com
> <ma...@gmail.com>> wrote:
>
>     It's not that simple. If you want to commit a patch without having
>     it reverted, you *do* have to do certain things - currently, you
>     have to attempt back compat. Or don't commit. You guys seem to think
>     it's a free for all. It's obviously not. There are general guidelines
>     we all follow, formed by consensus. Committers are not just doing
>     whatever they want.
>
>
> And we should make it easier for people to contribute. Perhaps back
> compat is a feature that isn't everyone's itch to scratch, you know,
> just like any other feature. This would make it easier for people to
> contribute/commit improvements to lucene (into the unstable only), and
> perhaps there would be more committers and contributors after a while.
>
> Right now it's a pretty high bar for someone to contribute an
> improvement, and I think the back compat requirement is a big part of that.
> --
> Robert Muir
> rcmuir@gmail.com <ma...@gmail.com>

That's why we are talking about a proposal that would change our back 
compat commitments.

-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Mon, Apr 26, 2010 at 8:15 AM, Mark Miller <ma...@gmail.com> wrote:
>
> It's not that simple. If you want to commit a patch without having it
> reverted, you *do* have to do certain things - currently, you have to
> attempt back compat. Or don't commit. You guys seem to think it's a free for
> all. It's obviously not. There are general guidelines we all follow, formed
> by consensus. Committers are not just doing whatever they want.
>
>
And we should make it easier for people to contribute. Perhaps back compat
is a feature that isn't everyone's itch to scratch, you know, just like any
other feature. This would make it easier for people to contribute/commit
improvements to lucene (into the unstable only), and perhaps there would be
more committers and contributors after a while.

Right now it's a pretty high bar for someone to contribute an improvement,
and I think the back compat requirement is a big part of that.
-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/26/10 7:57 AM, Robert Muir wrote:
>
>
> On Sun, Apr 25, 2010 at 4:31 PM, Mark Miller <markrmiller@gmail.com
> <ma...@gmail.com>> wrote:
>
>
>     I still don't like "just seeing what happens". What we all agree
>     should be best for users as well as devs is not always going to be
>     in alignment with letting the chips fall where they may. I don't
>     think seeing whether committers just do something
>     individually/naturally is a good test for whether we have the right
>     goals for the project.
>
>
> Committers don't have to do anything at all.  Please remember, Lucene
> isn't a commercial product, it's an open source project.
>

It's not that simple. If you want to commit a patch without having it 
reverted, you *do* have to do certain things - currently, you have to 
attempt back compat. Or don't commit. You guys seem to think it's a free
for all. It's obviously not. There are general guidelines we all follow,
formed by consensus. Committers are not just doing whatever they want.


-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Sun, Apr 25, 2010 at 4:31 PM, Mark Miller <ma...@gmail.com> wrote:

>
> I still don't like "just seeing what happens". What we all agree should be
> best for users as well as devs is not always going to be in alignment with
> letting the chips fall where they may. I don't think seeing whether
> committers just do something individually/naturally is a good test for
> whether we have the right goals for the project.
>
>
Committers don't have to do anything at all.  Please remember, Lucene isn't
a commercial product, it's an open source project.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/25/10 4:10 PM, Shai Erera wrote:
> I think that we agree in principle about the policy change. We seem to
> disagree only on where the default dev should be: trunk or
> branch.

Right.

>
> So why not put it up to the test? Let's declare that all dev happens
> on trunk. If few features are backported - then it means (probably)
> that there is not much interest in backporting. If many features are
> backported, it means not only people want to backport, but also
> committers are willing to help do that. Feels to me like a win-win
> situation for both arguments.

I still don't like "just seeing what happens". What we all agree should 
be best for users as well as devs is not always going to be in alignment 
with letting the chips fall where they may. I don't think seeing whether 
committers just do something individually/naturally is a good test for 
whether we have the right goals for the project.

Deciding if we should have the back compat police is not something we
would do by saying "Okay, no more back compat policy - let's just see if
devs do back compat on their own or not - if they do, we bring back back
compat". Natural inclinations and unofficial 'policy' are two very
different things - and I think both are important. Natural inclination
doesn't sound exactly like the right test for official/unofficial/loose
policies that are meant to guide natural inclinations. Despite Robert's
complaining, we don't have many, but we do have some - without our back
compat policy, upgrading code that extensively uses Lucene would have
been extremely difficult. When you are avoiding deprecation, and
deprecation releases and increasing the number and size of complicated
changes, you make upgrading more and more difficult. 2.9 to 3.0 was
already no cakewalk, but if you did it right, Lucene walks you right
through all the changes you need to grapple with, and points you along
the path. This will all essentially go away. The least we can do is
provide solid stable releases, and be informally dedicated to doing
such - by putting all dev that fits into stable.

>
> So instead of being strict right from the start, we try to be more
> agile and responsive. We don't unnecessarily burden the committers
> with the tedious job of merging svn every time, but instead verify
> first that what we think is requested by the people, actually is
> requested.

I'm totally against this whole "request" idea. Things that make sense to
go to stable should go to both. Things that make sense to go to unstable
should go there. Some people I think put dev annoyances over a good user
experience - personally I think there should be more of a balance.
What's easy for devs is not always easy for users - otherwise this would
have been a free for all from the beginning. We would have done away
with all of the 'burdens', 'annoyances', and 'tediousness'. And just
shoved them off to users that were trying to upgrade their code to use
the latest Lucene. The two need to be balanced.

>
> As Mark indicated, the current back-compat policy was never voted on.
> Instead it made sense and thus was applied. Maybe we're wrong :)?

No, we were not wrong. That back compat policy has worked out for a long
time. Evolving policy is going to be natural as the situation changes.
That doesn't mean it should evolve to nothing or anything just because
it's changing. And the current back compat policy was not officially
voted on (which only the PMC can do), but it was voted on with action
over time. Same as it's been loosened over time.

>
> Let's be bold, do some change and then reflect back on it and decide
> if it was smart or not. It's not like the process is irreversible - on
> the contrary, what I'm suggesting is essentially what's done today
> (svn-wise) and thus can easily be changed in the future. While if we
> start w/ Mike's proposal, changing it would be more confusing to the
> people, and will probably also generate such a huge thread …

But I don't agree that the path you want is the right path. So why be 
"bold" about it?

What would be "confusing to the people"?

How is this the process we currently use? The current process is that 
bugs that should be back ported are back ported. Sometimes there are
requests for further back ports, but the general policy is to back port 
what makes sense, not what is requested. And that's been being done for 
some time.

>
> Shai
>
> On Sunday, April 25, 2010, Mark Miller<ma...@gmail.com>  wrote:
>> On 4/25/10 1:43 PM, Michael McCandless wrote:
>>
>>
>> Changes that go into stable need to be merged to unstable, maybe
>> periodically swept or maybe merged up by the original committer or
>> likely some combination (like flex).
>>
>> (And, yes we'll still use other branches for big new features that are
>> in active development).
>>
>> Mike
>>
>>
>>
>> I'm still +1 on all the proposals you have made. And still -1 to most of the attempted tweaks on them that have been proposed :)
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>

-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I think that we agree in principle about the policy change. We seem to
disagree only on where the default dev should be: trunk or
branch.

So why not put it up to the test? Let's declare that all dev happens
on trunk. If few features are backported - then it means (probably)
that there is not much interest in backporting. If many features are
backported, it means not only people want to backport, but also
committers are willing to help do that. Feels to me like a win-win
situation for both arguments.

So instead of being strict right from the start, we try to be more
agile and responsive. We don't unnecessarily burden the committers
with the tedious job of merging svn every time, but instead verify
first that what we think is requested by the people, actually is
requested.

As Mark indicated, the current back-compat policy was never voted on.
Instead it made sense and thus was applied. Maybe we're wrong :)?

Let's be bold, do some change and then reflect back on it and decide
if it was smart or not. It's not like the process is irreversible - on
the contrary, what I'm suggesting is essentially what's done today
(svn-wise) and thus can easily be changed in the future. While if we
start w/ Mike's proposal, changing it would be more confusing to the
people, and will probably also generate such a huge thread …

Shai

On Sunday, April 25, 2010, Mark Miller <ma...@gmail.com> wrote:
> On 4/25/10 1:43 PM, Michael McCandless wrote:
>
>
> Changes that go into stable need to be merged to unstable, maybe
> periodically swept or maybe merged up by the original committer or
> likely some combination (like flex).
>
> (And, yes we'll still use other branches for big new features that are
> in active development).
>
> Mike
>
>
>
> I'm still +1 on all the proposals you have made. And still -1 to most of the attempted tweaks on them that have been proposed :)
>
> --
> - Mark
>
> http://www.lucidimagination.com
>

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

Of course - it's all a matter of degree. Trunk is not going to be 'super'
unstable - of course it will have to compile. It's still going to be
mostly like the current trunk and of course pass tests. Stable will
just be *more* stable than trunk. We are only using the terms 
relatively. Stable will have backwards compatibility and fewer large, 
hairy changes. It will be more evolutionary while trunk will be more 
revolutionary.


On 4/25/10 1:56 PM, Shai Erera wrote:
> A clarification - stable/unstable are not really related to the
> 'stability' of the branch, right? I mean both trunk and branch will be
> stable in the sense that tests pass and I can safely use either of
> them. We're not going to start checking in code which is in the middle
> of dev, does not compile, tests fail etc. Right? Or did I
> misunderstand the intention?
>
> Shai
>
> On Sunday, April 25, 2010, Michael McCandless<lu...@mikemccandless.com>  wrote:
>> OK, so forget that suggestion :)
>>
>> It sounds like the best way to manage the branches is trunk = unstable
>> (X.0 major release), and long lived branch for ongoing stable dev
>> (X.Y.0 minor releases), and possible bug fix branches off of that
>> (X.Y.Z), on demand if needed.
>>
>> But the intention here is not to abandon the stable branch as soon as
>> it's cut.  I expect many changes/devs will work mostly on the stable
>> branch since that branch does [minor] releases much more often than
>> the unstable branch.
>>
>> Changes that go into stable need to be merged to unstable, maybe
>> periodically sweeped or maybe merged up by the original committer or
>> likely some combination (like flex).
>>
>> (And, yes we'll still use other branches for big new features that are
>> in active development).
>>
>> Mike
>>
>> On Sun, Apr 25, 2010 at 1:21 PM, Earwin Burrfoot<ea...@gmail.com>  wrote:
>>>> And, it's not the committer's job to port each little commit to stable
>>>> over to the unstable branch.  Instead, we periodically re-sync stable
>>>> -->  unstable, like we did with the long-lived flex branch.
>>>>
>>>> So, then, little would change on how stable is developed, today.  And
>>>> stable would still be the primary source line for development.
>>>
>>> -1
>>>
>>> And now there's no place I can go for latest-and-greatest. Some
>>> features in stable, other features in unstable - do I have to merge
>>> locally if I need all of them?
>>> If we have some new flex-calibre developments, they warrant their own
>>> branch, as they are totally unusable whilst in the works.
>>>
>>> The shining point of unstable is not that you can shove some flexy
>>> stuff there. It's that you can tweak APIs without regard to backwards
>>> compat, and have generally cleaner codebase.
>>>
>>>
>>> --
>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>> ICQ: 104465785
>>>


-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

A clarification - stable/unstable are not really related to the
'stability' of the branch, right? I mean both trunk and branch will be
stable in the sense that tests pass and I can safely use either of
them. We're not going to start checking in code which is in the middle
of dev, does not compile, tests fail etc. Right? Or did I
misunderstand the intention?

Shai

On Sunday, April 25, 2010, Michael McCandless <lu...@mikemccandless.com> wrote:
> OK, so forget that suggestion :)
>
> It sounds like the best way to manage the branches is trunk = unstable
> (X.0 major release), and long lived branch for ongoing stable dev
> (X.Y.0 minor releases), and possible bug fix branches off of that
> (X.Y.Z), on demand if needed.
>
> But the intention here is not to abandon the stable branch as soon as
> it's cut.  I expect many changes/devs will work mostly on the stable
> branch since that branch does [minor] releases much more often than
> the unstable branch.
>
> Changes that go into stable need to be merged to unstable, maybe
> periodically sweeped or maybe merged up by the original committer or
> likely some combination (like flex).
>
> (And, yes we'll still use other branches for big new features that are
> in active development).
>
> Mike
>
> On Sun, Apr 25, 2010 at 1:21 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>> And, it's not the committer's job to port each little commit to stable
>>> over to the unstable branch.  Instead, we periodically re-sync stable
>>> --> unstable, like we did with the long-lived flex branch.
>>>
>>> So, then, little would change on how stable is developed, today.  And
>>> stable would still be the primary source line for development.
>>
>> -1
>>
>> And now there's no place I can go for latest-and-greatest. Some
>> features in stable, other features in unstable - do I have to merge
>> locally if I need all of them?
>> If we have some new flex-calibre developments, they warrant their own
>> branch, as they are totally unusable whilst in the works.
>>
>> The shining point of unstable is not that you can shove some flexy
>> stuff there. It's that you can tweak APIs without regard to backwards
>> compat, and have generally cleaner codebase.
>>
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

Having read the entire thread as it's come in, my head is spinning. It is hard to keep up with the ideas and proposals. Throughout this thread my mindset has changed, more than once. And may change again ;)

To that end I'd like to make some end-user observations and thoughts:
(I had thought that this would be a quick few bullets, but it isn't. Sorry for that...)

Most of this thread deals with implementation not requirement. As I see it the bw compat policy is the implementation of what I think the real requirement is: that user upgrades be straightforward, simple, obvious, ..... but I don't think the requirement should be that it be painless.

I think it is important to keep the cost of doing an upgrade fairly low. If not some might abandon Lucene. I've seen people abandon Windows Vista for Mac, InterViews 2.0 for other GUI toolkits, ..... I just think there are other ways to do it w/o maintaining the current bw compat policy.

There is a proposal of an index upgrade tool. I think that satisfies the bw compat for the index structure.

The practice of drop-in jar compatibility is nice, but I think it is potentially dangerous. It might lull end-users into a false sense of security and without a deep understanding of what or how to test it might yield surprising results. I think that this is especially true with the tokenizers/analyzers/filters.

And "drop-in" compat has a tendency to move us away from the best class names. (My druthers is to keep the best class names with the best implementation, renaming the existing class to XxxYyyOld, or maybe replacing Old with a version number, e.g. XxxYyy3_0.) Currently, we use @deprecated as an upgrade advisor. Most deprecations give the suggested replacement. So in a way, it is merely a mapping from old signature/pattern to new signature/pattern. I don't think it would be too hard to create an upgrade advisor in perl or php that, given a source tree, looks through Java code for old signatures, identifies potential problems and suggests new signatures. (I'm not volunteering ;)
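
To make the "mapping from old signature to new signature" idea concrete, here is a minimal sketch in Java rather than perl or php. Everything in it is an assumption for illustration: the class name UpgradeAdvisorSketch, the advise method, and the rule entries (OldApi, NewApi, XxxYyyOld) are hypothetical and are not real Lucene classes. A real tool would presumably derive its rules from the @deprecated Javadoc instead of hard-coding them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpgradeAdvisorSketch {
    // Hypothetical rules: old signature fragment -> suggested replacement.
    // These names are placeholders, not real Lucene APIs.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("OldApi.doThing(", "NewApi.doThing(");
        RULES.put("new XxxYyyOld(", "new XxxYyy(");
    }

    // Scans source lines and returns one advisory message per rule match.
    public static List<String> advise(List<String> sourceLines) {
        List<String> advice = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            for (Map.Entry<String, String> rule : RULES.entrySet()) {
                if (sourceLines.get(i).contains(rule.getKey())) {
                    advice.add("line " + (i + 1) + ": found '" + rule.getKey()
                            + "' -> consider '" + rule.getValue() + "'");
                }
            }
        }
        return advice;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "int x = 1;",
                "OldApi.doThing(x);");
        for (String message : advise(source)) {
            System.out.println(message);
        }
    }
}
```

Plain substring matching like this would miss renamed imports and overloaded methods, so it only flags candidates for a human to review; it does not rewrite anything.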

Also, having watched the changes by Robert et al. to the analyzers, I'm now of the mindset that one cannot maintain bw compat of analyzers (either core or contrib). There are too many conditions for it to be true. Several things conspire together to create a "token stream": the version of the OS, the version of Java (i.e. the version/implementation of Unicode) and the implementation of the tokenizers, stemmers, stop-words and filters and the order they are implemented in an analyzer. If any of these change, all bets are off. It's likely to work for most situations, but not for others. A prime example of this is that if the machine's Locale changes, then every place that uses String.toLowerCase/toUpperCase is at risk. Fixing the bug is the right thing to do. An end-user upgrading without reindexing may be the wrong thing to do.
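
The Locale point can be shown with a small sketch using only the standard Java library. The class name LocaleLowercaseDemo and the lower helper are made up for this example. It compares lowercasing under a locale-neutral setting (Locale.ROOT) with lowercasing under a Turkish locale, where the capital 'I' maps to the dotless 'ı' (U+0131).

```java
import java.util.Locale;

public class LocaleLowercaseDemo {
    // Lowercases with an explicit locale, so the result does not
    // depend on the default locale of the machine running the code.
    public static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String input = "TITLE";
        String neutral = lower(input, Locale.ROOT);
        String turkish = lower(input, Locale.forLanguageTag("tr-TR"));

        System.out.println(neutral.equals("title"));       // true
        System.out.println(turkish.equals("t\u0131tle"));  // true: 'I' became dotless U+0131
        System.out.println(neutral.equals(turkish));       // false: the two tokens no longer match
    }
}
```

The no-argument String.toLowerCase() uses the JVM default locale, so a term indexed on one machine and queried on another could differ in exactly this way, which is why reindexing may be needed after such a fix.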

I think the suggestion that an end user mix and match a particular "analyzer" from a prior release with the current release is viable, as long as the API doesn't change. (BTW, I like the suggestion that the API be minimized and stabilized as an iterator over tokens with attributes. Something that the current bw compat policy prevented or made too hard.)

I don't think it matters to an end-user whether trunk or branch is used for "best" code. Most will wait until a release. I don't think it matters much what pattern other projects use other than as a learning experience. A development roadmap should be sufficient to explain what the choice is. I think anyone who will get pre-release Lucene from source code control is smart enough to figure it out.

The concern that "stable" and "unstable" will drift is probably over-blown. (I don't like these labels, and would rather have "next/best" and "current" as descriptions.) I'm really impressed with the caliber of commitment the committers have to keeping Lucene the best and most useful search engine. The practice of consensus via patches in Jira just reinforces this. It is a simple matter for someone to say, please don't commit this patch to "best" until a corresponding "current" patch is created. I guess I expect that important changes to "best" will be wanted in "current".

I think that it may be best to have a 4.0 release to cause this to happen.

That is to say:
o Keep a clear, well-defined migration path that a competent engineer can perform in a short period of time.
o Provide tools, as needed, to assist in the migration (e.g. an index structure upgrade tool, an "upgrade advisor")
o Maintain index structure compatibility w/in a major release cycle.

-- DM

On Apr 25, 2010, at 3:01 PM, Mark Miller wrote:

> On 4/25/10 1:43 PM, Michael McCandless wrote:
> 
>> Changes that go into stable need to be merged to unstable, maybe
>> periodically sweeped or maybe merged up by the original committer or
>> likely some combination (like flex).
>> 
>> (And, yes we'll still use other branches for big new features that are
>> in active development).
>> 
>> Mike
>> 
> 
> I'm still +1 on all the proposals you have made. And still -1 to most of the attempted tweaks on them that have been proposed :)
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/25/10 1:43 PM, Michael McCandless wrote:

> Changes that go into stable need to be merged to unstable, maybe
> periodically sweeped or maybe merged up by the original committer or
> likely some combination (like flex).
>
> (And, yes we'll still use other branches for big new features that are
> in active development).
>
> Mike
>

I'm still +1 on all the proposals you have made. And still -1 to most of 
the attempted tweaks on them that have been proposed :)

-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

OK, so forget that suggestion :)

It sounds like the best way to manage the branches is trunk = unstable
(X.0 major release), and long-lived branch for ongoing stable dev
(X.Y.0 minor releases), and possible bug fix branches off of that
(X.Y.Z), on demand if needed.

But the intention here is not to abandon the stable branch as soon as
it's cut.  I expect many changes/devs will work mostly on the stable
branch since that branch does [minor] releases much more often than
the unstable branch.

Changes that go into stable need to be merged to unstable, maybe
periodically swept or maybe merged up by the original committer or
likely some combination (like flex).

(And, yes we'll still use other branches for big new features that are
in active development).

Mike

On Sun, Apr 25, 2010 at 1:21 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> And, it's not the committer's job to port each little commit to stable
>> over to the unstable branch.  Instead, we periodically re-sync stable
>> --> unstable, like we did with the long-lived flex branch.
>>
>> So, then, little would change on how stable is developed, today.  And
>> stable would still be the primary source line for development.
>
> -1
>
> And now there's no place I can go for latest-and-greatest. Some
> features in stable, other features in unstable - do I have to merge
> locally if I need all of them?
> If we have some new flex-calibre developments, they warrant their own
> branch, as they are totally unusable whilst in the works.
>
> The shining point of unstable is not that you can shove some flexy
> stuff there. It's that you can tweak APIs without regard to backwards
> compat, and have generally cleaner codebase.
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Sun, Apr 25, 2010 at 11:48 AM, Shai Erera <se...@gmail.com> wrote:

>
> From what I see, we behave very much like a product. The new release
> is always bw compatible, exceptions are allowed when they are well
> documented, and no one makes any decisions lightly around here.
>
> This is not very open sourcy :). No one here gets paid, we don't share
> Lucene dividents w/ the committers and contributors ...
>
>
Thanks Shai, I think you said it best. This is the problem I have with
policies, rules, or whatever for back-compat or stable-porting.

People should be able to contribute what they are able to.

And another problem I have with 'stable' being trunk: in the past trunk has
become 'frozen' for releases and such (I think pretty much all real
development stopped around the 3.0 timeframe for a really long time).

I think this is extremely unhealthy and we should never do this again.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> And, it's not the committer's job to port each little commit to stable
> over to the unstable branch.  Instead, we periodically re-sync stable
> --> unstable, like we did with the long-lived flex branch.
>
> So, then, little would change on how stable is developed, today.  And
> stable would still be the primary source line for development.

-1

And now there's no place I can go for latest-and-greatest. Some
features in stable, other features in unstable - do I have to merge
locally if I need all of them?
If we have some new flex-calibre developments, they warrant their own
branch, as they are totally unusable whilst in the works.

The shining point of unstable is not that you can shove some flexy
stuff there. It's that you can tweak APIs without regard to backwards
compat, and have a generally cleaner codebase.


-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/25/10 11:48 AM, Shai Erera wrote:
> I think that it's a bit naive to think that stable and unstable will
> remain that close to each other such that merging them every so often
> is going to be an even remotely easy task.
>
> Mark - you recommend that we look at other projects and how it's done
> there. Well, I don't know too many open source projects, but I've
> participated in several products development. From my experience
> there, after a release is cut, a new branch is created (for that
> release). All new development happens on the next release's branch
> (trunk in our case) and usually only bug fixes are developed on the
> previous branch (and often on several such branches). Sometimes, a new
> feature/enhancement which is being developed on trunk is pulled to the
> previous branch on demand.

That really depends and varies. If you want to look at what others do you 
can probably find close to whatever you want. Just remember that Lucene 
is a library.

>
> Now, backward compatibility on trunk is a must, but at a slightly
> different level than we know it. The customer must be able to upgrade
> his env. to the newest release  w/o losing functionality. Exceptions
> are allowed, but they have to be well documented.
>
>  From what I see, we behave very much like a product. The new release
> is always bw compatible, exceptions are allowed when they are well
> documented, and no one makes any decisions lightly around here.

I think we behave like a widely used library.

>
> This is not very open sourcy :). No one here gets paid, we don't share
> Lucene dividents w/ the committers and contributors ...

I don't follow...

>
> I'm not saying that we should throw back compat out the door entirely,
> but some relaxation must be allowed … today, trunk is already back
> compat w/ exceptions. Can't we decide instead of developing on two
> branches that we will sometimes allow for larger bw breaks, when it
> makes sense? Why are we trying to have an "all or nothing" here?

If you follow what I have argued, this makes no sense. When did I ever 
say all or nothing? I'm so far from that, I just don't know what to make 
of this paragraph.

>
> And Mark, as soon as we allow trunk to break loose, it won't be like
> merging flex and trunk today - both were developed w/ bw support in
> mind …

Most patches are not that difficult to port. Back compat with flex was 
many times two different paths. So you would merge changes almost in two 
steps - first almost without considering back compat, and then the back 
compat way. The flex branch experience tells us a lot about how things 
might work with different branches in the future. The situations won't 
be identical, but there is certainly a lot of overlap. IMO most patches 
will be easy to handle - much easier than dealing with the old back 
compat "policy".

>
> One other option - let's adopt the way of projects like Java. You
> don't see every 1.6 feature being ported to 1.5 right? Not every 7
> gets to 6 … rather, developmen happens in parallel on several
> branches. Each is fully managed on its own and at some point some
> versions are simply not supported anymore…

This works better with larger projects. This could be a first step 
towards such a world, but with the number of devs we have, starting at 
two main branches seems like a good starting point.

>
> it would make a lot of sense to declare flex as 4.0, along w/
> analyzers, parallel indexing and even incremental field updates. It
> doesn't mean though that every new feature that is contributed to
> trunk MUST be backported to 3.x. It can be done on demand, on a
> volunteering basis or simply an interest. We may also want to prevent
> such a thing from happening for several features …

I don't agree with that obviously. This is what I argue against.

>
> Just tossing ideas …
>
> Shai
>
> On Sunday, April 25, 2010, Mark Miller<ma...@gmail.com>  wrote:
>> On 4/25/10 9:55 AM, Robert Muir wrote:
>>
>>
>>
>> On Sun, Apr 25, 2010 at 9:30 AM, Mark Miller<markrmiller@gmail.com
>> <ma...@gmail.com>>  wrote:
>>
>>
>>      Could you elaborate on "it doesn't help anything"? That's an
>>      interesting argument, but not very persuasive :) "It doesn't help
>>      anything other than easing Mark's paranoia" :)
>>
>>
>> The only "advantage" to this idea is it seems to try to enforce putting
>> features in stable, but thats stupid. At the end you still have two
>> branches, you can call whichever one trunk you want, it doesn't really
>> matter. if someone doesn't want to do the work to backport something to
>> stable, they just aren't going to do it.
>>
>>
>> I may be misunderstanding, but this sounds like a call for "free for all" because everyone will do what they want anyway. But that's not generally how things work. Devs don't do whatever they want. They largely stick to some common practices (largely back compat). Not everyone has always agreed with how things have worked, but most have, and it has framed development.
>>
>> I think you take policy too seriously. From what I was told, our back compat policy was simply extracted from an email from Doug when in the early days of Lucene. It's just happened to have made its way to the wiki, and enough devs have tried to stick to what it says. It's not some all powerful policy - we have subverted it all the time - but it has framed development and created a lot of really good Lucene releases that where pretty easy to migrate across. People have generally agreed on the back compat policy due to a large amount of discussion in the past. Its been argued back and forth, but to a large extent we have stuck with it for whatever reason. There is no doubt its had a powerful affect on Lucene over the years, whether positive or negative is up for debate, but I've been pretty happy with how Lucene has progressed myself. Now it looks like its time to change how we frame development, but I don't find myself thinking, "who cares how we do it - devs will do whatever they want anyway". Because they won't. They will do what the majority of others are doing - so as we talk about making this change its important to learn which way the other devs are leaning, and hammer out some common goals. Figure out a little consensus. If I'm the only one that's "paranoid" about this, doesn't seem you have much to worry about.
>>
>> It would be easy to see different results from this change - we could go the way some are talking about and do very few back ports to stable, and essentially every release breaks back compat as it wants. Or we could concentrate more on stable releases, while doing more radical dev on trunk. It almost sounds to me that you think that it doesn't matter which way people prefer, because everyone will do what they want anyway. Well I disagree. I think its important to discuss which way we may end up with, because I think one of the ways is better for Lucene - and I don't think devs do whatever they want. The general common agreement about how things are done largely drives what devs do. We are talking about changing that agreement - I don't have paranoia - I want to discuss where we will end up because I think its an important change to Lucene, and its important to try and see how different devs feel, and what frame of mind they are going to go into this with. That will help guide what actually happens. I know you don't think that's important, and I apologize for disagreeing with you.
>>
>> i'm waiting for the proposal
>>
>> that adds some "policy" about this, that would be very lucene-like.
>>
>>
>> Yeah, because Lucene has so many polices. The backcompat policy is called 'policy' for convenience - its never been voted in, its not an 'official' policy, we break that policy all the time. Its more consensus on how things are done than policy - you've seen that by now I hope. This discussion is also about coming up with consensus. I''m going to call you paranoid about policies in a minute :)
>>
>>
>>
>> and for any feature where someone is willing to do the work for it to be
>> in stable or unstable, its gonna have to be committed twice, by someone,
>> somewhere.
>>
>>
>> Yeah, well sounds like right now we have a couple options to talk about - consensus that we generally commit to both at the same time, or consenus that we merge occasionally instead. The models actually have a lot of differences. And likely there would be some mergers that did it often (like with flex), so that fewer devs might be backporting. The other way you would generally be counted on to back port all your own stuff.
>>
>>
>>
>>
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com<ma...@gmail.com>
>>
>>
>>
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>


-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I think that it's a bit naive to think that stable and unstable will
remain that close to each other such that merging them every so often
is going to be an even remotely easy task.

Mark - you recommend that we look at other projects and how it's done
there. Well, I don't know too many open source projects, but I've
participated in the development of several products. From my experience
there, after a release is cut, a new branch is created (for that
release). All new development happens on the next release's branch
(trunk in our case) and usually only bug fixes are developed on the
previous branch (and often on several such branches). Sometimes, a new
feature/enhancement which is being developed on trunk is pulled to the
previous branch on demand.

Now, backward compatibility on trunk is a must, but at a slightly
different level than we know it. The customer must be able to upgrade
his env. to the newest release w/o losing functionality. Exceptions
are allowed, but they have to be well documented.

From what I see, we behave very much like a product. The new release
is always bw compatible, exceptions are allowed when they are well
documented, and no one makes any decisions lightly around here.

This is not very open sourcy :). No one here gets paid, we don't share
Lucene dividends w/ the committers and contributors ...

I'm not saying that we should throw back compat out the door entirely,
but some relaxation must be allowed … today, trunk is already back
compat w/ exceptions. Can't we decide instead of developing on two
branches that we will sometimes allow for larger bw breaks, when it
makes sense? Why are we trying to have an "all or nothing" here?

And Mark, as soon as we allow trunk to break loose, it won't be like
merging flex and trunk today - both were developed w/ bw support in
mind …

One other option - let's adopt the way of projects like Java. You
don't see every 1.6 feature being ported to 1.5, right? Not every 7
gets to 6 … rather, development happens in parallel on several
branches. Each is fully managed on its own and at some point some
versions are simply not supported anymore…

It would make a lot of sense to declare flex as 4.0, along w/
analyzers, parallel indexing and even incremental field updates. It
doesn't mean though that every new feature that is contributed to
trunk MUST be backported to 3.x. It can be done on demand, on a
volunteering basis or simply an interest. We may also want to prevent
such a thing from happening for several features …

Just tossing ideas …

Shai

On Sunday, April 25, 2010, Mark Miller <ma...@gmail.com> wrote:
> On 4/25/10 9:55 AM, Robert Muir wrote:
>
>
>
> On Sun, Apr 25, 2010 at 9:30 AM, Mark Miller <markrmiller@gmail.com
> <ma...@gmail.com>> wrote:
>
>
>     Could you elaborate on "it doesn't help anything"? That's an
>     interesting argument, but not very persuasive :) "It doesn't help
>     anything other than easing Mark's paranoia" :)
>
>
> The only "advantage" to this idea is it seems to try to enforce putting
> features in stable, but thats stupid. At the end you still have two
> branches, you can call whichever one trunk you want, it doesn't really
> matter. if someone doesn't want to do the work to backport something to
> stable, they just aren't going to do it.
>
>
> I may be misunderstanding, but this sounds like a call for "free for all" because everyone will do what they want anyway. But that's not generally how things work. Devs don't do whatever they want. They largely stick to some common practices (largely back compat). Not everyone has always agreed with how things have worked, but most have, and it has framed development.
>
> I think you take policy too seriously. From what I was told, our back compat policy was simply extracted from an email from Doug when in the early days of Lucene. It's just happened to have made its way to the wiki, and enough devs have tried to stick to what it says. It's not some all powerful policy - we have subverted it all the time - but it has framed development and created a lot of really good Lucene releases that where pretty easy to migrate across. People have generally agreed on the back compat policy due to a large amount of discussion in the past. Its been argued back and forth, but to a large extent we have stuck with it for whatever reason. There is no doubt its had a powerful affect on Lucene over the years, whether positive or negative is up for debate, but I've been pretty happy with how Lucene has progressed myself. Now it looks like its time to change how we frame development, but I don't find myself thinking, "who cares how we do it - devs will do whatever they want anyway". Because they won't. They will do what the majority of others are doing - so as we talk about making this change its important to learn which way the other devs are leaning, and hammer out some common goals. Figure out a little consensus. If I'm the only one that's "paranoid" about this, doesn't seem you have much to worry about.
>
> It would be easy to see different results from this change - we could go the way some are talking about and do very few back ports to stable, and essentially every release breaks back compat as it wants. Or we could concentrate more on stable releases, while doing more radical dev on trunk. It almost sounds to me that you think that it doesn't matter which way people prefer, because everyone will do what they want anyway. Well I disagree. I think its important to discuss which way we may end up with, because I think one of the ways is better for Lucene - and I don't think devs do whatever they want. The general common agreement about how things are done largely drives what devs do. We are talking about changing that agreement - I don't have paranoia - I want to discuss where we will end up because I think its an important change to Lucene, and its important to try and see how different devs feel, and what frame of mind they are going to go into this with. That will help guide what actually happens. I know you don't think that's important, and I apologize for disagreeing with you.
>
> i'm waiting for the proposal
>
> that adds some "policy" about this, that would be very lucene-like.
>
>
> Yeah, because Lucene has so many polices. The backcompat policy is called 'policy' for convenience - its never been voted in, its not an 'official' policy, we break that policy all the time. Its more consensus on how things are done than policy - you've seen that by now I hope. This discussion is also about coming up with consensus. I''m going to call you paranoid about policies in a minute :)
>
>
>
> and for any feature where someone is willing to do the work for it to be
> in stable or unstable, its gonna have to be committed twice, by someone,
> somewhere.
>
>
> Yeah, well, it sounds like right now we have a couple of options to talk about - consensus that we generally commit to both at the same time, or consensus that we merge occasionally instead. The models actually have a lot of differences. With occasional merging, there would likely be some mergers who did it often (like with flex), so that fewer devs might be backporting. The other way, you would generally be counted on to back port all your own stuff.
>
>
>
>
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com <ma...@gmail.com>
>
>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/25/10 9:55 AM, Robert Muir wrote:
>
>
> On Sun, Apr 25, 2010 at 9:30 AM, Mark Miller <markrmiller@gmail.com
> <ma...@gmail.com>> wrote:
>
>
>     Could you elaborate on "it doesn't help anything"? That's an
>     interesting argument, but not very persuasive :) "It doesn't help
>     anything other than easing Mark's paranoia" :)
>
>
> The only "advantage" to this idea is it seems to try to enforce putting
> features in stable, but thats stupid. At the end you still have two
> branches, you can call whichever one trunk you want, it doesn't really
> matter. if someone doesn't want to do the work to backport something to
> stable, they just aren't going to do it.

I may be misunderstanding, but this sounds like a call for "free for 
all" because everyone will do what they want anyway. But that's not 
generally how things work. Devs don't do whatever they want. They 
largely stick to some common practices (largely back compat). Not 
everyone has always agreed with how things have worked, but most have, 
and it has framed development.

I think you take policy too seriously. From what I was told, our back
compat policy was simply extracted from an email from Doug in the
early days of Lucene. It just happened to make its way to the
wiki, and enough devs have tried to stick to what it says. It's not some
all-powerful policy - we have subverted it all the time - but it has
framed development and created a lot of really good Lucene releases that
were pretty easy to migrate across. People have generally agreed on the
back compat policy due to a large amount of discussion in the past. It's
been argued back and forth, but to a large extent we have stuck with it
for whatever reason. There is no doubt it's had a powerful effect on
Lucene over the years; whether positive or negative is up for debate,
but I've been pretty happy with how Lucene has progressed myself. Now it
looks like it's time to change how we frame development, but I don't find
myself thinking, "who cares how we do it - devs will do whatever they
want anyway". Because they won't. They will do what the majority of
others are doing - so as we talk about making this change it's important
to learn which way the other devs are leaning, and hammer out some
common goals. Figure out a little consensus. If I'm the only one that's
"paranoid" about this, it doesn't seem you have much to worry about.

It would be easy to see different results from this change - we could go
the way some are talking about and do very few back ports to stable, so
that essentially every release breaks back compat as it wants. Or we could
concentrate more on stable releases, while doing more radical dev on
trunk. It almost sounds to me that you think that it doesn't matter
which way people prefer, because everyone will do what they want anyway.
Well, I disagree. I think it's important to discuss which way we may end
up with, because I think one of the ways is better for Lucene - and I
don't think devs do whatever they want. The general common agreement
about how things are done largely drives what devs do. We are talking
about changing that agreement - I don't have paranoia - I want to
discuss where we will end up because I think it's an important change to
Lucene, and it's important to try and see how different devs feel, and
what frame of mind they are going to go into this with. That will help
guide what actually happens. I know you don't think that's important,
and I apologize for disagreeing with you.

> i'm waiting for the proposal
> that adds some "policy" about this, that would be very lucene-like.

Yeah, because Lucene has so many policies. The backcompat policy is
called 'policy' for convenience - it's never been voted in, it's not an
'official' policy, we break that policy all the time. It's more consensus
on how things are done than policy - you've seen that by now I hope.
This discussion is also about coming up with consensus. I'm going to
call you paranoid about policies in a minute :)

>
> and for any feature where someone is willing to do the work for it to be
> in stable or unstable, its gonna have to be committed twice, by someone,
> somewhere.

Yeah, well, it sounds like right now we have a couple of options to talk
about - consensus that we generally commit to both at the same time, or
consensus that we merge occasionally instead. The models actually have a
lot of differences. With occasional merging, there would likely be some
mergers who did it often (like with flex), so that fewer devs might be
backporting. The other way, you would generally be counted on to back
port all your own stuff.

>
>
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com <ma...@gmail.com>

-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Sun, Apr 25, 2010 at 9:30 AM, Mark Miller <ma...@gmail.com> wrote:

>
> Could you elaborate on "it doesn't help anything"? That's an interesting
> argument, but not very persuasive :) "It doesn't help anything other than
> easing Mark's paranoia" :)

The only "advantage" to this idea is it seems to try to enforce putting
features in stable, but that's stupid. At the end you still have two
branches, you can call whichever one trunk you want, it doesn't really
matter. If someone doesn't want to do the work to backport something to
stable, they just aren't going to do it. I'm waiting for the proposal that
adds some "policy" about this, that would be very lucene-like.

And for any feature where someone is willing to do the work for it to be in
stable or unstable, it's gonna have to be committed twice, by someone,
somewhere.

>
>

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/25/10 8:42 AM, Robert Muir wrote:
>
>
> On Sun, Apr 25, 2010 at 8:24 AM, Mark Miller <markrmiller@gmail.com
> <ma...@gmail.com>> wrote:
>
>     That sounds good to me too.
>
>
>
> This doesn't sound good to me. It doesn't help anything, except Mark's
> paranoia about stable getting features.

I'm up for the other way as well - but yeah, both make me feel fuzzy.

Could you elaborate on "it doesn't help anything"? That's an interesting 
argument, but not very persuasive :) "It doesn't help anything other 
than easing Mark's paranoia" :)

> And it hinders development and
> community by creating a fake trunk.

Developing on multiple branches in general hinders development - but
personally, I didn't find the flex branch to be hindered that badly. I'm
not too worried about the hindrance of committing to both branches at
the same time either - when it makes sense it's going to be fairly quick
- merging *everything* from trunk to flex didn't really take that long.
Many things will be much less work than the old system.

"fake trunk" depends on your definition of trunk. It has its downsides 
in comparison to keeping trunk extremely synced to stable, but it has 
its upsides as well. Patching to both branches makes me feel just as 
fuzzy as merging occasionally - take your pick.

>
> We could do the opposite instead, and merge a bunch of changes at once
> to stable, and not have a fake trunk like we did with flex.

Flex wasn't a fake trunk - it was a branch. Trunk was trunk. It's not an
uncommon way to develop.

In my opinion, the way we did the Flex branch was very successful. So
it's easy to agree with using some of that formula.

>
>
> --
> Robert Muir
> rcmuir@gmail.com <ma...@gmail.com>

-- 
- Mark

http://www.lucidimagination.com

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Sun, Apr 25, 2010 at 8:24 AM, Mark Miller <ma...@gmail.com> wrote:

> That sounds good to me too.
>
>
>
This doesn't sound good to me. It doesn't help anything, except Mark's
paranoia about stable getting features. And it hinders development and
community by creating a fake trunk.

We could do the opposite instead, and merge a bunch of changes at once to
stable, and not have a fake trunk like we did with flex.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

That sounds good to me too.

On 4/25/10 6:58 AM, Michael McCandless wrote:
> Maybe we can leave unstable on its own branch, and stable remains on
> trunk, like it is today?
>
> And, it's not the committer's job to port each little commit to stable
> over to the unstable branch.  Instead, we periodically re-sync stable
> -->  unstable, like we did with the long-lived flex branch.
>
> So, then, little would change on how stable is developed, today.  And
> stable would still be the primary source line for development.
>
> Unstable changes would happen on the unstable branch, and only be
> merged back to trunk when it's time for the next major release.
>
> Mike
>
> On Thu, Apr 22, 2010 at 11:31 AM, Mark Miller<ma...@gmail.com>  wrote:
>> Right - that sounds good to me. And when its a hairy change to back port, or
>> the change is just really invasive, or breaks back compat in way you have to
>> jump over hoops to put into stable - then you just put it in unstable. But
>> generally that is not most changes.
>>
>> On 04/22/2010 10:08 AM, Earwin Burrfoot wrote:
>>>
>>> Okay, let's live with parallel development, but make sure we 'always'
>>> port things from stable to trunk, and 'always' remove possible
>>> back-compat layers when doing such a port?
>>>
>>> On Thu, Apr 22, 2010 at 18:04, Mark Miller<ma...@gmail.com>    wrote:
>>>
>>>>
>>>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>>>>
>>>> I don't think features should be backported to stable on request. If we
>>>> go
>>>> this route, I think it should be a matter of course unless the feature is
>>>> hairy enough to warrant unstable.
>>>>
>>>> Saying we should do all dev on unstable, and only back port on request
>>>> (who
>>>> will police that? everyone will accept all requests?) and that we should
>>>> just release trunk more often to accommodate, is like saying, lets just
>>>> throw back compat out the window, every release will be free to break
>>>> back
>>>> compat, we will just release more often...
>>>>
>>>> Working on two branches won't be 100% joy, but loosening the existing
>>>> much
>>>> larger annoyance of back compat is not going to be free IMO. To me,
>>>> Shai's
>>>> proposal is essentially - lets keep everything the same, but release more
>>>> often (we have decided to that 100 times) and lose back compat
>>>> requirements.
>>>> Then if a dev takes pity on a user, perhaps one of the unstable releases
>>>> will get a backport of a feature.
>>>>
>>>> If we take that route, I am vehemently against changing our policy.
>>>>
>>>> On 4/22/10 9:52 AM, Shai Erera wrote:
>>>>
>>>> I was advocating that we always develop on trunk w/ no back-compat
>>>> support,
>>>> API-wise ... you could have developed flex w/ no bw support.
>>>>
>>>> Currently what you're proposing would cause most features to be developed
>>>> on
>>>> stable w/ bw support and trunk w/o. I propose to leave 'stable', develop
>>>> on
>>>> trunk w/ no bw support (except for index format) and back port features
>>>> "on
>>>> demand" to stable w/ bw support.
>>>>
>>>> So instead of forcing all development to go through stable + trunk, I
>>>> propose to go through trunk, and back port to stable only if requested.
>>>> In
>>>> the end we'll be in the same position (trunk having all features) except
>>>> for
>>>> stable which will include just those features of interest to other
>>>> people.
>>>>
>>>> Shai
>>>>
>>>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
>>>> <lu...@mikemccandless.com>    wrote:
>>>>
>>>>>
>>>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera<se...@gmail.com>    wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> The only downside is that we will need to do everything twice: once on
>>>>>> stable and once on trunk. I still think that most of the issues and
>>>>>> development don't affect bw at all and thus we'll always say "this
>>>>>> needs to go to stable and trunk" which will just be an annoyance and
>>>>>> complicate the life of the developers even more because not only will
>>>>>> we need to keep bw compat, we'll need to write the code for trunk as
>>>>>> well.
>>>>>>
>>>>>
>>>>> Well, most things.  Some features (eg flex would've been such a
>>>>> feature) will only happen in trunk.
>>>>>
>>>>> But, yes, this is a downside -- stable changes will have to be merged
>>>>> up to trunk.
>>>>>
>>>>>
>>>>>>
>>>>>> What if we always develop on trunk, release it more often, and if
>>>>>> requested or a committer needs it, we backport a certain feature to
>>>>>> stable?
>>>>>>
>>>>>
>>>>> This is what we do today, and I think what's broken about it is we are
>>>>> unable to make a big change that has major breaks from the start.
>>>>> Every big change is required to land on trunk with back compat intact.
>>>>>
>>>>> This is terribly costly for changes like the new analyzer API (Token
>>>>> ->    AttrSource migration), and flex.
>>>>>
>>>>> So with the new model, a big change like flex could land on trunk with
>>>>> no back compat, and age for a long time, along with other such
>>>>> changes, before being included in a major release.
>>>>>
>>>>> I'm not sure we'll release trunk (major releases) more often.  I think
>>>>> it could go both ways...
>>>>>
>>>>> For small changes, I think whether a given dev works on trunk and
>>>>> merges back to stable, or stable and merges forwards to trunk, is an
>>>>> individual choice...
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>>
>>
>
>


-- 
- Mark

http://www.lucidimagination.com

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi Mike,

> Maybe we can leave unstable on its own branch, and stable remains on
> trunk, like it is today?

This somehow reverses the idea behind the folders in SVN. Everybody looking for the latest and coolest things would always look in trunk, never in a branch. Almost all projects have the original SVN structure: they do stable development in a branches folder and the newest things in trunk.

From the point of view of merging it's not different at all. To SVN itself, trunk, branches, and tags are just folders with no other meaning. These folders are a standard way to maintain an SVN repository and we should do it like that. Merging always works, regardless of how you structure the dirs. You can even merge two branches or whatever.

So I am -1 on this and I am +1 on simply creating a stable branch like it always was and everybody expects. The only difference from the past is that we now also do development on the stable branch.

> And, it's not the committer's job to port each little commit to stable
> over to the unstable branch.  Instead, we periodically re-sync stable
> --> unstable, like we did with the long-lived flex branch.
> 
> So, then, little would change on how stable is developed, today.  And
> stable would still be the primary source line for development.

That is also true for a branch. See e.g. PHP development. They have a trunk, but at the moment nobody commits there (including me); we are maintaining branches 5.2 and 5.3 at the moment.

> Unstable changes would happen on the unstable branch, and only be
> merged back to trunk when it's time for the next major release.

See above.

Uwe

> On Thu, Apr 22, 2010 at 11:31 AM, Mark Miller <ma...@gmail.com>
> wrote:
> > Right - that sounds good to me. And when its a hairy change to back
> port, or
> > the change is just really invasive, or breaks back compat in way you
> have to
> > jump over hoops to put into stable - then you just put it in
> unstable. But
> > generally that is not most changes.
> >
> > On 04/22/2010 10:08 AM, Earwin Burrfoot wrote:
> >>
> >> Okay, let's live with parallel development, but make sure we
> 'always'
> >> port things from stable to trunk, and 'always' remove possible
> >> back-compat layers when doing such a port?
> >>
> >> On Thu, Apr 22, 2010 at 18:04, Mark Miller<ma...@gmail.com>
>  wrote:
> >>
> >>>
> >>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
> >>>
> >>> I don't think features should be backported to stable on request.
> If we
> >>> go
> >>> this route, I think it should be a matter of course unless the
> feature is
> >>> hairy enough to warrant unstable.
> >>>
> >>> Saying we should do all dev on unstable, and only back port on
> request
> >>> (who
> >>> will police that? everyone will accept all requests?) and that we
> should
> >>> just release trunk more often to accommodate, is like saying, lets
> just
> >>> throw back compat out the window, every release will be free to
> break
> >>> back
> >>> compat, we will just release more often...
> >>>
> >>> Working on two branches won't be 100% joy, but loosening the
> existing
> >>> much
> >>> larger annoyance of back compat is not going to be free IMO. To me,
> >>> Shai's
> >>> proposal is essentially - lets keep everything the same, but
> release more
> >>> often (we have decided to that 100 times) and lose back compat
> >>> requirements.
> >>> Then if a dev takes pity on a user, perhaps one of the unstable
> releases
> >>> will get a backport of a feature.
> >>>
> >>> If we take that route, I am vehemently against changing our policy.
> >>>
> >>> On 4/22/10 9:52 AM, Shai Erera wrote:
> >>>
> >>> I was advocating that we always develop on trunk w/ no back-compat
> >>> support,
> >>> API-wise ... you could have developed flex w/ no bw support.
> >>>
> >>> Currently what you're proposing would cause most features to be
> developed
> >>> on
> >>> stable w/ bw support and trunk w/o. I propose to leave 'stable',
> develop
> >>> on
> >>> trunk w/ no bw support (except for index format) and back port
> features
> >>> "on
> >>> demand" to stable w/ bw support.
> >>>
> >>> So instead of forcing all development to go through stable + trunk,
> I
> >>> propose to go through trunk, and back port to stable only if
> requested.
> >>> In
> >>> the end we'll be in the same position (trunk having all features)
> except
> >>> for
> >>> stable which will include just those features of interest to other
> >>> people.
> >>>
> >>> Shai
> >>>
> >>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
> >>> <lu...@mikemccandless.com>  wrote:
> >>>
> >>>>
> >>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera<se...@gmail.com>
>  wrote:
> >>>>
> >>>>
> >>>>>
> >>>>> The only downside is that we will need to do everything twice:
> once on
> >>>>> stable and once on trunk. I still think that most of the issues
> and
> >>>>> development don't affect bw at all and thus we'll always say
> "this
> >>>>> needs to go to stable and trunk" which will just be an annoyance
> and
> >>>>> complicate the life of the developers even more because not only
> will
> >>>>> we need to keep bw compat, we'll need to write the code for trunk
> as
> >>>>> well.
> >>>>>
> >>>>
> >>>> Well, most things.  Some features (eg flex would've been such a
> >>>> feature) will only happen in trunk.
> >>>>
> >>>> But, yes, this is a downside -- stable changes will have to be
> merged
> >>>> up to trunk.
> >>>>
> >>>>
> >>>>>
> >>>>> What if we always develop on trunk, release it more often, and if
> >>>>> requested or a committer needs it, we backport a certain feature
> to
> >>>>> stable?
> >>>>>
> >>>>
> >>>> This is what we do today, and I think what's broken about it is we
> are
> >>>> unable to make a big change that has major breaks from the start.
> >>>> Every big change is required to land on trunk with back compat
> intact.
> >>>>
> >>>> This is terribly costly for changes like the new analyzer API
> (Token
> >>>> ->  AttrSource migration), and flex.
> >>>>
> >>>> So with the new model, a big change like flex could land on trunk
> with
> >>>> no back compat, and age for a long time, along with other such
> >>>> changes, before being included in a major release.
> >>>>
> >>>> I'm not sure we'll release trunk (major releases) more often.  I
> think
> >>>> it could go both ways...
> >>>>
> >>>> For small changes, I think whether a given dev works on trunk and
> >>>> merges back to stable, or stable and merges forwards to trunk, is
> an
> >>>> individual choice...
> >>>>
> >>>> Mike
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> > --
> > - Mark
> >
> > http://www.lucidimagination.com
> >
> >
> >
> >
> >
> >
> 



Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Maybe we can leave unstable on its own branch, and stable remains on
trunk, like it is today?

And, it's not the committer's job to port each little commit to stable
over to the unstable branch.  Instead, we periodically re-sync stable
--> unstable, like we did with the long-lived flex branch.

So, then, little would change on how stable is developed, today.  And
stable would still be the primary source line for development.

Unstable changes would happen on the unstable branch, and only be
merged back to trunk when it's time for the next major release.

Mike

On Thu, Apr 22, 2010 at 11:31 AM, Mark Miller <ma...@gmail.com> wrote:
> Right - that sounds good to me. And when its a hairy change to back port, or
> the change is just really invasive, or breaks back compat in way you have to
> jump over hoops to put into stable - then you just put it in unstable. But
> generally that is not most changes.
>
> On 04/22/2010 10:08 AM, Earwin Burrfoot wrote:
>>
>> Okay, let's live with parallel development, but make sure we 'always'
>> port things from stable to trunk, and 'always' remove possible
>> back-compat layers when doing such a port?
>>
>> On Thu, Apr 22, 2010 at 18:04, Mark Miller<ma...@gmail.com>  wrote:
>>
>>>
>>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>>>
>>> I don't think features should be backported to stable on request. If we
>>> go
>>> this route, I think it should be a matter of course unless the feature is
>>> hairy enough to warrant unstable.
>>>
>>> Saying we should do all dev on unstable, and only back port on request
>>> (who
>>> will police that? everyone will accept all requests?) and that we should
>>> just release trunk more often to accommodate, is like saying, lets just
>>> throw back compat out the window, every release will be free to break
>>> back
>>> compat, we will just release more often...
>>>
>>> Working on two branches won't be 100% joy, but loosening the existing
>>> much
>>> larger annoyance of back compat is not going to be free IMO. To me,
>>> Shai's
>>> proposal is essentially - lets keep everything the same, but release more
>>> often (we have decided to that 100 times) and lose back compat
>>> requirements.
>>> Then if a dev takes pity on a user, perhaps one of the unstable releases
>>> will get a backport of a feature.
>>>
>>> If we take that route, I am vehemently against changing our policy.
>>>
>>> On 4/22/10 9:52 AM, Shai Erera wrote:
>>>
>>> I was advocating that we always develop on trunk w/ no back-compat
>>> support,
>>> API-wise ... you could have developed flex w/ no bw support.
>>>
>>> Currently what you're proposing would cause most features to be developed
>>> on
>>> stable w/ bw support and trunk w/o. I propose to leave 'stable', develop
>>> on
>>> trunk w/ no bw support (except for index format) and back port features
>>> "on
>>> demand" to stable w/ bw support.
>>>
>>> So instead of forcing all development to go through stable + trunk, I
>>> propose to go through trunk, and back port to stable only if requested.
>>> In
>>> the end we'll be in the same position (trunk having all features) except
>>> for
>>> stable which will include just those features of interest to other
>>> people.
>>>
>>> Shai
>>>
>>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
>>> <lu...@mikemccandless.com>  wrote:
>>>
>>>>
>>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera<se...@gmail.com>  wrote:
>>>>
>>>>
>>>>>
>>>>> The only downside is that we will need to do everything twice: once on
>>>>> stable and once on trunk. I still think that most of the issues and
>>>>> development don't affect bw at all and thus we'll always say "this
>>>>> needs to go to stable and trunk" which will just be an annoyance and
>>>>> complicate the life of the developers even more because not only will
>>>>> we need to keep bw compat, we'll need to write the code for trunk as
>>>>> well.
>>>>>
>>>>
>>>> Well, most things.  Some features (eg flex would've been such a
>>>> feature) will only happen in trunk.
>>>>
>>>> But, yes, this is a downside -- stable changes will have to be merged
>>>> up to trunk.
>>>>
>>>>
>>>>>
>>>>> What if we always develop on trunk, release it more often, and if
>>>>> requested or a committer needs it, we backport a certain feature to
>>>>> stable?
>>>>>
>>>>
>>>> This is what we do today, and I think what's broken about it is we are
>>>> unable to make a big change that has major breaks from the start.
>>>> Every big change is required to land on trunk with back compat intact.
>>>>
>>>> This is terribly costly for changes like the new analyzer API (Token
>>>> ->  AttrSource migration), and flex.
>>>>
>>>> So with the new model, a big change like flex could land on trunk with
>>>> no back compat, and age for a long time, along with other such
>>>> changes, before being included in a major release.
>>>>
>>>> I'm not sure we'll release trunk (major releases) more often.  I think
>>>> it could go both ways...
>>>>
>>>> For small changes, I think whether a given dev works on trunk and
>>>> merges back to stable, or stable and merges forwards to trunk, is an
>>>> individual choice...
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
>
>

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

Right - that sounds good to me. And when it's a hairy change to back
port, or the change is just really invasive, or breaks back compat in a
way that you have to jump through hoops to put into stable - then you just
put it in unstable. But generally that is not most changes.

On 04/22/2010 10:08 AM, Earwin Burrfoot wrote:
> Okay, let's live with parallel development, but make sure we 'always'
> port things from stable to trunk, and 'always' remove possible
> back-compat layers when doing such a port?
>
> On Thu, Apr 22, 2010 at 18:04, Mark Miller<ma...@gmail.com>  wrote:
>    
>> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>>
>> I don't think features should be backported to stable on request. If we go
>> this route, I think it should be a matter of course unless the feature is
>> hairy enough to warrant unstable.
>>
>> Saying we should do all dev on unstable, and only back port on request (who
>> will police that? everyone will accept all requests?) and that we should
>> just release trunk more often to accommodate, is like saying, lets just
>> throw back compat out the window, every release will be free to break back
>> compat, we will just release more often...
>>
>> Working on two branches won't be 100% joy, but loosening the existing much
>> larger annoyance of back compat is not going to be free IMO. To me, Shai's
>> proposal is essentially - lets keep everything the same, but release more
>> often (we have decided to that 100 times) and lose back compat requirements.
>> Then if a dev takes pity on a user, perhaps one of the unstable releases
>> will get a backport of a feature.
>>
>> If we take that route, I am vehemently against changing our policy.
>>
>> On 4/22/10 9:52 AM, Shai Erera wrote:
>>
>> I was advocating that we always develop on trunk w/ no back-compat support,
>> API-wise ... you could have developed flex w/ no bw support.
>>
>> Currently what you're proposing would cause most features to be developed on
>> stable w/ bw support and trunk w/o. I propose to leave 'stable', develop on
>> trunk w/ no bw support (except for index format) and back port features "on
>> demand" to stable w/ bw support.
>>
>> So instead of forcing all development to go through stable + trunk, I
>> propose to go through trunk, and back port to stable only if requested. In
>> the end we'll be in the same position (trunk having all features) except for
>> stable which will include just those features of interest to other people.
>>
>> Shai
>>
>> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
>> <lu...@mikemccandless.com>  wrote:
>>      
>>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera<se...@gmail.com>  wrote:
>>>
>>>        
>>>> The only downside is that we will need to do everything twice: once on
>>>> stable and once on trunk. I still think that most of the issues and
>>>> development don't affect bw at all and thus we'll always say "this
>>>> needs to go to stable and trunk" which will just be an annoyance and
>>>> complicate the life of the developers even more because not only will
>>>> we need to keep bw compat, we'll need to write the code for trunk as
>>>> well.
>>>>          
>>> Well, most things.  Some features (eg flex would've been such a
>>> feature) will only happen in trunk.
>>>
>>> But, yes, this is a downside -- stable changes will have to be merged
>>> up to trunk.
>>>
>>>        
>>>> What if we always develop on trunk, release it more often, and if
>>>> requested or a committer needs it, we backport a certain feature to
>>>> stable?
>>>>          
>>> This is what we do today, and I think what's broken about it is we are
>>> unable to make a big change that has major breaks from the start.
>>> Every big change is required to land on trunk with back compat intact.
>>>
>>> This is terribly costly for changes like the new analyzer API (Token
>>> ->  AttrSource migration), and flex.
>>>
>>> So with the new model, a big change like flex could land on trunk with
>>> no back compat, and age for a long time, along with other such
>>> changes, before being included in a major release.
>>>
>>> I'm not sure we'll release trunk (major releases) more often.  I think
>>> it could go both ways...
>>>
>>> For small changes, I think whether a given dev works on trunk and
>>> merges back to stable, or stable and merges forwards to trunk, is an
>>> individual choice...
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>        
>>
>>
>>      
>
>
>    


-- 
- Mark

http://www.lucidimagination.com





Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

Okay, let's live with parallel development, but make sure we 'always'
port things from stable to trunk, and 'always' remove possible
back-compat layers when doing such a port?

On Thu, Apr 22, 2010 at 18:04, Mark Miller <ma...@gmail.com> wrote:
> I'd vote -1 on Shai's variation and +1 on Mike's proposal.
>
> I don't think features should be backported to stable on request. If we go
> this route, I think it should be a matter of course unless the feature is
> hairy enough to warrant unstable.
>
> Saying we should do all dev on unstable, and only back port on request (who
> will police that? everyone will accept all requests?) and that we should
> just release trunk more often to accommodate, is like saying, lets just
> throw back compat out the window, every release will be free to break back
> compat, we will just release more often...
>
> Working on two branches won't be 100% joy, but loosening the existing much
> larger annoyance of back compat is not going to be free IMO. To me, Shai's
> proposal is essentially - lets keep everything the same, but release more
> often (we have decided to do that 100 times) and lose back compat requirements.
> Then if a dev takes pity on a user, perhaps one of the unstable releases
> will get a backport of a feature.
>
> If we take that route, I am vehemently against changing our policy.
>
> On 4/22/10 9:52 AM, Shai Erera wrote:
>
> I was advocating that we always develop on trunk w/ no back-compat support,
> API-wise ... you could have developed flex w/ no bw support.
>
> Currently what you're proposing would cause most features to be developed on
> stable w/ bw support and trunk w/o. I propose to leave 'stable', develop on
> trunk w/ no bw support (except for index format) and back port features "on
> demand" to stable w/ bw support.
>
> So instead of forcing all development to go through stable + trunk, I
> propose to go through trunk, and back port to stable only if requested. In
> the end we'll be in the same position (trunk having all features) except for
> stable which will include just those features of interest to other people.
>
> Shai
>
> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>>
>> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>>
>> > The only downside is that we will need to do everything twice: once on
>> > stable and once on trunk. I still think that most of the issues and
>> > development don't affect bw at all and thus we'll always say "this
>> > needs to go to stable and trunk" which will just be an annoyance and
>> > complicate the life of the developers even more because not only will
>> > we need to keep bw compat, we'll need to write the code for trunk as
>> > well.
>>
>> Well, most things.  Some features (eg flex would've been such a
>> feature) will only happen in trunk.
>>
>> But, yes, this is a downside -- stable changes will have to be merged
>> up to trunk.
>>
>> > What if we always develop on trunk, release it more often, and if
>> > requested or a committer needs it, we backport a certain feature to
>> > stable?
>>
>> This is what we do today, and I think what's broken about it is we are
>> unable to make a big change that has major breaks from the start.
>> Every big change is required to land on trunk with back compat intact.
>>
>> This is terribly costly for changes like the new analyzer API (Token
>> -> AttrSource migration), and flex.
>>
>> So with the new model, a big change like flex could land on trunk with
>> no back compat, and age for a long time, along with other such
>> changes, before being included in a major release.
>>
>> I'm not sure we'll release trunk (major releases) more often.  I think
>> it could go both ways...
>>
>> For small changes, I think whether a given dev works on trunk and
>> merges back to stable, or stable and merges forwards to trunk, is an
>> individual choice...
>>
>> Mike
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

I'd vote -1 on Shai's variation and +1 on Mike's proposal.

I don't think features should be backported to stable on request. If we 
go this route, I think it should be a matter of course unless the 
feature is hairy enough to warrant unstable.

Saying we should do all dev on unstable, and only back port on request 
(who will police that? will everyone accept all requests?) and that we 
should just release trunk more often to accommodate, is like saying: 
let's just throw back compat out the window, every release will be free 
to break back compat, we will just release more often...

Working on two branches won't be 100% joy, but loosening the existing 
much larger annoyance of back compat is not going to be free IMO. To me, 
Shai's proposal is essentially: let's keep everything the same, but 
release more often (we have decided to do that 100 times) and lose back 
compat requirements. Then if a dev takes pity on a user, perhaps one of 
the unstable releases will get a backport of a feature.

If we take that route, I am vehemently against changing our policy.

On 4/22/10 9:52 AM, Shai Erera wrote:
> I was advocating that we always develop on trunk w/ no back-compat 
> support, API-wise ... you could have developed flex w/ no bw support.
>
> Currently what you're proposing would cause most features to be 
> developed on stable w/ bw support and trunk w/o. I propose to leave 
> 'stable', develop on trunk w/ no bw support (except for index format) 
> and back port features "on demand" to stable w/ bw support.
>
> So instead of forcing all development to go through stable + trunk, I 
> propose to go through trunk, and back port to stable only if 
> requested. In the end we'll be in the same position (trunk having all 
> features) except for stable which will include just those features of 
> interest to other people.
>
> Shai
>
> On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless 
> <lucene@mikemccandless.com> wrote:
>
>     On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <serera@gmail.com> wrote:
>
>     > The only downside is that we will need to do everything twice: once on
>     > stable and once on trunk. I still think that most of the issues and
>     > development don't affect bw at all and thus we'll always say "this
>     > needs to go to stable and trunk" which will just be an annoyance and
>     > complicate the life of the developers even more because not only will
>     > we need to keep bw compat, we'll need to write the code for trunk as
>     > well.
>
>     Well, most things.  Some features (eg flex would've been such a
>     feature) will only happen in trunk.
>
>     But, yes, this is a downside -- stable changes will have to be merged
>     up to trunk.
>
>     > What if we always develop on trunk, release it more often, and if
>     > requested or a committer needs it, we backport a certain feature to
>     > stable?
>
>     This is what we do today, and I think what's broken about it is we are
>     unable to make a big change that has major breaks from the start.
>     Every big change is required to land on trunk with back compat intact.
>
>     This is terribly costly for changes like the new analyzer API (Token
>     -> AttrSource migration), and flex.
>
>     So with the new model, a big change like flex could land on trunk with
>     no back compat, and age for a long time, along with other such
>     changes, before being included in a major release.
>
>     I'm not sure we'll release trunk (major releases) more often.  I think
>     it could go both ways...
>
>     For small changes, I think whether a given dev works on trunk and
>     merges back to stable, or stable and merges forwards to trunk, is an
>     individual choice...
>
>     Mike
>
>
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I was advocating that we always develop on trunk w/ no back-compat support,
API-wise ... you could have developed flex w/ no bw support.

Currently what you're proposing would cause most features to be developed on
stable w/ bw support and trunk w/o. I propose to leave 'stable', develop on
trunk w/ no bw support (except for index format) and back port features "on
demand" to stable w/ bw support.

So instead of forcing all development to go through stable + trunk, I
propose to go through trunk, and back port to stable only if requested. In
the end we'll be in the same position (trunk having all features) except for
stable which will include just those features of interest to other people.

Shai

On Thu, Apr 22, 2010 at 4:12 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:
>
> > The only downside is that we will need to do everything twice: once on
> > stable and once on trunk. I still think that most of the issues and
> > development don't affect bw at all and thus we'll always say "this
> > needs to go to stable and trunk" which will just be an annoyance and
> > complicate the life of the developers even more because not only will
> > we need to keep bw compat, we'll need to write the code for trunk as
> > well.
>
> Well, most things.  Some features (eg flex would've been such a
> feature) will only happen in trunk.
>
> But, yes, this is a downside -- stable changes will have to be merged
> up to trunk.
>
> > What if we always develop on trunk, release it more often, and if
> > requested or a committer needs it, we backport a certain feature to
> > stable?
>
> This is what we do today, and I think what's broken about it is we are
> unable to make a big change that has major breaks from the start.
> Every big change is required to land on trunk with back compat intact.
>
> This is terribly costly for changes like the new analyzer API (Token
> -> AttrSource migration), and flex.
>
> So with the new model, a big change like flex could land on trunk with
> no back compat, and age for a long time, along with other such
> changes, before being included in a major release.
>
> I'm not sure we'll release trunk (major releases) more often.  I think
> it could go both ways...
>
> For small changes, I think whether a given dev works on trunk and
> merges back to stable, or stable and merges forwards to trunk, is an
> individual choice...
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Wed, Apr 21, 2010 at 1:56 PM, Shai Erera <se...@gmail.com> wrote:

> The only downside is that we will need to do everything twice: once on
> stable and once on trunk. I still think that most of the issues and
> development don't affect bw at all and thus we'll always say "this
> needs to go to stable and trunk" which will just be an annoyance and
> complicate the life of the developers even more because not only will
> we need to keep bw compat, we'll need to write the code for trunk as
> well.

Well, most things.  Some features (eg flex would've been such a
feature) will only happen in trunk.

But, yes, this is a downside -- stable changes will have to be merged
up to trunk.

> What if we always develop on trunk, release it more often, and if
> requested or a committer needs it, we backport a certain feature to
> stable?

This is what we do today, and I think what's broken about it is we are
unable to make a big change that has major breaks from the start.
Every big change is required to land on trunk with back compat intact.

This is terribly costly for changes like the new analyzer API (Token
-> AttrSource migration), and flex.

So with the new model, a big change like flex could land on trunk with
no back compat, and age for a long time, along with other such
changes, before being included in a major release.

I'm not sure we'll release trunk (major releases) more often.  I think
it could go both ways...

For small changes, I think whether a given dev works on trunk and
merges back to stable, or stable and merges forwards to trunk, is an
individual choice...

Mike


Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Analysis API (like any other public API) definitely should not be
broken within a major release.

I think we should also minimize the API surface area required of
analysis by the indexer & query parser.

Eg, indexer doesn't need to know about an analyzer -- instead, it
should interact with an iterator that provides fields w/ a token
stream (minus close/reset).

Whether the field's content comes from a Reader, a pre-created token stream,
a String, is not-analyzed, etc., is a decision that should live outside of
the indexer...
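
A minimal Java sketch of that separation, to make the idea concrete. Every
name here (FieldTokens, SketchIndexer, WhitespaceField) is hypothetical and
invented for illustration; none of them is an existing Lucene class, and a
plain Iterator<String> merely stands in for a token stream without
close/reset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical types for illustration only; none of these are Lucene classes.
interface FieldTokens {
    String name();
    Iterator<String> tokens();   // stands in for a token stream, minus close/reset
}

// The indexer sees only field names and tokens; it never touches an analyzer.
final class SketchIndexer {
    private final List<String> postings = new ArrayList<>();

    void addDocument(Iterable<FieldTokens> fields) {
        for (FieldTokens field : fields) {
            Iterator<String> it = field.tokens();
            while (it.hasNext()) {
                postings.add(field.name() + ":" + it.next());
            }
        }
    }

    List<String> postings() { return postings; }
}

// One possible source: whitespace-split a String. A Reader, a pre-built stream,
// or a not-analyzed value would be other FieldTokens implementations, all of
// them living outside the indexer.
final class WhitespaceField implements FieldTokens {
    private final String name;
    private final String text;

    WhitespaceField(String name, String text) {
        this.name = name;
        this.text = text;
    }

    public String name() { return name; }

    public Iterator<String> tokens() {
        return Arrays.asList(text.trim().split("\\s+")).iterator();
    }
}
```

With this shape, swapping the analysis side means supplying a different
FieldTokens implementation; the indexer code would not need to change.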

Mike

On Wed, Apr 21, 2010 at 2:32 PM, Mark Miller <ma...@gmail.com> wrote:
> On 4/21/10 2:28 PM, Robert Muir wrote:
>
> On Wed, Apr 21, 2010 at 2:20 PM, Mark Miller <ma...@gmail.com> wrote:
>>
>> What about API back breaks? Seems like an issue when trunk will be free to
>> break. How will you know what versions of analyzers can be used by which
>> versions of Lucene? Just a readme? Are there any guarantees? How will I
>> know when I get locked out of upgrading Lucene because of the analyzer
>> version choice I made?
>
> In my opinion the analysis API should not be backwards broken at least
> within a major release... or else this could prevent someone from using
> analyzers-4.2.jar with lucene-core-4.8.jar.
> In general under this scheme we should be able to avoid backwards breaks
> better I think (e.g. don't backport things to stable that backwards break).
>
> If you want analyzers to actually work across major releases that seems to
> be more challenging, but maybe minimizing the interface between
> analyzers<->queryparser and analyzers<->indexer as much as possible could
> help.
>
> --
> Robert Muir
> rcmuir@gmail.com
>
> That sounds good to me - I'm personally not very worried about back compat
> over a major release either.
>
> - Mark
>


Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/21/10 2:28 PM, Robert Muir wrote:
>
>
> On Wed, Apr 21, 2010 at 2:20 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>
>     What about API back breaks? Seems like an issue when trunk will be
>     free to break. How will you know what versions of analyzers can be
>     used by which versions of Lucene? Just a readme? Are there any
>     guarantees? How will I know when I get locked out of upgrading
>     Lucene because of the analyzer version choice I made?
>
>
> In my opinion the analysis API should not be backwards broken at least 
> within a major release... or else this could prevent someone from 
> using analyzers-4.2.jar with lucene-core-4.8.jar.
> In general under this scheme we should be able to avoid backwards 
> breaks better I think (e.g. don't backport things to stable that 
> backwards break).
>
> If you want analyzers to actually work across major releases that 
> seems to be more challenging, but maybe minimizing the interface 
> between analyzers<->queryparser and analyzers<->indexer as much as 
> possible could help.
>
> -- 
> Robert Muir
> rcmuir@gmail.com

That sounds good to me - I'm personally not very worried about back 
compat over a major release either.

- Mark

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Wed, Apr 21, 2010 at 2:20 PM, Mark Miller <ma...@gmail.com> wrote:

>
> What about API back breaks? Seems like an issue when trunk will be free to
> break. How will you know what versions of analyzers can be used by which
> versions of Lucene? Just a readme? Are there any guarantees? How will I
> know when I get locked out of upgrading Lucene because of the analyzer
> version choice I made?
>
>
In my opinion the analysis API should not be backwards broken at least
within a major release... or else this could prevent someone from using
analyzers-4.2.jar with lucene-core-4.8.jar.
In general under this scheme we should be able to avoid backwards breaks
better I think (e.g. don't backport things to stable that backwards break).

If you want analyzers to actually work across major releases that seems to
be more challenging, but maybe minimizing the interface between
analyzers<->queryparser and analyzers<->indexer as much as possible could
help.
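
As a rough illustration of the "same major release" rule, here is a small
Java sketch. JarCompat and its methods are hypothetical helpers made up for
this example, not part of Lucene, and it assumes version strings of the form
"major.minor" as in the jar names above.

```java
// Hypothetical helper, not part of Lucene: two artifacts are treated as
// API-compatible when their major version numbers match.
final class JarCompat {
    private JarCompat() {}

    // Extracts the major number from a version such as "4.2" or "4".
    static int major(String version) {
        int dot = version.indexOf('.');
        return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
    }

    // True when both versions belong to the same major release line.
    static boolean sameMajor(String analyzersVersion, String coreVersion) {
        return major(analyzersVersion) == major(coreVersion);
    }
}
```

Under that rule, JarCompat.sameMajor("4.2", "4.8") would accept the
analyzers-4.2.jar / lucene-core-4.8.jar pairing, while a 3.x analyzers jar
against a 4.x core would be rejected.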

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

+1 for developing in a single place (trunk) and backporting on an on-demand basis.

The other points are fine.

On Wed, Apr 21, 2010 at 21:56, Shai Erera <se...@gmail.com> wrote:
> So basically, API-wise, the stable branch will remain like it is
> today: API changes under deprecation path, bw breaks as long as they
> are documented in CHANGES etc. Trunk will be allowed to change the API
> as it sees fit (but still document the changes in CHANGES).
>
> Index-format wise, we adopt Doron's proposal of the 3 support levels
> for trunk (for stable it's always L1).
>
> The only downside is that we will need to do everything twice: once on
> stable and once on trunk. I still think that most of the issues and
> development don't affect bw at all and thus we'll always say "this
> needs to go to stable and trunk" which will just be an annoyance and
> complicate the life of the developers even more because not only will
> we need to keep bw compat, we'll need to write the code for trunk as
> well.
>
> What if we always develop on trunk, release it more often, and if
> requested or a committer needs it, we backport a certain feature to
> stable? That way, stable includes really what's been specifically
> needed and trunk gets the latest and greatest API and features set.
> Since we still keep index format bw (pending levels) most apps should
> not have any problem upgrading to a latest major, given that they
> adapt to the new API …
>
> Shai
>
> On Wednesday, April 21, 2010, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>> Trying to summarize what we seem to be roughly converging to, here:
>>
>>   * Up front: consolidate all Solr core, Lucene core, contrib
>>     analyzers into one place (contrib/analyzers).  Don't use Version
>>     in there; instead, the released JAR is versioned.  The app picks
>>     its required version compatibility by picking the right analyzers
>>     JAR to use.
>>
>>   * Switch to two active branches for ongoing development (stable &
>>     trunk) -- stable gets features/bug fixes that are low risk / don't
>>     change (too many?) APIs.  We make minor releases off of stable
>>     (3.0, 3.1, 3.2, and possibly also bug-fix only .Z release like
>>     3.0.1), while trunk has ongoing non-back-compatible changes.
>>     Development should be active on both.
>>
>>   * Maybe release 3.1 today, by branching off trunk before flex
>>     landed, maybe minus a few changes.  This would be the start of the
>>     stable branch for 3.x releases, and trunk becomes experimental.
>>
>>   * Index compatibility on a major release falls into 3 levels:
>>
>>      - Level 1: index is read/written "live" (this is what we have
>>        today).
>>
>>      - Level 2: we provide a migration tool (this is what we'd do for
>>        the flex changes), to carry the index forward.
>>
>>      - Level 3: app must re-index.
>>
>>     I would expect level 3 to be very rare.... (it's never happened so
>>     far!).
>>
>> Does this sound right?  Objections?
>>
>> Mike
>>
>> On Mon, Apr 19, 2010 at 4:37 AM, Shai Erera <se...@gmail.com> wrote:
>>> The 'unless' part is good and in place IMO. Certainly, if sometime in the
>>> future Lucene moves away from segmented indexing approach into something
>>> else, I wouldn't expect a migration tool to be introduced. So overhauling
>>> index file format might be ok to go w/o any migration tool introduced.
>>>
>>> But I think we tend to take it as a "all or nothing" deal. When the index
>>> file format changed (and Mike can correct me if I'm wrong), it usually
>>> didn't introduce such overhauling changes. For example, "flex scoring" - we
>>> can say that a flex-scoring index should read older indexes, and if one did
>>> not want to take advantage of that feature, one should not have to reindex just
>>> because he upgraded to Lucene 4.0 for other reasons. I believe that should
>>> be relatively easy to support.
>>> And I think that people understand that they cannot take advantage of a new
>>> feature until they reindex (features like flex-scoring, numeric queries
>>> etc.) -- I just think we shouldn't *force* them to reindex just because the
>>> indexing code now 'expects' those files/terms to be in place for regular
>>> indexing behavior (unrelated to those advanced features).
>>>
>>> I'm +1 for that proposal.
>>>
>>> Shai
>>>
>>> On Mon, Apr 19, 2010 at 11:28 AM, Doron Cohen <cd...@gmail.com> wrote:
>>>>
>>>> Late joining... could we agree on an "intention" to provide an index
>>>> migration tool when/if format back comp. has to be broken? It is not clear
>>>> to me that this was agreed... So here is a suggestion for a revised index
>>>> format backwards compatibility policy:
>>>> Starting release 4.0, Lucene has a limited file formats back-compatibility
>>>> between major versions, falling into one of the three possible levels:
>>>> (Level 1) When possible, version X would be able to read indexes
>>>> generated by any X-1 version after and including version X-1.0. (Level 2) If
>>>> version X cannot read indexes of version X-1, the release of version X would
>>>> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
>>>> 3) the nature of the specific change does not allow for the development of
>>>> such a migration tool. For the exact level of file back compatibility of a
>>>> release see the specific release notes.
>>>> Not sure if the "unless" part (no. 3) would ever materialize, but I think
>>>> it provides a required freedom.
>>>> Doron
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

So basically, API-wise, the stable branch will remain like it is
today: API changes under deprecation path, bw breaks as long as they
are documented in CHANGES etc. Trunk will be allowed to change the API
as it sees fit (but still document the changes in CHANGES).

Index-format wise, we adopt Doron's proposal of the 3 support levels
for trunk (for stable it's always L1).
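
To spell out how I read the three levels, here is a small Java sketch. The
names IndexCompatLevel and BranchPolicy are hypothetical and exist only for
this illustration; nothing like them is in Lucene today.

```java
// Hypothetical model of the three proposed levels; not an existing Lucene API.
enum IndexCompatLevel {
    LIVE,            // Level 1: the new version reads/writes the old index directly
    MIGRATION_TOOL,  // Level 2: a tool carries the index forward
    REINDEX;         // Level 3: the app must re-index

    // What an application has to do when upgrading across a major release.
    String upgradeAction() {
        switch (this) {
            case LIVE:
                return "open the existing index as-is";
            case MIGRATION_TOOL:
                return "run the migration tool, then open the index";
            default:
                return "re-index from the original content";
        }
    }
}

final class BranchPolicy {
    private BranchPolicy() {}

    // Stable always stays at Level 1; trunk may declare any level per major release.
    static boolean allowed(String branch, IndexCompatLevel level) {
        return !"stable".equals(branch) || level == IndexCompatLevel.LIVE;
    }
}
```

So BranchPolicy.allowed("stable", IndexCompatLevel.REINDEX) is false, while
the same level on "trunk" is permitted and would be called out in the
release notes.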

The only downside is that we will need to do everything twice: once on
stable and once on trunk. I still think that most of the issues and
development don't affect bw at all and thus we'll always say "this
needs to go to stable and trunk" which will just be an annoyance and
complicate the life of the developers even more because not only will
we need to keep bw compat, we'll need to write the code for trunk as
well.

What if we always develop on trunk, release it more often, and if
requested or a committer needs it, we backport a certain feature to
stable? That way, stable includes really what's been specifically
needed and trunk gets the latest and greatest API and features set.
Since we still keep index format bw (pending levels) most apps should
not have any problem upgrading to a latest major, given that they
adapt to the new API …

Shai

On Wednesday, April 21, 2010, Michael McCandless
<lu...@mikemccandless.com> wrote:
> Trying to summarize what we seem to be roughly converging to, here:
>
>   * Up front: consolidate all Solr core, Lucene core, contrib
>     analyzers into one place (contrib/analyzers).  Don't use Version
>     in there; instead, the released JAR is versioned.  The app picks
>     its required version compatibility by picking the right analyzers
>     JAR to use.
>
>   * Switch to two active branches for ongoing development (stable &
>     trunk) -- stable gets features/bug fixes that are low risk / don't
>     change (too many?) APIs.  We make minor releases off of stable
>     (3.0, 3.1, 3.2, and possibly also bug-fix only .Z release like
>     3.0.1), while trunk has ongoing non-back-compatible changes.
>     Development should be active on both.
>
>   * Maybe release 3.1 today, by branching off trunk before flex
>     landed, maybe minus a few changes.  This would be the start of the
>     stable branch for 3.x releases, and trunk becomes experimental.
>
>   * Index compatibility on a major release falls into 3 levels:
>
>      - Level 1: index is read/written "live" (this is what we have
>        today).
>
>      - Level 2: we provide a migration tool (this is what we'd do for
>        the flex changes), to carry the index forward.
>
>      - Level 3: app must re-index.
>
>     I would expect level 3 to be very rare.... (it's never happened so
>     far!).
>
> Does this sound right?  Objections?
>
> Mike
>
> On Mon, Apr 19, 2010 at 4:37 AM, Shai Erera <se...@gmail.com> wrote:
>> The 'unless' part is good and in place IMO. Certainly, if sometime in the
>> future Lucene moves away from segmented indexing approach into something
>> else, I wouldn't expect a migration tool to be introduced. So overhauling
>> index file format might be ok to go w/o any migration tool introduced.
>>
>> But I think we tend to take it as a "all or nothing" deal. When the index
>> file format changed (and Mike can correct me if I'm wrong), it usually
>> didn't introduce such overhauling changes. For example, "flex scoring" - we
>> can say that a flex-scoring index should read older indexes, and if one did
>> not want to take advantage of that feature, one should not have to reindex just
>> because he upgraded to Lucene 4.0 for other reasons. I believe that should
>> be relatively easy to support.
>> And I think that people understand that they cannot take advantage of a new
>> feature until they reindex (features like flex-scoring, numeric queries
>> etc.) -- I just think we shouldn't *force* them to reindex just because the
>> indexing code now 'expects' those files/terms to be in place for regular
>> indexing behavior (unrelated to those advanced features).
>>
>> I'm +1 for that proposal.
>>
>> Shai
>>
>> On Mon, Apr 19, 2010 at 11:28 AM, Doron Cohen <cd...@gmail.com> wrote:
>>>
>>> Late joining... could we agree on an "intention" to provide an index
>>> migration tool when/if format back comp. has to be broken? It is not clear
>>> to me that this was agreed... So here is a suggestion for a revised index
>>> format backwards compatibility policy:
>>> Starting release 4.0, Lucene has a limited file formats back-compatibility
>>> between major versions, falling into one of the three possible levels:
>>> (Level 1) When possible, version X would be able to read indexes
>>> generated by any X-1 version after and including version X-1.0. (Level 2) If
>>> version X cannot read indexes of version X-1, the release of version X would
>>> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
>>> 3) the nature of the specific change does not allow for the development of
>>> such a migration tool. For the exact level of file back compatibility of a
>>> release see the specific release notes.
>>> Not sure if the "unless" part (no. 3) would ever materialize, but I think
>>> it provides a required freedom.
>>> Doron
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

On 4/21/10 11:25 AM, Michael McCandless wrote:
> Trying to summarize what we seem to be roughly converging to, here:
>
>    * Up front: consolidate all Solr core, Lucene core, contrib
>      analyzers into one place (contrib/analyzers).  Don't use Version
>      in there; instead, the released JAR is versioned.  The app picks
>      its required version compatibility by picking the right analyzers
>      JAR to use.
>    
What about API back-compat breaks? That seems like an issue once trunk is
free to break. How will you know which versions of the analyzers can be used
with which versions of Lucene? Just a readme? Are there any guarantees? How 
will I know when I get locked out of upgrading Lucene because of the 
analyzer version choice I made?

>    * Switch to two active branches for ongoing development (stable&
>      trunk) -- stable gets features/bug fixes that are low risk / don't
>      change (too many?) APIs.  We make minor releases off of stable
>      (3.0, 3.1, 3.2, and possibly also bug-fix only .Z release like
>      3.0.1), while trunk has ongoing non-back-compatible changes.
>      Development should be active on both.
>
>    * Maybe release 3.1 today, by branching off trunk before flex
>      landed, maybe minus a few changes.  This would be the start of the
>      stable branch for 3.x releases, and trunk becomes experimental.
>
>    * Index compatibility on a major release falls into 3 levels:
>
>       - Level 1: index is read/written "live" (this is what we have
>         today).
>
>       - Level 2: we provide a migration tool (this is what we'd do for
>         the flex changes), to carry the index forward.
>
>       - Level 3: app must re-index.
>
>      I would expect level 3 to be very rare.... (it's never happened so
>      far!).
>
> Does this sound right?  Objections?
>
> Mike
>    
Sounds good to me - but I still think it should include that stable is 
still the default dev branch - with trunk used for cases where it is needed.

- Mark

> On Mon, Apr 19, 2010 at 4:37 AM, Shai Erera<se...@gmail.com>  wrote:
>    
>> The 'unless' part is good and in place IMO. Certainly, if sometime in the
>> future Lucene moves away from the segmented indexing approach into something
>> else, I wouldn't expect a migration tool to be introduced. So overhauling the
>> index file format might be OK to go w/o any migration tool introduced.
>>
>> But I think we tend to take it as an "all or nothing" deal. When the index
>> file format changed (and Mike can correct me if I'm wrong), it usually
>> didn't introduce such overhauling changes. For example, "flex scoring" - we
>> can say that a flex-scoring index should read older indexes, and if one did
>> not want to take advantage of that feature, one should not have to reindex just
>> because he upgraded to Lucene 4.0 for other reasons. I believe that should
>> be relatively easy to support.
>> And I think that people understand that they cannot take advantage of a new
>> feature until they reindex (features like flex-scoring, numeric queries
>> etc.) -- I just think we shouldn't *force* them to reindex just because the
>> indexing code now 'expects' those files/terms to be in place for regular
>> indexing behavior (unrelated to those advanced features).
>>
>> I'm +1 for that proposal.
>>
>> Shai
>>
>> On Mon, Apr 19, 2010 at 11:28 AM, Doron Cohen<cd...@gmail.com>  wrote:
>>      
>>> Late joining... could we agree on an "intention" to provide an index
>>> migration tool when/if format back comp. has to be broken? It is not clear
>>> to me that this was agreed... So here is a suggestion for a revised index
>>> format backwards compatibility policy:
>>> Starting release 4.0, Lucene has a limited file formats back-compatibility
>>> between major versions, falling into one of the three possible levels:
>>> (Level 1) When possible, Version X would be able to read indexes
>>> generated by any X-1 version after and including version X-1.0. (Level 2) If
>>> version X cannot read indexes of version X-1, the release of version X would
>>> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
>>> 3) the nature of the specific change does not allow for the development of
>>> such a migration tool. For the exact level of file back compatibility of a
>>> release see the specific release notes.
>>> Not sure if the "unless" part (no. 3) would ever materialize, but I think
>>> it provides a required freedom.
>>> Doron
>>>        
>>      
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>    



Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Trying to summarize what we seem to be roughly converging to, here:

  * Up front: consolidate all Solr core, Lucene core, contrib
    analyzers into one place (contrib/analyzers).  Don't use Version
    in there; instead, the released JAR is versioned.  The app picks
    its required version compatibility by picking the right analyzers
    JAR to use.

  * Switch to two active branches for ongoing development (stable &
    trunk) -- stable gets features/bug fixes that are low risk / don't
    change (too many?) APIs.  We make minor releases off of stable
    (3.0, 3.1, 3.2, and possibly also bug-fix only .Z release like
    3.0.1), while trunk has ongoing non-back-compatible changes.
    Development should be active on both.

  * Maybe release 3.1 today, by branching off trunk before flex
    landed, maybe minus a few changes.  This would be the start of the
    stable branch for 3.x releases, and trunk becomes experimental.

  * Index compatibility on a major release falls into 3 levels:

     - Level 1: index is read/written "live" (this is what we have
       today).

     - Level 2: we provide a migration tool (this is what we'd do for
       the flex changes), to carry the index forward.

     - Level 3: app must re-index.

    I would expect level 3 to be very rare.... (it's never happened so
    far!).

Does this sound right?  Objections?

Mike

On Mon, Apr 19, 2010 at 4:37 AM, Shai Erera <se...@gmail.com> wrote:
> The 'unless' part is good and in place IMO. Certainly, if sometime in the
> future Lucene moves away from the segmented indexing approach into something
> else, I wouldn't expect a migration tool to be introduced. So overhauling the
> index file format might be OK to go w/o any migration tool introduced.
>
> But I think we tend to take it as an "all or nothing" deal. When the index
> file format changed (and Mike can correct me if I'm wrong), it usually
> didn't introduce such overhauling changes. For example, "flex scoring" - we
> can say that a flex-scoring index should read older indexes, and if one did
> not want to take advantage of that feature, one should not have to reindex just
> because he upgraded to Lucene 4.0 for other reasons. I believe that should
> be relatively easy to support.
> And I think that people understand that they cannot take advantage of a new
> feature until they reindex (features like flex-scoring, numeric queries
> etc.) -- I just think we shouldn't *force* them to reindex just because the
> indexing code now 'expects' those files/terms to be in place for regular
> indexing behavior (unrelated to those advanced features).
>
> I'm +1 for that proposal.
>
> Shai
>
> On Mon, Apr 19, 2010 at 11:28 AM, Doron Cohen <cd...@gmail.com> wrote:
>>
>> Late joining... could we agree on an "intention" to provide an index
>> migration tool when/if format back comp. has to be broken? It is not clear
>> to me that this was agreed... So here is a suggestion for a revised index
>> format backwards compatibility policy:
>> Starting release 4.0, Lucene has a limited file formats back-compatibility
>> between major versions, falling into one of the three possible levels:
>> (Level 1) When possible, Version X would be able to read indexes
>> generated by any X-1 version after and including version X-1.0. (Level 2) If
>> version X cannot read indexes of version X-1, the release of version X would
>> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
>> 3) the nature of the specific change does not allow for the development of
>> such a migration tool. For the exact level of file back compatibility of a
>> release see the specific release notes.
>> Not sure if the "unless" part (no. 3) would ever materialize, but I think
>> it provides a required freedom.
>> Doron
>


Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

The 'unless' part is good and in place IMO. Certainly, if sometime in the
future Lucene moves away from the segmented indexing approach into something
else, I wouldn't expect a migration tool to be introduced. So overhauling the
index file format might be OK to go w/o any migration tool introduced.

But I think we tend to take it as an "all or nothing" deal. When the index
file format changed (and Mike can correct me if I'm wrong), it usually
didn't introduce such overhauling changes. For example, "flex scoring" - we
can say that a flex-scoring index should read older indexes, and if one did
not want to take advantage of that feature, one should not have to reindex just
because he upgraded to Lucene 4.0 for other reasons. I believe that should
be relatively easy to support.
And I think that people understand that they cannot take advantage of a new
feature until they reindex (features like flex-scoring, numeric queries
etc.) -- I just think we shouldn't *force* them to reindex just because the
indexing code now 'expects' those files/terms to be in place for regular
indexing behavior (unrelated to those advanced features).

I'm +1 for that proposal.

Shai

On Mon, Apr 19, 2010 at 11:28 AM, Doron Cohen <cd...@gmail.com> wrote:

> Late joining... could we agree on an "intention" to provide an index
> migration tool when/if format back comp. has to be broken? It is not clear
> to me that this was agreed... So here is a suggestion for a revised index
> format backwards compatibility policy:
>
> Starting release 4.0, Lucene has a limited file formats back-compatibility
> between major versions, falling into one of the three possible levels:
> (Level 1) When possible, Version X would be able to read indexes
> generated by any X-1 version after and including version X-1.0. (Level 2) If
> version X cannot read indexes of version X-1, the release of version X would
> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
> 3) the nature of the specific change does not allow for the development of
> such a migration tool. For the exact level of file back compatibility of a
> release see the specific release notes.
>
> Not sure if the "unless" part (no. 3) would ever materialize, but I think
> it provides a required freedom.
>
> Doron
>

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

I like these separate levels, to characterize index compatibility.

As far as I know we've never had a level 3 major release :)

Mike

On Mon, Apr 19, 2010 at 4:28 AM, Doron Cohen <cd...@gmail.com> wrote:
> Late joining... could we agree on an "intention" to provide an index
> migration tool when/if format back comp. has to be broken? It is not clear
> to me that this was agreed... So here is a suggestion for a revised index
> format backwards compatibility policy:
> Starting release 4.0, Lucene has a limited file formats back-compatibility
> between major versions, falling into one of the three possible levels:
> (Level 1) When possible, Version X would be able to read indexes
> generated by any X-1 version after and including version X-1.0. (Level 2) If
> version X cannot read indexes of version X-1, the release of version X would
> be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
> 3) the nature of the specific change does not allow for the development of
> such a migration tool. For the exact level of file back compatibility of a
> release see the specific release notes.
> Not sure if the "unless" part (no. 3) would ever materialize, but I think it
> provides a required freedom.
> Doron


Re: Proposal about Version API "relaxation"

Posted by Doron Cohen <cd...@gmail.com>.

Late joining... could we agree on an "intention" to provide an index
migration tool when/if format back comp. has to be broken? It is not clear
to me that this was agreed... So here is a suggestion for a revised index
format backwards compatibility policy:

Starting with release 4.0, Lucene has limited file-format back-compatibility
between major versions, falling into one of three possible levels:
(Level 1) When possible, version X would be able to read indexes
generated by any X-1 version after and including version X-1.0. (Level 2) If
version X cannot read indexes of version X-1, the release of version X would
be accompanied by a tool for migrating indexes from X-1 to X, unless (Level
3) the nature of the specific change does not allow for the development of
such a migration tool. For the exact level of file back-compatibility of a
release, see the specific release notes.

Not sure if the "unless" part (no. 3) would ever materialize, but I think it
provides a required freedom.
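
The three levels above can be read as a simple decision rule. The Java
sketch below is only an illustration of that rule: CompatLevel,
IndexCompatPolicy and both method names are hypothetical and are not part
of any Lucene API, and the two boolean inputs stand in for what the release
notes of a given major version would state.

```java
// Hypothetical sketch of the proposed policy; none of these names exist in Lucene.
enum CompatLevel {
    READ_LIVE,       // Level 1: version X reads X-1 indexes directly
    MIGRATION_TOOL,  // Level 2: release ships a tool to migrate X-1 indexes to X
    REINDEX          // Level 3: no migration tool is possible, app must re-index
}

final class IndexCompatPolicy {
    private IndexCompatPolicy() {}

    /**
     * Picks the level for a major release.
     *
     * @param canReadPrevious   assumed input: version X can read X-1 indexes
     * @param migrationFeasible assumed input: a migration tool can be developed
     */
    static CompatLevel levelFor(boolean canReadPrevious, boolean migrationFeasible) {
        if (canReadPrevious) {
            return CompatLevel.READ_LIVE;
        }
        if (migrationFeasible) {
            return CompatLevel.MIGRATION_TOOL;
        }
        return CompatLevel.REINDEX;
    }

    /** What an application would have to do on upgrade, per level. */
    static String upgradeAdvice(CompatLevel level) {
        switch (level) {
            case READ_LIVE:
                return "open the existing index as-is";
            case MIGRATION_TOOL:
                return "run the migration tool, then open the index";
            default:
                return "re-index from the original content";
        }
    }
}
```

Level 3 is reached only when both inputs are false, which matches the
"unless" wording: re-indexing is the fallback, not a choice of its own.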

Doron

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Fri, Apr 16, 2010 at 1:00 PM, Robert Muir <rc...@gmail.com> wrote:

> And I think backwards compatibility should be more community-driven instead of a "policy". If no one wants to put things in a stable branch I really do think that's a sign of something (mostly that it's not as important as you seem to think)

I also suspect many devs will want to work on the stable branch,
because the changes are released much more frequently.  Ie, we should
in general cut a 3.1, 3.2, 3.3, etc. much sooner than a 4.0.

I think the default should be that a new feature is done on the stable
branch unless it's going to break APIs / require too much work for
back compat.  It'd be a judgement call on each feature, and we'll
obviously have to see how things pan out over time, but I would expect
a lot of the work to happen on the stable branch...

Mike


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Fri, Apr 16, 2010 at 12:45 PM, Mark Miller <ma...@gmail.com> wrote:

>
> It's not a sign that users don't care about it. Lately I think you have
> taken the stance, users be damned, Lucene dev should just be geared towards
> devs. I'm not a fan of that kind of attitude when it comes to Lucene dev
> myself.
>
>
It's OK that we disagree too.

I just think users aren't babies; they can make some of their own
decisions, like which analyzers jar file to use.

And I think backwards compatibility should be more community-driven instead
of a "policy". If no one wants to put things in a stable branch I really do
think that's a sign of something (mostly that it's not as important as you
seem to think).

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.


On 04/16/2010 12:16 PM, Robert Muir wrote:
>
>
> On Fri, Apr 16, 2010 at 12:12 PM, Mark Miller <markrmiller@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     I'd be for this plan if I really thought the stable branch would
>     get similar attention to the experimental branch - but I have some
>     doubts about that. It's a fairly small dev community in comparison
>     to other projects that do this ...
>
>     Dev on the experimental latest greatest fun branch, or the more in
>     the past, back compat hassle stable branch? Port most patches to
>     two somewhat diverging code bases?
>
>     If that was actually how things worked out, I'd be +1. I just
>     wonder ... with the right framing I do think it's possible though.
>
>
> But this is an open source project still right? So if you want more 
> attention paid to the stable branch, put your patches where your mouth 
> is (no offense).

I don't think that's how things should work. The project should be 
framed to guide devs towards what's best for everybody. Right now all 
devs work on a stable branch because we have policies that encourage 
that. We could also make policies that encourage every-dev-for-himself 
crap development.

>
> If no one wants to put new features in the back-compat hassle branch, 
> well, then that's a sign that no one cares about it.

It's not a sign that users don't care about it. Lately I think you have 
taken the stance, users be damned, Lucene dev should just be geared 
towards devs. I'm not a fan of that kind of attitude when it comes to 
Lucene dev myself.

>
>
> -- 
> Robert Muir
> rcmuir@gmail.com <ma...@gmail.com>


-- 
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Fri, Apr 16, 2010 at 12:12 PM, Mark Miller <ma...@gmail.com> wrote:

> I'd be for this plan if I really thought the stable branch would get
> similar attention to the experimental branch - but I have some doubts about
> that. It's a fairly small dev community in comparison to other projects that
> do this ...
>
> Dev on the experimental latest greatest fun branch, or the more in the
> past, back compat hassle stable branch? Port most patches to two somewhat
> diverging code bases?
>
> If that was actually how things worked out, I'd be +1. I just wonder ...
> with the right framing I do think it's possible though.
>
>
But this is still an open source project, right? So if you want more
attention paid to the stable branch, put your patches where your mouth is
(no offense).

If no one wants to put new features in the back-compat hassle branch, well,
then that's a sign that no one cares about it.


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

I agree that's a risk... but this being open source, I think it'd balance out?

So eg right now I'm looking @ speeding up PhraseQuery (thanks to
Robert's prodding ;) ).  These changes are all under the hood, so, I
would do this on the stable branch.  There's no reason not to.

There are also good incentives to do something on stable branch, eg,
the feature will see the light of day sooner than it will on
experimental.  For people working on features that their "sponsors"
really need, doing so on the stable branch means they are available in
a real release sooner.

And by having a "safe" place for experimental stuff to land, we save
the risk of destabilizing the stable branch, that we face today.  And
we would no longer need to go to herculean efforts (like Uwe's back
compat layer for analysis).  Flex would've finished much sooner too :)

Mike

On Fri, Apr 16, 2010 at 12:12 PM, Mark Miller <ma...@gmail.com> wrote:
> I'd be for this plan if I really thought the stable branch would get similar
> attention to the experimental branch - but I have some doubts about that.
> It's a fairly small dev community in comparison to other projects that do
> this ...
>
> Dev on the experimental latest greatest fun branch, or the more in the past,
> back compat hassle stable branch? Port most patches to two somewhat
> diverging code bases?
>
> If that was actually how things worked out, I'd be +1. I just wonder ...
> with the right framing I do think it's possible though.
>
>
> On 04/16/2010 11:45 AM, Michael McCandless wrote:
>>
>> Getting back to the stable/experimental branches...
>>
>> I think, with separate stable&  experimental branches, development
>> would/should be active on both branches.  It'd depend on the
>> feature...
>>
>> Eg today we'd have 3.x stable branch and the experimental branch
>> (= trunk).
>>
>> Small features, bug fixes, would be ported to both branches.  I think
>> features that deprecate some APIs would still be fine on the stable
>> branch.  Major changes (eg flex) would only be done on the
>> experimental branch.
>>
>> This empowers us on a feature by feature case to decide whether it'll
>> be in the stable release or not.  The stable branch releases would
>> be 3.0, 3.1, etc., but we could still do the .Z releases (3.0.1,
>> 3.0.2) for bug fixes, if we need to.
>>
>> And we could do alpha releases off the experimental branch as we think
>> we're getting close to cutting a new stable release (4.0).
>>
>> Mike
>>
>> On Thu, Apr 15, 2010 at 6:58 PM, Robert Muir<rc...@gmail.com>  wrote:
>>
>>>
>>> On Thu, Apr 15, 2010 at 6:50 PM, DM Smith<dm...@gmail.com>  wrote:
>>>
>>>>
>>>> Robert has already started one. (1488 I think).
>>>>
>>>
>>> and it could work with this new scheme... because then you could use an
>>> older icu jar file with an older lucene-analyzer-icu.jar or whatever and
>>> you have it more under control.
>>> under the "existing scheme" you can't really improve back compat with ICU,
>>> because they make API changes and backwards breaks and such themselves,
>>> so you can't make one "Tokenizer", say, that does anything meaningful
>>> that works with all versions of it...
>>> but it would be cool to say: here is lucene-analyzer-icu-4.0.jar that
>>> works with icu 4.4, and you could keep using that as long as you have to
>>> (meanwhile trunk could start using icu 4.6)
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

I'd be for this plan if I really thought the stable branch would get 
similar attention to the experimental branch - but I have some doubts 
about that. It's a fairly small dev community in comparison to other 
projects that do this ...

Dev on the experimental latest greatest fun branch, or the more in the 
past, back compat hassle stable branch? Port most patches to two 
somewhat diverging code bases?

If that was actually how things worked out, I'd be +1. I just wonder ... 
with the right framing I do think it's possible though.


On 04/16/2010 11:45 AM, Michael McCandless wrote:
> Getting back to the stable/experimental branches...
>
> I think, with separate stable&  experimental branches, development
> would/should be active on both branches.  It'd depend on the
> feature...
>
> Eg today we'd have 3.x stable branch and the experimental branch
> (= trunk).
>
> Small features, bug fixes, would be ported to both branches.  I think
> features that deprecate some APIs would still be fine on the stable
> branch.  Major changes (eg flex) would only be done on the
> experimental branch.
>
> This empowers us on a feature by feature case to decide whether it'll
> be in the stable release or not.  The stable branch releases would
> be 3.0, 3.1, etc., but we could still do the .Z releases (3.0.1,
> 3.0.2) for bug fixes, if we need to.
>
> And we could do alpha releases off the experimental branch as we think
> we're getting close to cutting a new stable release (4.0).
>
> Mike
>
> On Thu, Apr 15, 2010 at 6:58 PM, Robert Muir<rc...@gmail.com>  wrote:
>    
>> On Thu, Apr 15, 2010 at 6:50 PM, DM Smith<dm...@gmail.com>  wrote:
>>      
>>>
>>> Robert has already started one. (1488 I think).
>>>        
>> and it could work with this new scheme... because then you could use an
>> older icu jar file with an older lucene-analyzer-icu.jar or whatever and you
>> have it more under control.
>> under the "existing scheme" you can't really improve back compat with ICU,
>> because they make API changes and backwards breaks and such themselves, so
>> you can't make one "Tokenizer", say, that does anything meaningful that works
>> with all versions of it...
>> but it would be cool to say: here is lucene-analyzer-icu-4.0.jar that works
>> with icu 4.4, and you could keep using that as long as you have to
>> (meanwhile trunk could start using icu 4.6)
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>>      
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>    


-- 
- Mark

http://www.lucidimagination.com





Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Getting back to the stable/experimental branches...

I think, with separate stable & experimental branches, development
would/should be active on both branches.  It'd depend on the
feature...

Eg today we'd have 3.x stable branch and the experimental branch
(= trunk).

Small features, bug fixes, would be ported to both branches.  I think
features that deprecate some APIs would still be fine on the stable
branch.  Major changes (eg flex) would only be done on the
experimental branch.
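
As a rough illustration of the rule in the paragraph above, here is a small
Java sketch. Branch, ChangeKind and BranchPolicy are made-up names, and the
sketch assumes that a feature which deprecates APIs is ported to both
branches just like a small feature; the real call would remain a per-feature
judgement.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names, used only to illustrate the proposed branching rule.
enum Branch { STABLE, EXPERIMENTAL }

enum ChangeKind { BUG_FIX, SMALL_FEATURE, DEPRECATING_FEATURE, MAJOR_CHANGE }

final class BranchPolicy {
    private BranchPolicy() {}

    /** Returns the branches a change of the given kind would be committed to. */
    static Set<Branch> targetsFor(ChangeKind kind) {
        switch (kind) {
            case MAJOR_CHANGE:
                // e.g. flex: breaks APIs, so experimental (trunk) only
                return EnumSet.of(Branch.EXPERIMENTAL);
            default:
                // bug fixes, small features, deprecations: ported to both
                return EnumSet.allOf(Branch.class);
        }
    }
}
```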

This empowers us on a feature by feature case to decide whether it'll
be in the stable release or not.  The stable branch releases would
be 3.0, 3.1, etc., but we could still do the .Z releases (3.0.1,
3.0.2) for bug fixes, if we need to.

And we could do alpha releases off the experimental branch as we think
we're getting close to cutting a new stable release (4.0).

Mike

On Thu, Apr 15, 2010 at 6:58 PM, Robert Muir <rc...@gmail.com> wrote:
>
> On Thu, Apr 15, 2010 at 6:50 PM, DM Smith <dm...@gmail.com> wrote:
>>
>>
>> Robert has already started one. (1488 I think).
>
> and it could work with this new scheme... because then you could use an
> older icu jar file with an older lucene-analyzer-icu.jar or whatever and you
> have it more under control.
> under the "existing scheme" you can't really improve back compat with ICU,
> because they make API changes and backwards breaks and such themselves, so
> you can't make one "Tokenizer", say, that does anything meaningful that works
> with all versions of it...
> but it would be cool to say: here is lucene-analyzer-icu-4.0.jar that works
> with icu 4.4, and you could keep using that as long as you have to
> (meanwhile trunk could start using icu 4.6)
>
> --
> Robert Muir
> rcmuir@gmail.com
>


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 15, 2010 at 6:50 PM, DM Smith <dm...@gmail.com> wrote:

>
>
> Robert has already started one. (1488 I think).
>

and it could work with this new scheme... because then you could use an
older icu jar file with an older lucene-analyzer-icu.jar or whatever and you
have it more under control.

under the "existing scheme" you can't really improve back compat with ICU,
because they make API changes and backwards breaks and such themselves, so
you can't make one "Tokenizer", say, that does anything meaningful that works
with all versions of it...

but it would be cool to say: here is lucene-analyzer-icu-4.0.jar that works
with icu 4.4, and you could keep using that as long as you have to
(meanwhile trunk could start using icu 4.6)
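
A minimal Java sketch of what such a pairing could look like if it were
written down as data rather than only in a readme. AnalyzerJarCompat and
its methods are hypothetical; the single table entry is the one pairing
named above (lucene-analyzer-icu-4.0.jar with icu 4.4), and a real table
would have to come from the release notes of each analyzers JAR.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; not part of Lucene or ICU.
final class AnalyzerJarCompat {
    private AnalyzerJarCompat() {}

    // Analyzer JAR name -> ICU version it was built against.
    // Only the pairing mentioned in the thread is listed here.
    private static final Map<String, String> REQUIRED_ICU =
            Map.of("lucene-analyzer-icu-4.0.jar", "4.4");

    /** ICU version a given analyzers JAR needs, if the JAR is known. */
    static Optional<String> requiredIcu(String analyzerJar) {
        return Optional.ofNullable(REQUIRED_ICU.get(analyzerJar));
    }

    /** True only when the JAR is known and the ICU version matches exactly. */
    static boolean isCompatible(String analyzerJar, String icuVersion) {
        return requiredIcu(analyzerJar)
                .map(required -> required.equals(icuVersion))
                .orElse(false);
    }
}
```

An unknown JAR is treated as incompatible rather than guessed at, so an
application pinned to an older analyzers JAR would see a clear mismatch
if the ICU jar on its classpath were upgraded.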

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On Apr 15, 2010, at 5:28 PM, Shai Erera wrote:

> DM I think ICU is great. But currently we use JFlex and you can run Java 10 if you want, but as long as JFlex is compiled w/ Java 1.4, that's what you'll get. Luckily Uwe and Robert recently bumped it up to Java 1.5. Such a change should be clearly documented in CHANGES so people are aware of this, and at least until they figure out what they want to do with it, they should take the pre-3.1 analyzers (assuming that's the next release w/ JFlex 1.5 tokenizers) and use them.

I'm not sure I understand. Is JFlex used by every tokenizer?

> 
> Alternatively, we can think of writing an ICU analyzer/tokenizer, but we're still using JFlex, so I don't know how much control we have on that ...

Robert has already started one. (1488 I think).

> 
> Shai
> 
> On Fri, Apr 16, 2010 at 12:21 AM, DM Smith <dm...@gmail.com> wrote:
> 
> On Apr 15, 2010, at 4:50 PM, Shai Erera wrote:
> 
> > Robert ... I'm sorry but changes to Analyzers don't *force* people to reindex. They can simply choose not to use the latest version. They can choose not to upgrade a Unicode version. They can copy the entire Analyzer code to match their needs. Index format changes are what I'm worried about because they *force* people to reindex.
> 
> In several threads and issues it has been pointed out that upgrading Unicode versions is not an obvious choice or even controllable. It is dictated by the version of Java, the version of the OS and any Unicode specific libraries.
> 
> A desktop application which internally uses lucene has no control over the automatic update of Java (yes it can detect the version change and refuse to run or force an upgrade) or when the user feels like upgrading the OS (not sure how to detect the Unicode version of an arbitrary OS. Not sure I want to).
> 
> Even with server applications, some shared servers have one version of Java that all use. And the owner of an individual application might have no say in if or when that is upgraded.
> 
> This is to say that one needs to be ready to re-index at all times unless it can be controlled.
> 
> One way to handle the Java/Unicode is to use ICU at a specific version and control its upgrade.
> 
> One way to handle the OS problem (which really is one of user input) is to keep up with the changes to Unicode and create a filter that handles the differences normalizing to the Unicode version of the index (if that's even possible).
> 
> Still goes to your point. The onus is on the application not on Lucene.
> 
> -- DM
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
> 
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

DM I think ICU is great. But currently we use JFlex and you can run Java 10
if you want, but as long as JFlex is compiled w/ Java 1.4, that's what
you'll get. Luckily Uwe and Robert recently bumped it up to Java 1.5. Such a
change should be clearly documented in CHANGES so people are aware of this,
and at least until they figure out what they want to do with it, they should
take the pre-3.1 analyzers (assuming that's the next release w/ JFlex 1.5
tokenizers) and use them.

Alternatively, we can think of writing an ICU analyzer/tokenizer, but we're
still using JFlex, so I don't know how much control we have on that ...

Shai

On Fri, Apr 16, 2010 at 12:21 AM, DM Smith <dm...@gmail.com> wrote:

>
> On Apr 15, 2010, at 4:50 PM, Shai Erera wrote:
>
> > Robert ... I'm sorry but changes to Analyzers don't *force* people to
> reindex. They can simply choose not to use the latest version. They can
> choose not to upgrade a Unicode version. They can copy the entire Analyzer
> code to match their needs. Index format changes is what I'm worried about
> because that *forces* people to reindex.
>
> In several threads and issues it has been pointed out that upgrading
> Unicode versions is not an obvious choice or even controllable. It is
> dictated by the version of Java, the version of the OS and any Unicode
> specific libraries.
>
> A desktop application which internally uses lucene has no control over the
> automatic update of Java (yes it can detect the version change and refuse to
> run or force an upgrade) or when the user feels like upgrading the OS (not
> sure how to detect the Unicode version of an arbitrary OS. Not sure I want
> to).
>
> Even with server applications, some shared servers have one version of Java
> that all use. And the owner of an individual application might have no say
> in if or when that is upgraded.
>
> This is to say that one needs to be ready to re-index at all times unless
> it can be controlled.
>
> One way to handle the Java/Unicode is to use ICU at a specific version and
> control its upgrade.
>
> One way to handle the OS problem (which really is one of user input) is to
> keep up with the changes to Unicode and create a filter that handles the
> differences normalizing to the Unicode version of the index (if that's even
> possible).
>
> Still goes to your point. The onus is on the application not on Lucene.
>
> -- DM
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On Apr 15, 2010, at 4:50 PM, Shai Erera wrote:

> Robert ... I'm sorry but changes to Analyzers don't *force* people to reindex. They can simply choose not to use the latest version. They can choose not to upgrade a Unicode version. They can copy the entire Analyzer code to match their needs. Index format changes is what I'm worried about because that *forces* people to reindex.

In several threads and issues it has been pointed out that upgrading Unicode versions is not an obvious choice or even controllable. It is dictated by the version of Java, the version of the OS and any Unicode specific libraries.

A desktop application which internally uses Lucene has no control over the automatic update of Java (yes, it can detect the version change and refuse to run or force an upgrade) or when the user feels like upgrading the OS (not sure how to detect the Unicode version of an arbitrary OS. Not sure I want to).
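
To illustrate the "detect the version change" part, here is a minimal sketch. It assumes the application stored the value of the `java.version` system property next to the index when it was built; the class and method names are hypothetical and are not part of Lucene. The Java version is only a rough proxy for the Unicode version the runtime implements.

```java
// Hypothetical helper, not a Lucene class: compares the Java runtime that
// built the index with the one the application is running on now.
final class EnvironmentGuard {

    // The version string of the running JVM, e.g. "1.6.0_19".
    static String currentJavaVersion() {
        return System.getProperty("java.version");
    }

    // True when no version was recorded at index time, or when the recorded
    // version differs from the current one. In either case the application
    // cannot assume the analyzers still tokenize text the same way.
    static boolean runtimeChanged(String recordedJavaVersion, String currentJavaVersion) {
        return recordedJavaVersion == null
                || !recordedJavaVersion.equals(currentJavaVersion);
    }
}
```

What to do when `runtimeChanged` returns true (refuse to run, warn, or schedule a re-index) stays a decision for the application.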

Even with server applications, some shared servers have one version of Java that all use. And the owner of an individual application might have no say in if or when that is upgraded.

This is to say that one needs to be ready to re-index at all times unless it can be controlled.

One way to handle the Java/Unicode is to use ICU at a specific version and control its upgrade.

One way to handle the OS problem (which really is one of user input) is to keep up with the changes to Unicode and create a filter that handles the differences normalizing to the Unicode version of the index (if that's even possible).

Still goes to your point. The onus is on the application not on Lucene.

-- DM

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

If you really believe this, then you have no problem if I remove all Version
from all core and contrib analyzers right now.

On Thu, Apr 15, 2010 at 4:50 PM, Shai Erera <se...@gmail.com> wrote:

> Robert ... I'm sorry but changes to Analyzers don't *force* people to
> reindex. They can simply choose not to use the latest version. They can
> choose not to upgrade a Unicode version. They can copy the entire Analyzer
> code to match their needs. Index format changes is what I'm worried about
> because that *forces* people to reindex.
>
> Analyzers, believe it or not, are just a tool, an out of the box tool even,
> we're giving users to analyze their stuff. Probably a tool used by most of
> our users, but not all. Some have their own tools, that are currently
> wrapped as a Lucene Analyzer just because the API mandates. But we were
> talking about that too recently no? Ripping Analyzer off IndexWriter?
>
> Just to be clear - I think your work on Analyzers is fantastic ! Really !
> Seriously !
> But it's a choice someone can make ... whereas index format is a given -
> you have to live with it, or never upgrade Lucene.
>
> But I think we've chewed that way too much. I am all for removing bw on
> Analyzers, and 2396 is a great step towards it (or maybe it is IT?). Even
> index format - I don't see when it will change next (but I think I have an
> idea ...), so we can tackle it then.
>
> Shai
>
>
> On Thu, Apr 15, 2010 at 11:33 PM, Robert Muir <rc...@gmail.com> wrote:
>
>>
>>
>> On Thu, Apr 15, 2010 at 4:21 PM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Actually, I'd like to know if people like Robert (basically those who
>>> have no problem to reindex and don't understand the fuss around it) will
>>> want to change the index format - can I count on them to be asked to provide
>>> such tool? That's to me a policy we should decide on ... whatever the
>>> consequences.
>>>
>>
>> just look at the 1.8MB of backwards compat code in contrib/analyzers i
>> want to remove in LUCENE-2396?
>> are you serious? I wrote most of that cruft to prevent reindexing and you
>> are trying to say I "don't understand the fuss about it"?
>>
>> We shouldnt make people reindex, but we should have the chance, even if we
>> only do it ONE TIME, to reset Lucene to a new "Major Version" that has a
>> bunch of stuff fixed we couldnt fix before, and more flexibility.
>>
>> because with the current policy, its like we are in 1.x forever.... our
>> version numbers are a joke!
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Robert ... I'm sorry but changes to Analyzers don't *force* people to
reindex. They can simply choose not to use the latest version. They can
choose not to upgrade a Unicode version. They can copy the entire Analyzer
code to match their needs. Index format changes are what I'm worried about
because those *force* people to reindex.

Analyzers, believe it or not, are just a tool, an out of the box tool even,
we're giving users to analyze their stuff. Probably a tool used by most of
our users, but not all. Some have their own tools, that are currently
wrapped as a Lucene Analyzer just because the API mandates. But we were
talking about that too recently no? Ripping Analyzer off IndexWriter?

Just to be clear - I think your work on Analyzers is fantastic ! Really !
Seriously !
But it's a choice someone can make ... whereas index format is a given - you
have to live with it, or never upgrade Lucene.

But I think we've chewed that way too much. I am all for removing bw on
Analyzers, and 2396 is a great step towards it (or maybe it is IT?). Even
index format - I don't see when it will change next (but I think I have an
idea ...), so we can tackle it then.

Shai

On Thu, Apr 15, 2010 at 11:33 PM, Robert Muir <rc...@gmail.com> wrote:

>
>
> On Thu, Apr 15, 2010 at 4:21 PM, Shai Erera <se...@gmail.com> wrote:
>
>> Actually, I'd like to know if people like Robert (basically those who have
>> no problem to reindex and don't understand the fuss around it) will want to
>> change the index format - can I count on them to be asked to provide such
>> tool? That's to me a policy we should decide on ... whatever the
>> consequences.
>>
>
> just look at the 1.8MB of backwards compat code in contrib/analyzers i want
> to remove in LUCENE-2396?
> are you serious? I wrote most of that cruft to prevent reindexing and you
> are trying to say I "don't understand the fuss about it"?
>
> We shouldnt make people reindex, but we should have the chance, even if we
> only do it ONE TIME, to reset Lucene to a new "Major Version" that has a
> bunch of stuff fixed we couldnt fix before, and more flexibility.
>
> because with the current policy, its like we are in 1.x forever.... our
> version numbers are a joke!
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 15, 2010 at 4:21 PM, Shai Erera <se...@gmail.com> wrote:

> Actually, I'd like to know if people like Robert (basically those who have
> no problem to reindex and don't understand the fuss around it) will want to
> change the index format - can I count on them to be asked to provide such
> tool? That's to me a policy we should decide on ... whatever the
> consequences.
>

Just look at the 1.8MB of backwards compat code in contrib/analyzers I want
to remove in LUCENE-2396.
Are you serious? I wrote most of that cruft to prevent reindexing and you
are trying to say I "don't understand the fuss about it"?

We shouldn't make people reindex, but we should have the chance, even if we
only do it ONE TIME, to reset Lucene to a new "Major Version" that has a
bunch of stuff fixed we couldn't fix before, and more flexibility.

Because with the current policy, it's like we are in 1.x forever.... our
version numbers are a joke!
-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Grant ... you've made it - the 100th response to that thread. Do we keep
records somewhere? :)

Ok I'm simply proposing to define 'index back-compat' as index format
back-compat. With that, we don't 'wait' for something to happen, we just say
up front that if that changes, we provide a migration tool for the latest
index format version. Simple as that. The rest, we can 'see what happens'
...

Shai

On Thu, Apr 15, 2010 at 11:29 PM, Grant Ingersoll <gs...@apache.org> wrote:

>
> On Apr 15, 2010, at 4:21 PM, Shai Erera wrote:
>
> > +1 on the Analyzers as well.
> >
> > Earwin, I think I don't mind if we introduce migrate() elsewhere rather
> than on IW. What I meant to say is that if we stick w/ index format
> back-compat and ongoing migration, then such a method would be useful on IW
> for customers to call to ensure they're on the latest version.
> > But if the majority here agree w/ a standalone tool, then I'm ok if it
> sits elsewhere.
> >
> > Grant, I'm all for 'just doing it and see what happens'. But I think we
> need to at least decide what we're going to do so it's clear to everyone.
> Because I'd like to know if I'm about to propose an index format change,
> whether I need to build migration tool or not. Actually, I'd like to know if
> people like Robert (basically those who have no problem to reindex and don't
> understand the fuss around it) will want to change the index format - can I
> count on them to be asked to provide such tool? That's to me a policy we
> should decide on ... whatever the consequences.
>
> As I said, we should strive for index compatibility, but even in the past,
> we said we did, but the implications weren't always clear.   I think index
> compatibility is very important.  I've seen plenty of times where reindexing
> is not possible.  But even then, you still have the option of testing to
> find out whether you can update or not.  If you can't update, then don't
> until you can figure out how to do it.  FWIW, I think our approach is much
> more proactive than "see what happens".  I'd argue, that in the past, our
> approach was "see what happens", only the "seeing" didn't happen until after
> the release!
>
> -Grant
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Sanne Grinovero <sa...@gmail.com>.

+1 on the Analyzers split,
but I would like to point out that it's not very different from having a
non-final static "version" field.
It's just a much better solution, as you keep your code manageable.

2010/4/15 Grant Ingersoll <gs...@apache.org>:
>
> On Apr 15, 2010, at 4:21 PM, Shai Erera wrote:
>
>> +1 on the Analyzers as well.
>>
>> Earwin, I think I don't mind if we introduce migrate() elsewhere rather than on IW. What I meant to say is that if we stick w/ index format back-compat and ongoing migration, then such a method would be useful on IW for customers to call to ensure they're on the latest version.
>> But if the majority here agree w/ a standalone tool, then I'm ok if it sits elsewhere.
>>
>> Grant, I'm all for 'just doing it and see what happens'. But I think we need to at least decide what we're going to do so it's clear to everyone. Because I'd like to know if I'm about to propose an index format change, whether I need to build migration tool or not. Actually, I'd like to know if people like Robert (basically those who have no problem to reindex and don't understand the fuss around it) will want to change the index format - can I count on them to be asked to provide such tool? That's to me a policy we should decide on ... whatever the consequences.
>
> As I said, we should strive for index compatibility, but even in the past, we said we did, but the implications weren't always clear.   I think index compatibility is very important.  I've seen plenty of times where reindexing is not possible.  But even then, you still have the option of testing to find out whether you can update or not.  If you can't update, then don't until you can figure out how to do it.  FWIW, I think our approach is much more proactive than "see what happens".  I'd argue, that in the past, our approach was "see what happens", only the "seeing" didn't happen until after the release!
>
> -Grant
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

On Apr 15, 2010, at 4:21 PM, Shai Erera wrote:

> +1 on the Analyzers as well.
> 
> Earwin, I think I don't mind if we introduce migrate() elsewhere rather than on IW. What I meant to say is that if we stick w/ index format back-compat and ongoing migration, then such a method would be useful on IW for customers to call to ensure they're on the latest version.
> But if the majority here agree w/ a standalone tool, then I'm ok if it sits elsewhere.
> 
> Grant, I'm all for 'just doing it and see what happens'. But I think we need to at least decide what we're going to do so it's clear to everyone. Because I'd like to know if I'm about to propose an index format change, whether I need to build migration tool or not. Actually, I'd like to know if people like Robert (basically those who have no problem to reindex and don't understand the fuss around it) will want to change the index format - can I count on them to be asked to provide such tool? That's to me a policy we should decide on ... whatever the consequences.

As I said, we should strive for index compatibility, but even in the past, we said we did, but the implications weren't always clear.   I think index compatibility is very important.  I've seen plenty of times where reindexing is not possible.  But even then, you still have the option of testing to find out whether you can update or not.  If you can't update, then don't until you can figure out how to do it.  FWIW, I think our approach is much more proactive than "see what happens".  I'd argue, that in the past, our approach was "see what happens", only the "seeing" didn't happen until after the release!

-Grant

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

+1 on the Analyzers as well.

Earwin, I think I don't mind if we introduce migrate() elsewhere rather than
on IW. What I meant to say is that if we stick w/ index format back-compat
and ongoing migration, then such a method would be useful on IW for
customers to call to ensure they're on the latest version.
But if the majority here agree w/ a standalone tool, then I'm ok if it sits
elsewhere.

Grant, I'm all for 'just doing it and see what happens'. But I think we need
to at least decide what we're going to do so it's clear to everyone. Because
I'd like to know, if I'm about to propose an index format change, whether I
need to build a migration tool or not. Actually, I'd like to know if people
like Robert (basically those who have no problem to reindex and don't
understand the fuss around it) will want to change the index format - can I
count on them to be asked to provide such a tool? That's to me a policy we
should decide on ... whatever the consequences.

But +1 for changing something ! Analyzers at first, API second.

Shai

On Thu, Apr 15, 2010 at 10:52 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Thu, Apr 15, 2010 at 3:50 PM, Robert Muir <rc...@gmail.com> wrote:
> > for now simply moving analyzers to its own jar filE would be a great
> step!
>
> +1 -- why not consolidate all analyzers now?  (And fix indexer to
> require a minimal API = TokenStream minus reset & close).
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Thu, Apr 15, 2010 at 3:50 PM, Robert Muir <rc...@gmail.com> wrote:
> for now simply moving analyzers to its own jar filE would be a great step!

+1 -- why not consolidate all analyzers now?  (And fix indexer to
require a minimal API = TokenStream minus reset & close).
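
A rough sketch of what such a minimal contract could look like, purely for illustration. The interface and class names below are hypothetical and this is not Lucene's TokenStream; the point is only that the indexer side depends on nothing beyond "give me the next token", so any analysis jar implementing the interface could be plugged in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical minimal contract the indexer would consume.
interface MinimalTokenSource {
    // Advances to the next token; returns false once the input is exhausted.
    boolean incrementToken();

    // Text of the current token; only meaningful after incrementToken() returned true.
    String term();
}

// Toy analysis-side implementation that splits its input on whitespace.
final class WhitespaceSource implements MinimalTokenSource {
    private final Iterator<String> tokens;
    private String current;

    WhitespaceSource(String text) {
        String trimmed = text.trim();
        // split() on an empty string would yield one empty token, so guard against it.
        List<String> parts = trimmed.length() == 0
                ? new ArrayList<String>()
                : Arrays.asList(trimmed.split("\\s+"));
        this.tokens = parts.iterator();
    }

    public boolean incrementToken() {
        if (!tokens.hasNext()) {
            return false;
        }
        current = tokens.next();
        return true;
    }

    public String term() {
        return current;
    }
}

// Toy indexer side: it only ever sees the minimal interface.
final class ToyIndexer {
    static List<String> consume(MinimalTokenSource source) {
        List<String> terms = new ArrayList<String>();
        while (source.incrementToken()) {
            terms.add(source.term());
        }
        return terms;
    }
}
```

Lifecycle concerns such as resetting or closing the underlying reader would then live entirely on the analysis side, outside the contract the indexer compiles against.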

Mike


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

>> 3. put analyzers in their own versioned jar files.
>>
>
> Yes, every analyzer needs to have its own version and thus, jar file.
> Putting all analyzers into one versioned jar file joins them at the hip and
> suffers from the same versioning and compat problems we're currently facing
> in core.
>
> Andi..
>
>
that was actually a typo, sorry :) But maybe not a bad idea for the future.

for now simply moving analyzers to its own jar filE would be a great step!

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Andi Vajda <va...@osafoundation.org>.

On Thu, 15 Apr 2010, Robert Muir wrote:

> 
> 
> 2010/4/15 Michael McCandless <lu...@mikemccandless.com>
>
>       I realize the migration tool has issues -- it fixes the hard
>       changes
>       but silently allows the soft changes to break (ie, your
>       analyzers my
>       not produce the same tokens, until we move all core analyzers
>       outside
>       of core, so they are separately versioned), but it seems like a
>       good
>       compromise here?
> 
> 
> Well, lets consider doing that too. Since analyzers have this tough problem
> of being "soft changes", I propose the following:
> 1. get rid of version
> 2. minimize the interface between the indexer and analysis
> 3. put analyzers in their own versioned jar files.

Yes, every analyzer needs to have its own version and thus, jar file.
Putting all analyzers into one versioned jar file joins them at the hip and 
suffers from the same versioning and compat problems we're currently facing 
in core.

Andi..

> 
> this way, we could provide a realistic capability for users to use
> lucene-3.5.jar with lucene-3.2-analyzers.jar, and possibly have STRONGER
> analyzer back compat (e.g. if we minimize the damn thing enough, perhaps
> very old analyzers.jar's could even work across major releases).
> 
> its also much safer when you are using the same bytecodes you used before,
> instead of hairy back compat layers. I don't refer to Uwe's code here: its
> perfect, but we cant force Uwe into writing the back compat for every big
> feature.
> 
> --
> Robert Muir
> rcmuir@gmail.com
> 
>


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

2010/4/15 Michael McCandless <lu...@mikemccandless.com>

>
> I realize the migration tool has issues -- it fixes the hard changes
> but silently allows the soft changes to break (ie, your analyzers my
> not produce the same tokens, until we move all core analyzers outside
> of core, so they are separately versioned), but it seems like a good
> compromise here?
>
>
Well, let's consider doing that too. Since analyzers have this tough problem
of being "soft changes", I propose the following:
1. get rid of version
2. minimize the interface between the indexer and analysis
3. put analyzers in their own versioned jar files.

this way, we could provide a realistic capability for users to use
lucene-3.5.jar with lucene-3.2-analyzers.jar, and possibly have STRONGER
analyzer back compat (e.g. if we minimize the damn thing enough, perhaps
very old analyzers.jar's could even work across major releases).

It's also much safer when you are using the same bytecodes you used before,
instead of hairy back compat layers. I don't refer to Uwe's code here: it's
perfect, but we can't force Uwe into writing the back compat for every big
feature.


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

Unfortunately, live searching against an old index can get very hairy.
EG look at what I had to do for the "flex API on pre-flex index" flex
emulation layer.

It's also not great because it gives the illusion that all is good,
yet, you've taken a silent hit (up to ~10% or so) in your search
perf.

Whereas building & maintaining a one-time index migration tool, in
contrast, is much less work.

I realize the migration tool has issues -- it fixes the hard changes
but silently allows the soft changes to break (ie, your analyzers may
not produce the same tokens, until we move all core analyzers outside
of core, so they are separately versioned), but it seems like a good
compromise here?

Mike

2010/4/15 Shai Erera <se...@gmail.com>:
> The reason Earwin why online migration is faster is because when u
> finally need to *fully* migrate your index, most chances are that most
> of the segments are already on the newer format. Offline migration
> will just keep the application idle for some amount of time until ALL
> segments are migrated.
>
> During the lifecycle of the index, segments are merged anyway, so
> migrating them on the fly virtually costs nothing. At the end, when u
> upgrade to a Lucene version which doesn't support the previous index
> format, you'll on the worse case need to migrate few large segments
> which were never merged. I don't know how many of those there will be
> as it really depends on the application, but I'd bet this process will
> touch just a few segments. And hence, throughput wise it will be a lot
> faster.
>
> We should create a migrate() API on IW which will touch just those
> segments and not incur a full optimize. That API can also be used for
> an offline migration tool, if we decide that's what we want.
>
> Shai
>
> On Thursday, April 15, 2010, jm <jm...@gmail.com> wrote:
>> Not sure if plain users are allowed/encouraged to post in this list,
>> but wanted to mention (just an opinion from a happy user), as other
>> users have, that not all of us can reindex just like that. It would
>> not be 10 min for one of our installations for sure...
>>
>> First, i would need to implement some code to reindex, cause my source
>> data is postprocessed/compressed/encrypted/moved after it arrives to
>> the application, so I would need to retrieve all etc. And then
>> reindexing it would take days.
>> javier
>>
>> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>>> manual migration on the segments that are still on old versions.
>>>> That's not the point about whether optimize() is good or not. It is
>>>> the difference between telling the customer to run a 5-day migration
>>>> process, or a couple of hours. At the end of the day, the same
>>>> migration code will need to be written whether for the manual or
>>>> automatic case. And probably by the same developer which changed the
>>>> index format. It's the difference of when does it happen.
>>>
>>> Converting stuff is easier then emulating, that's exactly why I want a
>>> separate tool.
>>> There's no need to support cross-version merging, nor to emulate old APIs.
>>>
>>> I also don't understand why offline migration is going to take days
>>> instead of hours for online migration??
>>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>>
>>> --
>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>> ICQ: 104465785
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

2010/4/15 Shai Erera <se...@gmail.com>:
> The reason Earwin why online migration is faster is because when u
> finally need to *fully* migrate your index, most chances are that most
> of the segments are already on the newer format. Offline migration
> will just keep the application idle for some amount of time until ALL
> segments are migrated.
>
> During the lifecycle of the index, segments are merged anyway, so
> migrating them on the fly virtually costs nothing. At the end, when u
> upgrade to a Lucene version which doesn't support the previous index
> format, you'll on the worse case need to migrate few large segments
> which were never merged. I don't know how many of those there will be
> as it really depends on the application, but I'd bet this process will
> touch just a few segments. And hence, throughput wise it will be a lot
> faster.
>
> We should create a migrate() API on IW which will touch just those
> segments and not incur a full optimize. That API can also be used for
> an offline migration tool, if we decide that's what we want.

We should not create such an API on IW, and we should build an offline
migration tool as a separate thing :)
Because otherwise we have to keep all back-compat stuff within IW, SR
and friends as it is.

Look at the current SegmentReader.Norm code - there are three freaking
places norms can be loaded from.
I will also reiterate the issue of the API. Fat index changes are
almost certainly accompanied by API changes.
With online migration we have to emulate new APIs over old segments,
which is really cumbersome.
With offline migration we only need to be able to read said segments
in one or another manner.


-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/15/2010 03:25 PM, Shai Erera wrote:
> We should create a migrate() API on IW which will touch just those
> segments and not incur a full optimize. That API can also be used for
> an offline migration tool, if we decide that's what we want.
>
>    
What about an index that has already called optimize()? I presume it
will be upgraded with whatever is decided?



Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Earwin, the reason online migration is faster is that when you
finally need to *fully* migrate your index, chances are that most
of the segments are already in the newer format. Offline migration
will just keep the application idle for some amount of time until ALL
segments are migrated.

During the lifecycle of the index, segments are merged anyway, so
migrating them on the fly costs virtually nothing. At the end, when you
upgrade to a Lucene version which doesn't support the previous index
format, in the worst case you'll need to migrate a few large segments
which were never merged. I don't know how many of those there will be
as it really depends on the application, but I'd bet this process will
touch just a few segments. And hence, throughput-wise it will be a lot
faster.

We should create a migrate() API on IW which will touch just those
segments and not incur a full optimize. That API can also be used for
an offline migration tool, if we decide that's what we want.
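
To make the idea concrete, here is a minimal sketch of the selection logic. It assumes each segment records the format version it was written with; SegmentStub and MigrationSketch are hypothetical stand-ins and not existing Lucene classes or IW methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for per-segment metadata.
final class SegmentStub {
    final String name;
    final int formatVersion;

    SegmentStub(String name, int formatVersion) {
        this.name = name;
        this.formatVersion = formatVersion;
    }
}

final class MigrationSketch {

    // Returns only the segments written in a format older than 'current'.
    static List<SegmentStub> segmentsToMigrate(List<SegmentStub> segments, int current) {
        List<SegmentStub> stale = new ArrayList<SegmentStub>();
        for (SegmentStub s : segments) {
            if (s.formatVersion < current) {
                stale.add(s);
            }
        }
        return stale;
    }

    // Rewrites each stale segment one-to-one in the current format and leaves
    // up-to-date segments untouched. Nothing is merged, so this is not an optimize.
    static List<SegmentStub> migrate(List<SegmentStub> segments, int current) {
        List<SegmentStub> result = new ArrayList<SegmentStub>();
        for (SegmentStub s : segments) {
            result.add(s.formatVersion < current
                    ? new SegmentStub(s.name, current)
                    : s);
        }
        return result;
    }
}
```

The same selection step would serve both cases: called from a live writer it touches only the leftover old segments, and called from a standalone tool it walks the whole index once.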

Shai

On Thursday, April 15, 2010, jm <jm...@gmail.com> wrote:
> Not sure if plain users are allowed/encouraged to post in this list,
> but wanted to mention (just an opinion from a happy user), as other
> users have, that not all of us can reindex just like that. It would
> not be 10 min for one of our installations for sure...
>
> First, i would need to implement some code to reindex, cause my source
> data is postprocessed/compressed/encrypted/moved after it arrives to
> the application, so I would need to retrieve all etc. And then
> reindexing it would take days.
> javier
>
> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>> manual migration on the segments that are still on old versions.
>>> That's not the point about whether optimize() is good or not. It is
>>> the difference between telling the customer to run a 5-day migration
>>> process, or a couple of hours. At the end of the day, the same
>>> migration code will need to be written whether for the manual or
>>> automatic case. And probably by the same developer which changed the
>>> index format. It's the difference of when does it happen.
>>
>> Converting stuff is easier then emulating, that's exactly why I want a
>> separate tool.
>> There's no need to support cross-version merging, nor to emulate old APIs.
>>
>> I also don't understand why offline migration is going to take days
>> instead of hours for online migration??
>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by jm <jm...@gmail.com>.

Not sure if plain users are allowed/encouraged to post in this list,
but wanted to mention (just an opinion from a happy user), as other
users have, that not all of us can reindex just like that. It would
not be 10 min for one of our installations for sure...

First, I would need to implement some code to reindex, because my source
data is postprocessed/compressed/encrypted/moved after it arrives at
the application, so I would need to retrieve all of it, etc. And then
reindexing it would take days.
javier

On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>> manual migration on the segments that are still on old versions.
>> That's not the point about whether optimize() is good or not. It is
>> the difference between telling the customer to run a 5-day migration
>> process, or a couple of hours. At the end of the day, the same
>> migration code will need to be written whether for the manual or
>> automatic case. And probably by the same developer which changed the
>> index format. It's the difference of when does it happen.
>
> Converting stuff is easier then emulating, that's exactly why I want a
> separate tool.
> There's no need to support cross-version merging, nor to emulate old APIs.
>
> I also don't understand why offline migration is going to take days
> instead of hours for online migration??
> WTF, it's gonna be even faster, as it doesn't have to merge things.
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Sanne Grinovero <sa...@gmail.com>.

Hello,
I think some compatibility breaks should really be accepted, otherwise
these requirements are going to kill technological advancement:
the effort spent on backwards compatibility will grow, becoming more
time-consuming and harder every day.

A major release won't happen every day, likely not even every year, so
it seems acceptable to have milestones defining compatibility
boundaries: you need to be able to "reset" the complexity curve
occasionally.

Backporting a feature would benefit from being merged into the correct
test suite, and would avoid the explosion of this matrix-like backwards
compatibility test suite. BTW, the current test suite is likely covering
all kinds of combinations which nobody actually uses or cares
about.

Also, if I were to discover a nice improvement to an Analyzer, and you
were to tell me that to contribute it I would have to face this
amount of complexity, I would think twice before trying; honestly, the
current requirements are scary.

+1

Sanne

2010/4/15 Earwin Burrfoot <ea...@gmail.com>:
> I'd like to remind that Mike's proposal has stable branches.
>
> We can branch off preflex trunk right now and wrap it up as 3.1.
> Current trunk is declared as future 4.0 and all backcompat cruft is
> removed from it.
> If some new features/bugfixes appear in trunk, and they don't break
> stuff - we backport them to 3.x branch, eventually releasing 3.2, 3.3,
> etc
>
> Thus, devs are free to work without back-compat burden, bleeding edge
> users get their blood, conservative users get their stability + a
> subset of new features from stable branches.
>
>
> On Thu, Apr 15, 2010 at 22:02, DM Smith <dm...@gmail.com> wrote:
>> On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>>>>
>>>> First, the index format. IMHO, it is a good thing for a major release to
>>>> be
>>>> able to read the prior major release's index. And the ability to convert
>>>> it
>>>> to the current format via optimize is also good. Whatever is decided on
>>>> this
>>>> thread should take this seriously.
>>>>
>>>
>>> Optimize is a bad way to convert to current.
>>> 1. conversion is not guaranteed, optimizing already optimized index is a
>>> noop
>>> 2. it merges all your segments. if you use BalancedSegmentMergePolicy,
>>> that destroys your segment size distribution
>>>
>>> Dedicated upgrade tool (available both from command-line and
>>> programmatically) is a good way to convert to current.
>>> 1. conversion happens exactly when you need it, conversion happens for
>>> sure, no additional checks needed
>>> 2. it should leave all your segments as is, only changing their format
>>>
>>>
>>>>
>>>> It is my observation, though possibly not correct, that core only has
>>>> rudimentary analysis capabilities, handling English very well. To handle
>>>> other languages well "contrib/analyzers" is required. Until recently it
>>>> did
>>>> not get much love. There have been many bw compat breaking changes
>>>> (though
>>>> w/ version one can probably get the prior behavior). IMHO, most of
>>>> contrib/analyzers should be core. My guess is that most non-trivial
>>>> applications will use contrib/analyzers.
>>>>
>>>
>>> I counter - most non-trivial applications will use their own analyzers.
>>> The more modules - the merrier. You can choose precisely what you need.
>>>
>>
>> By and large an analyzer is a simple wrapper for a tokenizer and some
>> filters. Are you suggesting that most non-trivial apps write their own
>> tokenizers and filters?
>>
>> I'd find that hard to believe. For example, I don't know enough Chinese,
>> Farsi, Arabic, Polish, ... to come up with anything better than what Lucene
>> has to tokenize, stem or filter these.
>>
>>>
>>>>
>>>> Our user base are those with ancient,
>>>> underpowered laptops in 3-rd world countries. On those machines it might
>>>> take 10 minutes to create an index and during that time the machine is
>>>> fairly unresponsive. There is no opportunity to "do it in the
>>>> background."
>>>>
>>>
>>> Major Lucene releases (feature-wise, not version-wise) happen like
>>> once in a year, or year-and-a-half.
>>> Is it that hard for your users to wait ten minutes once a year?
>>>
>>
>>  I said that was for one index. Multiply that times the number of books
>> available (300+) and yes, it is too much to ask. Even if a small subset is
>> indexed, say 30, that's around 5 hours of waiting.
>>
>> Under consideration is the frequency of breakage. Some are suggesting a
>> greater frequency than yearly.
>>
>> DM
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 15, 2010 at 1:30 PM, DM Smith <dm...@gmail.com> wrote:

>
> Another behavior change is an upgrade in Java version. By forcing users to
> go to Java 5 with Lucene 3, the version of Unicode changed. This in itself
> causes a change in some token streams.
>
> ...

>
> It is my observation, though possibly not correct, that core only has
> rudimentary analysis capabilities, handling English very well.
>

DM brings up some interesting points here. For example, the Porter Stemmer
in core, from 1980 or whenever, has been essentially "frozen" to all changes
for some time now; it says so on Porter's site.

This is not the case for non-English languages, where things are very much
in flux, including how the characters themselves are encoded on a computer.
If we want to support languages other than English in Lucene, we have to make
it possible to iterate and improve things without making 20 copies of
something or scattering Version everywhere.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/15/2010 03:12 PM, Earwin Burrfoot wrote:
> On Thu, Apr 15, 2010 at 23:07, DM Smith<dm...@gmail.com>  wrote:
>    
>> On 04/15/2010 03:04 PM, Earwin Burrfoot wrote:
>>      
>>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>>> manual migration on the segments that are still on old versions.
>>>> That's not the point about whether optimize() is good or not. It is
>>>> the difference between telling the customer to run a 5-day migration
>>>> process, or a couple of hours. At the end of the day, the same
>>>> migration code will need to be written whether for the manual or
>>>> automatic case. And probably by the same developer which changed the
>>>> index format. It's the difference of when does it happen.
>>>>
>>>>          
>>> Converting stuff is easier then emulating, that's exactly why I want a
>>> separate tool.
>>> There's no need to support cross-version merging, nor to emulate old APIs.
>>>
>>> I also don't understand why offline migration is going to take days
>>> instead of hours for online migration??
>>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>>
>>>
>>>        
>> Will it be able to be used within a client application that creates and uses
>> local indexes?
>>
>> I;m assuming it will be faster than re-indexing.
>>      
> As I said earlier in the topic, it is obvious the tool has to have
> both programmatic and command-line interfaces.
> I will also reiterate - it only upgrades the index structurally. If
> you changed your analyzers - that's your problem and you have to deal
> with it
Good. (Sorry I missed that. There's just too much in the thread to keep 
track of ;)

As long as my "old" analyzers will still work with the new lucene-core 
jar, I'm fat, dumb and happy with the upgraded index.


Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

On Thu, Apr 15, 2010 at 23:07, DM Smith <dm...@gmail.com> wrote:
> On 04/15/2010 03:04 PM, Earwin Burrfoot wrote:
>>>
>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>> manual migration on the segments that are still on old versions.
>>> That's not the point about whether optimize() is good or not. It is
>>> the difference between telling the customer to run a 5-day migration
>>> process, or a couple of hours. At the end of the day, the same
>>> migration code will need to be written whether for the manual or
>>> automatic case. And probably by the same developer which changed the
>>> index format. It's the difference of when does it happen.
>>>
>>
>> Converting stuff is easier then emulating, that's exactly why I want a
>> separate tool.
>> There's no need to support cross-version merging, nor to emulate old APIs.
>>
>> I also don't understand why offline migration is going to take days
>> instead of hours for online migration??
>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>
>>
>
> Will it be able to be used within a client application that creates and uses
> local indexes?
>
> I;m assuming it will be faster than re-indexing.

As I said earlier in the topic, it is obvious the tool has to have
both programmatic and command-line interfaces.
I will also reiterate - it only upgrades the index structurally. If
you changed your analyzers - that's your problem and you have to deal
with it.
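
A rough sketch of what "both programmatic and command-line interfaces"
could look like: one conversion routine, with a thin command-line
wrapper around it. The class name IndexUpgradeToolSketch and its
methods are hypothetical, not an existing Lucene class, and the body of
upgrade() is a placeholder that only counts the files it would have to
look at.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical tool skeleton for illustration; not an existing Lucene class.
final class IndexUpgradeToolSketch {

    // Programmatic entry point, callable from inside a client application.
    // Placeholder body: it only counts the files in the directory. A real
    // tool would rewrite old-format segments here and would not re-analyze
    // any text.
    static long upgrade(Path indexDir) throws IOException {
        if (!Files.isDirectory(indexDir)) {
            throw new IOException("not an index directory: " + indexDir);
        }
        try (Stream<Path> files = Files.list(indexDir)) {
            return files.count();
        }
    }

    // Command-line logic, kept apart from main() so that it returns an
    // exit code instead of terminating the JVM.
    static int run(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: IndexUpgradeToolSketch <indexDir>");
            return 2;
        }
        try {
            long visited = upgrade(Paths.get(args[0]));
            System.out.println("files visited: " + visited);
            return 0;
        } catch (IOException e) {
            System.err.println("upgrade failed: " + e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

A desktop application that manages local indexes would call upgrade()
directly, while an administrator would go through main().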


-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/15/2010 03:04 PM, Earwin Burrfoot wrote:
>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>> manual migration on the segments that are still on old versions.
>> That's not the point about whether optimize() is good or not. It is
>> the difference between telling the customer to run a 5-day migration
>> process, or a couple of hours. At the end of the day, the same
>> migration code will need to be written whether for the manual or
>> automatic case. And probably by the same developer which changed the
>> index format. It's the difference of when does it happen.
>>      
> Converting stuff is easier then emulating, that's exactly why I want a
> separate tool.
> There's no need to support cross-version merging, nor to emulate old APIs.
>
> I also don't understand why offline migration is going to take days
> instead of hours for online migration??
> WTF, it's gonna be even faster, as it doesn't have to merge things.
>
>    
Can it be used within a client application that creates and uses
local indexes?

I'm assuming it will be faster than re-indexing.

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
> manual migration on the segments that are still on old versions.
> That's not the point about whether optimize() is good or not. It is
> the difference between telling the customer to run a 5-day migration
> process, or a couple of hours. At the end of the day, the same
> migration code will need to be written whether for the manual or
> automatic case. And probably by the same developer which changed the
> index format. It's the difference of when does it happen.

Converting stuff is easier than emulating; that's exactly why I want a
separate tool.
There's no need to support cross-version merging, nor to emulate old APIs.

I also don't understand why offline migration is going to take days
instead of hours for online migration??
WTF, it's gonna be even faster, as it doesn't have to merge things.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I seriously don't understand the fuss around index format back compat.
How often does the format change, such that it is too much to ask that
X keep supporting X-1?

I prefer to have ongoing segment merging but can live w/ a manual
converter tool. Thing is - I'll probably not be able to develop one
myself outside the scope of Lucene because I'd be missing tons of API. So
having Lucene declare and adhere to it seems reasonable to me.

BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
manual migration of the segments that are still on old versions.
The point is not whether optimize() is good or not. It is
the difference between telling the customer to run a 5-day migration
process, or a couple of hours. At the end of the day, the same
migration code will need to be written whether for the manual or
automatic case. And probably by the same developer who changed the
index format. The difference is when it happens.

And I also think that a manual migration tool will need access to some
lower level API which is not exposed today, and will generally not
perform as well as online migration. But that's a side note...

Shai

On Thursday, April 15, 2010, Earwin Burrfoot <ea...@gmail.com> wrote:
> I'd like to remind that Mike's proposal has stable branches.
>
> We can branch off preflex trunk right now and wrap it up as 3.1.
> Current trunk is declared as future 4.0 and all backcompat cruft is
> removed from it.
> If some new features/bugfixes appear in trunk, and they don't break
> stuff - we backport them to 3.x branch, eventually releasing 3.2, 3.3,
> etc
>
> Thus, devs are free to work without back-compat burden, bleeding edge
> users get their blood, conservative users get their stability + a
> subset of new features from stable branches.
>
>
> On Thu, Apr 15, 2010 at 22:02, DM Smith <dm...@gmail.com> wrote:
>> On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>>>>
>>>> First, the index format. IMHO, it is a good thing for a major release to
>>>> be
>>>> able to read the prior major release's index. And the ability to convert
>>>> it
>>>> to the current format via optimize is also good. Whatever is decided on
>>>> this
>>>> thread should take this seriously.
>>>>
>>>
>>> Optimize is a bad way to convert to current.
>>> 1. conversion is not guaranteed, optimizing already optimized index is a
>>> noop
>>> 2. it merges all your segments. if you use BalancedSegmentMergePolicy,
>>> that destroys your segment size distribution
>>>
>>> Dedicated upgrade tool (available both from command-line and
>>> programmatically) is a good way to convert to current.
>>> 1. conversion happens exactly when you need it, conversion happens for
>>> sure, no additional checks needed
>>> 2. it should leave all your segments as is, only changing their format
>>>
>>>
>>>>
>>>> It is my observation, though possibly not correct, that core only has
>>>> rudimentary analysis capabilities, handling English very well. To handle
>>>> other languages well "contrib/analyzers" is required. Until recently it
>>>> did
>>>> not get much love. There have been many bw compat breaking changes
>>>> (though
>>>> w/ version one can probably get the prior behavior). IMHO, most of
>>>> contrib/analyzers should be core. My guess is that most non-trivial
>>>> applications will use contrib/analyzers.
>>>>
>>>
>>> I counter - most non-trivial applications will use their own analyzers.
>>> The more modules - the merrier. You can choose precisely what you need.
>>>
>>
>> By and large an analyzer is a simple wrapper for a tokenizer and some
>> filters. Are you suggesting that most non-trivial apps write their own
>> tokenizers and filters?
>>
>> I'd find that hard to believe. For example, I don't know enough Chinese,
>> Farsi, Arabic, Polish, ... to come up with anything better than what Lucene
>> has to tokenize, stem or filter these.
>>
>>>
>>>>
>>>> Our user base are those with ancient,
>>>> underpowered laptops in 3-rd world countries. On those machines it might
>>>> take 10 minutes to create an index and during that time the machine is
>>>> fairly unresponsive. There is no opportunity to "do it in the
>>>> background."
>>>>
>>>
>>> Major Lucene releases (feature-wise, not version-wise) happen like
>>> once in a year, or year-and-a-half.
>>> Is it that hard for your users to wait ten minutes once a year?
>>>
>>
>>  I said that was for one index. Multiply that times the number of books
>> available (300+) and yes, it is too much to ask. Even if a small subset is
>> indexed, say 30, that's around 5 hours of waiting.
>>
>> Under consideration is the frequency of breakage. Some are suggesting a
>> greater frequency than yearly.
>>
>> DM
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I think this should split off the mega-thread :)

On Thu, Apr 15, 2010 at 23:28, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Earwin,
>
> I am strongly +1 on this. I would also make the Release Manager for 3.1, if nobody else wants to do this. I would like to take the preflex tag or some revisions before (maybe without the IndexWriterConfig, which is a really new API) to be 3.1 branch. And after that port some of my post-flex-changes like the StandardTokenizer refactoring back (so we can produce the old analyzer still without Java 1.4).
>
> So +1 on branching pre-flex and release as 3.1 soon. The Unicode improvements rectify a new release. I think also s1monw wants to have this.
>
> Uwe

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi Earwin,

I am strongly +1 on this. I would also act as the Release Manager for 3.1, if nobody else wants to do this. I would like to take the preflex tag or some revisions before it (maybe without the IndexWriterConfig, which is a really new API) as the 3.1 branch. And after that, port back some of my post-flex changes like the StandardTokenizer refactoring (so we can still produce the old analyzer without Java 1.4).

So +1 on branching pre-flex and releasing it as 3.1 soon. The Unicode improvements justify a new release. I think s1monw also wants this.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Earwin Burrfoot [mailto:earwin@gmail.com]
> Sent: Thursday, April 15, 2010 8:15 PM
> To: java-dev@lucene.apache.org
> Subject: Re: Proposal about Version API "relaxation"
> 
> I'd like to remind that Mike's proposal has stable branches.
> 
> We can branch off preflex trunk right now and wrap it up as 3.1.
> Current trunk is declared as future 4.0 and all backcompat cruft is
> removed from it.
> If some new features/bugfixes appear in trunk, and they don't break
> stuff - we backport them to 3.x branch, eventually releasing 3.2, 3.3,
> etc
> 
> Thus, devs are free to work without back-compat burden, bleeding edge
> users get their blood, conservative users get their stability + a
> subset of new features from stable branches.
> 
> 
> On Thu, Apr 15, 2010 at 22:02, DM Smith <dm...@gmail.com> wrote:
> > On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
> >>>
> >>> First, the index format. IMHO, it is a good thing for a major
> release to
> >>> be
> >>> able to read the prior major release's index. And the ability to
> convert
> >>> it
> >>> to the current format via optimize is also good. Whatever is
> decided on
> >>> this
> >>> thread should take this seriously.
> >>>
> >>
> >> Optimize is a bad way to convert to current.
> >> 1. conversion is not guaranteed, optimizing already optimized index
> is a
> >> noop
> >> 2. it merges all your segments. if you use
> BalancedSegmentMergePolicy,
> >> that destroys your segment size distribution
> >>
> >> Dedicated upgrade tool (available both from command-line and
> >> programmatically) is a good way to convert to current.
> >> 1. conversion happens exactly when you need it, conversion happens
> for
> >> sure, no additional checks needed
> >> 2. it should leave all your segments as is, only changing their
> format
> >>
> >>
> >>>
> >>> It is my observation, though possibly not correct, that core only
> has
> >>> rudimentary analysis capabilities, handling English very well. To
> handle
> >>> other languages well "contrib/analyzers" is required. Until
> recently it
> >>> did
> >>> not get much love. There have been many bw compat breaking changes
> >>> (though
> >>> w/ version one can probably get the prior behavior). IMHO, most of
> >>> contrib/analyzers should be core. My guess is that most non-trivial
> >>> applications will use contrib/analyzers.
> >>>
> >>
> >> I counter - most non-trivial applications will use their own
> analyzers.
> >> The more modules - the merrier. You can choose precisely what you
> need.
> >>
> >
> > By and large an analyzer is a simple wrapper for a tokenizer and some
> > filters. Are you suggesting that most non-trivial apps write their
> own
> > tokenizers and filters?
> >
> > I'd find that hard to believe. For example, I don't know enough
> Chinese,
> > Farsi, Arabic, Polish, ... to come up with anything better than what
> Lucene
> > has to tokenize, stem or filter these.
> >
> >>
> >>>
> >>> Our user base are those with ancient,
> >>> underpowered laptops in 3-rd world countries. On those machines it
> might
> >>> take 10 minutes to create an index and during that time the machine
> is
> >>> fairly unresponsive. There is no opportunity to "do it in the
> >>> background."
> >>>
> >>
> >> Major Lucene releases (feature-wise, not version-wise) happen like
> >> once in a year, or year-and-a-half.
> >> Is it that hard for your users to wait ten minutes once a year?
> >>
> >
> >  I said that was for one index. Multiply that times the number of
> books
> > available (300+) and yes, it is too much to ask. Even if a small
> subset is
> > indexed, say 30, that's around 5 hours of waiting.
> >
> > Under consideration is the frequency of breakage. Some are suggesting
> a
> > greater frequency than yearly.
> >
> > DM
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> 
> 
> 
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org



Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I'd like to remind everyone that Mike's proposal has stable branches.

We can branch off preflex trunk right now and wrap it up as 3.1.
Current trunk is declared as future 4.0 and all backcompat cruft is
removed from it.
If some new features/bugfixes appear in trunk, and they don't break
stuff - we backport them to the 3.x branch, eventually releasing 3.2, 3.3,
etc.

Thus, devs are free to work without back-compat burden, bleeding edge
users get their blood, conservative users get their stability + a
subset of new features from stable branches.


On Thu, Apr 15, 2010 at 22:02, DM Smith <dm...@gmail.com> wrote:
> On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>>>
>>> First, the index format. IMHO, it is a good thing for a major release to
>>> be
>>> able to read the prior major release's index. And the ability to convert
>>> it
>>> to the current format via optimize is also good. Whatever is decided on
>>> this
>>> thread should take this seriously.
>>>
>>
>> Optimize is a bad way to convert to current.
>> 1. conversion is not guaranteed, optimizing already optimized index is a
>> noop
>> 2. it merges all your segments. if you use BalancedSegmentMergePolicy,
>> that destroys your segment size distribution
>>
>> Dedicated upgrade tool (available both from command-line and
>> programmatically) is a good way to convert to current.
>> 1. conversion happens exactly when you need it, conversion happens for
>> sure, no additional checks needed
>> 2. it should leave all your segments as is, only changing their format
>>
>>
>>>
>>> It is my observation, though possibly not correct, that core only has
>>> rudimentary analysis capabilities, handling English very well. To handle
>>> other languages well "contrib/analyzers" is required. Until recently it
>>> did
>>> not get much love. There have been many bw compat breaking changes
>>> (though
>>> w/ version one can probably get the prior behavior). IMHO, most of
>>> contrib/analyzers should be core. My guess is that most non-trivial
>>> applications will use contrib/analyzers.
>>>
>>
>> I counter - most non-trivial applications will use their own analyzers.
>> The more modules - the merrier. You can choose precisely what you need.
>>
>
> By and large an analyzer is a simple wrapper for a tokenizer and some
> filters. Are you suggesting that most non-trivial apps write their own
> tokenizers and filters?
>
> I'd find that hard to believe. For example, I don't know enough Chinese,
> Farsi, Arabic, Polish, ... to come up with anything better than what Lucene
> has to tokenize, stem or filter these.
>
>>
>>>
>>> Our user base are those with ancient,
>>> underpowered laptops in 3-rd world countries. On those machines it might
>>> take 10 minutes to create an index and during that time the machine is
>>> fairly unresponsive. There is no opportunity to "do it in the
>>> background."
>>>
>>
>> Major Lucene releases (feature-wise, not version-wise) happen like
>> once in a year, or year-and-a-half.
>> Is it that hard for your users to wait ten minutes once a year?
>>
>
>  I said that was for one index. Multiply that times the number of books
> available (300+) and yes, it is too much to ask. Even if a small subset is
> indexed, say 30, that's around 5 hours of waiting.
>
> Under consideration is the frequency of breakage. Some are suggesting a
> greater frequency than yearly.
>
> DM
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>> First, the index format. IMHO, it is a good thing for a major release to be
>> able to read the prior major release's index. And the ability to convert it
>> to the current format via optimize is also good. Whatever is decided on this
>> thread should take this seriously.
>>      
> Optimize is a bad way to convert to current.
> 1. conversion is not guaranteed, optimizing already optimized index is a noop
> 2. it merges all your segments. if you use BalancedSegmentMergePolicy,
> that destroys your segment size distribution
>
> Dedicated upgrade tool (available both from command-line and
> programmatically) is a good way to convert to current.
> 1. conversion happens exactly when you need it, conversion happens for
> sure, no additional checks needed
> 2. it should leave all your segments as is, only changing their format
>
>    
>> It is my observation, though possibly not correct, that core only has
>> rudimentary analysis capabilities, handling English very well. To handle
>> other languages well "contrib/analyzers" is required. Until recently it did
>> not get much love. There have been many bw compat breaking changes (though
>> w/ version one can probably get the prior behavior). IMHO, most of
>> contrib/analyzers should be core. My guess is that most non-trivial
>> applications will use contrib/analyzers.
>>      
> I counter - most non-trivial applications will use their own analyzers.
> The more modules - the merrier. You can choose precisely what you need.
>    
By and large an analyzer is a simple wrapper for a tokenizer and some 
filters. Are you suggesting that most non-trivial apps write their own 
tokenizers and filters?

I'd find that hard to believe. For example, I don't know enough Chinese, 
Farsi, Arabic, Polish, ... to come up with anything better than what 
Lucene has to tokenize, stem or filter these.

>    
>> Our user base are those with ancient,
>> underpowered laptops in 3-rd world countries. On those machines it might
>> take 10 minutes to create an index and during that time the machine is
>> fairly unresponsive. There is no opportunity to "do it in the background."
>>      
> Major Lucene releases (feature-wise, not version-wise) happen like
> once in a year, or year-and-a-half.
> Is it that hard for your users to wait ten minutes once a year?
>    
I said that was for one index. Multiply that by the number of books
available (300+) and yes, it is too much to ask. Even if only a small subset
is indexed, say 30, that's around 5 hours of waiting.

Under consideration is the frequency of breakage. Some are suggesting a 
greater frequency than yearly.

DM


Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> First, the index format. IMHO, it is a good thing for a major release to be
> able to read the prior major release's index. And the ability to convert it
> to the current format via optimize is also good. Whatever is decided on this
> thread should take this seriously.
Optimize is a bad way to convert to the current format.
1. Conversion is not guaranteed: optimizing an already-optimized index is a no-op.
2. It merges all your segments. If you use BalancedSegmentMergePolicy,
that destroys your segment size distribution.

A dedicated upgrade tool (available both from the command line and
programmatically) is a good way to convert to the current format.
1. Conversion happens exactly when you need it and is certain to happen,
so no additional checks are needed.
2. It should leave all your segments as they are, changing only their format.
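
The contrast between the two approaches can be sketched with a toy model. This is only an illustration under stated assumptions: Segment, optimize, upgrade and CURRENT_FORMAT are hypothetical names invented here, not Lucene classes or methods, and a real merge or upgrade involves far more than a version number.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these are not Lucene classes. */
class UpgradeSketch {
    static final int CURRENT_FORMAT = 2;

    /** Hypothetical stand-in for a segment: a document count and an on-disk format version. */
    static final class Segment {
        final int docs;
        final int format;
        Segment(int docs, int format) { this.docs = docs; this.format = format; }
    }

    /** Merge-based conversion: rewrites everything into one segment, unless there is nothing to merge. */
    static List<Segment> optimize(List<Segment> index) {
        if (index.size() <= 1) {
            return index; // already optimized: no-op, so an old-format segment survives
        }
        int total = 0;
        for (Segment s : index) {
            total += s.docs;
        }
        List<Segment> merged = new ArrayList<>();
        merged.add(new Segment(total, CURRENT_FORMAT));
        return merged;
    }

    /** Dedicated upgrade: rewrites each old-format segment, keeping segment boundaries. */
    static List<Segment> upgrade(List<Segment> index) {
        List<Segment> out = new ArrayList<>();
        for (Segment s : index) {
            out.add(s.format == CURRENT_FORMAT ? s : new Segment(s.docs, CURRENT_FORMAT));
        }
        return out;
    }
}
```

In this model, a single old-format segment passes through optimize unchanged, while upgrade rewrites it; an index with several segments keeps its size distribution under upgrade but collapses to one segment under optimize.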

> It is my observation, though possibly not correct, that core only has
> rudimentary analysis capabilities, handling English very well. To handle
> other languages well "contrib/analyzers" is required. Until recently it did
> not get much love. There have been many bw compat breaking changes (though
> w/ version one can probably get the prior behavior). IMHO, most of
> contrib/analyzers should be core. My guess is that most non-trivial
> applications will use contrib/analyzers.
I counter - most non-trivial applications will use their own analyzers.
The more modules - the merrier. You can choose precisely what you need.

> Our user base are those with ancient,
> underpowered laptops in 3-rd world countries. On those machines it might
> take 10 minutes to create an index and during that time the machine is
> fairly unresponsive. There is no opportunity to "do it in the background."
Major Lucene releases (feature-wise, not version-wise) happen like
once in a year, or year-and-a-half.
Is it that hard for your users to wait ten minutes once a year?

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

On 04/15/2010 09:49 AM, Robert Muir wrote:
> wrong, it doesnt fix the analyzers problem.
>
> you need to reindex.
>
> On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <earwin@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     On Thu, Apr 15, 2010 at 17:17, Yonik Seeley
>     <yonik@lucidimagination.com <ma...@lucidimagination.com>>
>     wrote:
>     > Seamless online upgrades have their place too... say you are
>     upgrading
>     > one server at a time in a cluster.
>
>     Nothing here that can't be solved with an upgrade tool. Down one
>     server, upgrade index, upgrade sofware, up.
>

Having read the thread, I have a few comments. Much of it is summary.

The current proposal requires re-index on every upgrade to Lucene. Plain 
and simple.

Robert is right about the analyzers.

There are three levels of backward compatibility, though we talk about 2.

First, the index format. IMHO, it is a good thing for a major release to 
be able to read the prior major release's index. And the ability to 
convert it to the current format via optimize is also good. Whatever is 
decided on this thread should take this seriously.

Second, the API. The current mechanism to use deprecations to migrate 
users to a new API is both a blessing and a curse. It is a blessing to 
end users so that they have a clear migration path. It is a curse to 
development because the API is bloated with the old and the new. Further 
it causes unfortunate class naming, with the tendency to migrate away 
from the good name. It is a curse to end users because it can cause 
confusion.

While I like the mechanism of deprecations to migrate me from one 
release to another, I'd be open to another mechanism.  So much effort is 
put into API bw compat that might be better spent on another mechanism. 
E.g. thorough documentation.

Third, the behavior. With respect to analyzers (consisting of tokenizers,
stemmers, stop words, ...), if the token stream changes, the index is no
longer valid. It may appear to work, but it is broken. The token stream
applies not only to the indexed documents, but also to the user-supplied
query. A simple example: if from one release to another 'a' is dropped from
the stop word list, then phrase searches including 'a' won't work, as 'a'
is not in the index. Even a simple, obvious bug fix that changes the stream is bad.
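
The stop word example can be made concrete with a small sketch. It assumes a deliberately simplified analyzer (lowercase, split on whitespace, drop stop words) and a phrase match that just looks for a contiguous run of tokens; analyze and phraseMatches are hypothetical helpers written for this illustration, and real Lucene analysis also tracks positions, which this toy ignores.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Toy model only: not Lucene's analysis chain. */
class StopWordDrift {
    /** Hypothetical analyzer: lowercases, splits on whitespace, drops the given stop words. */
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Phrase match: the query tokens must appear contiguously, in order, among the indexed tokens. */
    static boolean phraseMatches(List<String> indexed, List<String> query) {
        return !query.isEmpty() && Collections.indexOfSubList(indexed, query) >= 0;
    }
}
```

If "Once upon a time" was indexed while 'a' was a stop word, the stored tokens are once, upon, time. Analyzing the query "upon a time" with the same stop list yields upon, time, which matches. Analyzing it with a newer stop list that no longer contains 'a' yields upon, a, time, which cannot match an index that never stored 'a'.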

Another behavior change is an upgrade in Java version. By forcing users 
to go to Java 5 with Lucene 3, the version of Unicode changed. This in 
itself causes a change in some token streams.

With a change to a token stream, the index must be re-created to ensure 
expected behavior. If the original input is no longer available or the 
index cannot be rebuilt for whatever reason, then Lucene should not be
upgraded.

It is my observation, though possibly not correct, that core only has 
rudimentary analysis capabilities, handling English very well. To handle 
other languages well "contrib/analyzers" is required. Until recently it 
did not get much love. There have been many bw compat breaking changes 
(though w/ version one can probably get the prior behavior). IMHO, most 
of contrib/analyzers should be core. My guess is that most non-trivial 
applications will use contrib/analyzers.

The other problem I have is the assumption that re-indexing is feasible and
that indexes are always server-based. Re-index feasibility has already
been well discussed on this thread from a server-side perspective. There
are many client-side applications, like mine, where the index is built
and used on the client's computer. In my scenario the user builds indexes
individually for books. From the index perspective, the sentence is the
Lucene document and the book is the index. Building an index is
voluntary and takes time proportional to the size of the document and
inversely proportional to the power of the computer. Our user base
is those with ancient, underpowered laptops in third-world countries. On
those machines it might take 10 minutes to create an index, and during
that time the machine is fairly unresponsive. There is no opportunity to
"do it in the background."

So what are my choices? (rhetorical) With each new release of my app, 
I'd like to exploit the latest and greatest features of Lucene. And I'm 
going to change my app with features which may or may not be related to 
the use of Lucene. Those latter features are what matter the most to my 
user base. They don't care what technologies are used to do searches. If 
the latest Lucene jar does not let me use Version (or some other 
mechanism) to maintain compatibility with an older index, the user will 
have to re-index. Or I can forgo any future upgrades of Lucene.
Neither is very palatable.

-- DM Smith

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

On Thu, Apr 15, 2010 at 17:49, Robert Muir <rc...@gmail.com> wrote:
> wrong, it doesnt fix the analyzers problem.
> you need to reindex.
>
> On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>
>> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com>
>> wrote:
>> > Seamless online upgrades have their place too... say you are upgrading
>> > one server at a time in a cluster.
>>
>> Nothing here that can't be solved with an upgrade tool. Down one
>> server, upgrade index, upgrade sofware, up.

Couldn't care less about analyzers. There are two kinds of breaks in
index compatibility: soft and hard ones.
A hard break is when your index structure changed, you're using a new
encoding for numeric fields, that kind of thing.
A soft break is when you fixed a stemmer, so now some words are stemmed
differently, that kind of thing.

With a hard break you have to do an offline reindex, and then switch
over. With soft breaks you can sometimes just enqueue all your
documents and reindex online. That breaks a small percentage
of your queries for a short period of time, which is something you can
bear if it saves you from doing manual labor.

I never claimed an index upgrade tool should fix your tokens, offsets
and whatnot.
It is power-user stuff that allows you to turn some hard breaks into
soft breaks, and then decide on your own how to handle the latter.

We can also hit some index format changes that rule out any kind of
automatic conversion. Well, too bad. We'll just skip issuing an index
upgrade tool for that release.
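
One way to read the hard/soft distinction is as a small classification. The sketch below is an interpretation, not anything from Lucene: Break, classify and afterUpgradeTool are hypothetical names, and it assumes a release can be described by just two flags (did the on-disk format change, did the token stream change).

```java
/** Toy classification of compatibility breaks; all names here are hypothetical. */
class BreakKinds {
    enum Break { NONE, SOFT, HARD }

    /** A format change is a hard break; a token-stream change alone is a soft break. */
    static Break classify(boolean formatChanged, boolean tokensChanged) {
        if (formatChanged) {
            return Break.HARD;
        }
        return tokensChanged ? Break.SOFT : Break.NONE;
    }

    /**
     * What remains after running an upgrade tool, if one was issued for the release.
     * The tool converts the format only; it never fixes tokens.
     */
    static Break afterUpgradeTool(boolean formatChanged, boolean tokensChanged, boolean toolAvailable) {
        boolean formatStillOld = formatChanged && !toolAvailable;
        return classify(formatStillOld, tokensChanged);
    }
}
```

Under these assumptions, a release that changes both the format and a stemmer is a hard break without the tool and a soft break with it, and a release for which no tool can be written stays a hard break.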

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

Wrong, it doesn't fix the analyzers problem.

You need to reindex.

On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
> > Seamless online upgrades have their place too... say you are upgrading
> > one server at a time in a cluster.
>
> Nothing here that can't be solved with an upgrade tool. Down one
> server, upgrade index, upgrade sofware, up.
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

Agree.

However, I don't see how Lucene could suddenly change so much that even a
conversion tool is impossible to create.
After all, it's all about terms, positions and frequencies.

Yeah, some additions such as payloads may appear, disappear, or evolve into
something new, but those are on the user's side anyway.

Analyzers are indeed a delicate problem, e.g. when StandardAnalyzer (which
probably 90% of users use) generates a different set of terms for the same
string.
But again, it's a user-side problem.
Nothing stops the user from ripping StandardAnalyzer out of whatever version
of Lucene, adapting it to the newer indexing API, plugging it in and continuing.

I already use more than 50% customized analyzers, my own query parser and so on.
I have JUnit tests for (hopefully) all the cases I need to cover, so if a new
Analyzer misbehaves, it's my responsibility.

Danil.

On Thu, Apr 15, 2010 at 16:56, Grant Ingersoll <gs...@apache.org> wrote:
> I do think major versions should be able to read the previous version index.  Still, even being able to do that is no guarantee that it will produce correct results.  Likewise, even having an upgrade tool is no guarantee that correct results will be produced.  So, my take is that we strive for it, but we all have to realize, and document, that it might not always be possible.  Let's just be practical and pragmatic.  Past history indicates we are capable of, for the most part, reading the prev. version index and upgrading it.  If it can't be done automatically, then we can consider a tool.  If the tool won't work, then we will have to reindex.  It doesn't have to be an all or nothing decision made in the void.  We've always been very practical here about making decisions on problems that are directly facing us, so I would suggest we move forward with the new approach (which I agree makes more sense and is pretty prevalent across a lot of projects) and we take this issue on a case-by-case basis.
>
> -Grant
>
>
> On Apr 15, 2010, at 9:49 AM, Yonik Seeley wrote:
>
>> On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
>>>> Seamless online upgrades have their place too... say you are upgrading
>>>> one server at a time in a cluster.
>>>
>>> Nothing here that can't be solved with an upgrade tool. Down one
>>> server, upgrade index, upgrade sofware, up.
>>
>> It's still harder.  Consider a common scenario where you have one
>> master and the index being replicated to multiple slaves.  One would
>> need to stop replication to an upgraded slave until the master is also
>> upgraded.  Some people can't even stop replication because they use
>> something like a SAN to share the index.
>>
>> I'm just pointing out that there is a lot of value for many people to
>> back compatible indexes... I'm not trying to make any points about
>> when that back combat should be broken.
>>
>> -Yonik
>> Apache Lucene Eurocon 2010
>> 18-21 May 2010 | Prague
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

I do think major versions should be able to read the previous version index.  Still, even being able to do that is no guarantee that it will produce correct results.  Likewise, even having an upgrade tool is no guarantee that correct results will be produced.  So, my take is that we strive for it, but we all have to realize, and document, that it might not always be possible.  Let's just be practical and pragmatic.  Past history indicates we are capable of, for the most part, reading the prev. version index and upgrading it.  If it can't be done automatically, then we can consider a tool.  If the tool won't work, then we will have to reindex.  It doesn't have to be an all or nothing decision made in the void.  We've always been very practical here about making decisions on problems that are directly facing us, so I would suggest we move forward with the new approach (which I agree makes more sense and is pretty prevalent across a lot of projects) and we take this issue on a case-by-case basis.

-Grant

On Apr 15, 2010, at 9:49 AM, Yonik Seeley wrote:

> On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
>>> Seamless online upgrades have their place too... say you are upgrading
>>> one server at a time in a cluster.
>> 
>> Nothing here that can't be solved with an upgrade tool. Down one
>> server, upgrade index, upgrade sofware, up.
> 
> It's still harder.  Consider a common scenario where you have one
> master and the index being replicated to multiple slaves.  One would
> need to stop replication to an upgraded slave until the master is also
> upgraded.  Some people can't even stop replication because they use
> something like a SAN to share the index.
> 
> I'm just pointing out that there is a lot of value for many people to
> back compatible indexes... I'm not trying to make any points about
> when that back combat should be broken.
> 
> -Yonik
> Apache Lucene Eurocon 2010
> 18-21 May 2010 | Prague
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
> 


Re: Proposal about Version API "relaxation"

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
>> Seamless online upgrades have their place too... say you are upgrading
>> one server at a time in a cluster.
>
> Nothing here that can't be solved with an upgrade tool. Down one
> server, upgrade index, upgrade sofware, up.

It's still harder.  Consider a common scenario where you have one
master and the index being replicated to multiple slaves.  One would
need to stop replication to an upgraded slave until the master is also
upgraded.  Some people can't even stop replication because they use
something like a SAN to share the index.
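
The replication scenario can be sketched as a simple readability check. This is a toy model under stated assumptions: canRead is a hypothetical helper, versions and formats are plain integers, and "back compatible" is taken to mean that software can read the format of the previous major version only.

```java
/** Toy model of index readability across versions; not a Lucene API. */
class ReplicationSketch {
    /**
     * Can software at readerVersion open an index written in indexFormat?
     * Assumes software never reads a format newer than itself.
     */
    static boolean canRead(int readerVersion, int indexFormat, boolean readsPreviousFormat) {
        if (indexFormat == readerVersion) {
            return true;
        }
        return readsPreviousFormat && indexFormat == readerVersion - 1;
    }
}
```

With a master still on version 3 replicating to a slave already upgraded to version 4, the slave can keep serving the replicated index only if it reads the previous format. Without that, replication to the upgraded slave has to stop until the master is upgraded too.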

I'm just pointing out that there is a lot of value for many people in
back-compatible indexes... I'm not trying to make any points about
when that back compat should be broken.

-Yonik
Apache Lucene Eurocon 2010
18-21 May 2010 | Prague


Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
> Seamless online upgrades have their place too... say you are upgrading
> one server at a time in a cluster.

Nothing here that can't be solved with an upgrade tool. Take one
server down, upgrade the index, upgrade the software, bring it back up.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Yonik Seeley <yo...@lucidimagination.com>.

Seamless online upgrades have their place too... say you are upgrading
one server at a time in a cluster.

-Yonik
Apache Lucene Eurocon 2010
18-21 May 2010 | Prague

On Thu, Apr 15, 2010 at 8:42 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
> I like the idea of index conversion tool over silent online upgrade
> because it is
> 1. controllable - with online upgrade you never know for sure when
> your index is completely upgraded, even optimize() won't help here, as
> it is a noop for already-optimized indexes
> 2. way easier to write - as flex shows, index format changes are
> accompanied by API changes. Here you don't have to emulate new APIs
> over old structures (can be impossible for some cases?), you only have
> to, well, convert.
>
> On Thu, Apr 15, 2010 at 16:32, Danil ŢORIN <to...@gmail.com> wrote:
>> All I ask is a way to migrate existing indexes to newer format.
>>
>>
>> On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:
>>>
>>> its open source, if you feel this way, you can put the work to add
>>> features to some version branch from trunk in a backwards compatible way.
>>> Then this branch can have a backwards-compatible minor release with new
>>> features, but nothing ground-breaking.
>>> but this kinda stuff shouldnt hinder development on trunk.
>>>
>>> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:
>>>>
>>>> Sometimes it's REALLY impossible to reindex, or has absolutely
>>>> prohibitive cost to do in a running production system (i can't shut it down
>>>> for maintainance, so i need a lot of hardware to reindex ~5 billion
>>>> documents, i have no idea what are the costs to retrieve that data all over
>>>> again, but i estimate it to be quite a lot)
>>>> And providing a way to migrate existing indexes to new lucene is crucial
>>>> from my point of view.
>>>> I don't care what this way is: calling optimize() with newer lucene or
>>>> running some tool that takes 5 days, it's ok with me.
>>>> Just don't put me through full reindexing as I really don't have all that
>>>> data anymore.
>>>> It's not my data, i just receive it from clients, and provide a search
>>>> interface.
>>>> It took years to build those indexes, rebuilding is not an option, and
>>>> staying with old lucene forever just sucks.
>>>>
>>>> Danil.
>>>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>>>>>
>>>>>
>>>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>>>>>
>>>>>> Well ... I must say that I completely disagree w/ dropping index
>>>>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>>>>> of TBs of content because of version upgrades. Such a decision is key to
>>>>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>>>>> Lucene is a content store or not - content is stored on other systems, I
>>>>>> agree. But that doesn't mean reindexing it is tolerable.
>>>>>>
>>>>>
>>>>> I don't understand how its helpful to do a MAJOR version upgrade without
>>>>> reindexing... what in the world do you stand to gain from that?
>>>>> The idea here, is that development can be free of such hassles.
>>>>> Development should be this way.
>>>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>>>>> reindex, and are willing to do the work to port it back to Version 3 in a
>>>>> completely backwards compatible way, then under this new scheme it can
>>>>> happen.
>>>>>
>>>>> --
>>>>> Robert Muir
>>>>> rcmuir@gmail.com
>>>>
>>>
>>>
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>
>>
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I like the idea of an index conversion tool over a silent online upgrade
because it is
1. controllable - with an online upgrade you never know for sure when
your index is completely upgraded; even optimize() won't help here, as
it is a no-op for already-optimized indexes
2. way easier to write - as flex shows, index format changes are
accompanied by API changes. Here you don't have to emulate new APIs
over old structures (which may be impossible in some cases?), you only
have to, well, convert.

On Thu, Apr 15, 2010 at 16:32, Danil ŢORIN <to...@gmail.com> wrote:
> All I ask is a way to migrate existing indexes to newer format.
>
>
> On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:
>>
>> its open source, if you feel this way, you can put the work to add
>> features to some version branch from trunk in a backwards compatible way.
>> Then this branch can have a backwards-compatible minor release with new
>> features, but nothing ground-breaking.
>> but this kinda stuff shouldnt hinder development on trunk.
>>
>> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:
>>>
>>> Sometimes it's REALLY impossible to reindex, or has absolutely
>>> prohibitive cost to do in a running production system (i can't shut it down
>>> for maintainance, so i need a lot of hardware to reindex ~5 billion
>>> documents, i have no idea what are the costs to retrieve that data all over
>>> again, but i estimate it to be quite a lot)
>>> And providing a way to migrate existing indexes to new lucene is crucial
>>> from my point of view.
>>> I don't care what this way is: calling optimize() with newer lucene or
>>> running some tool that takes 5 days, it's ok with me.
>>> Just don't put me through full reindexing as I really don't have all that
>>> data anymore.
>>> It's not my data, i just receive it from clients, and provide a search
>>> interface.
>>> It took years to build those indexes, rebuilding is not an option, and
>>> staying with old lucene forever just sucks.
>>>
>>> Danil.
>>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>>>>
>>>>
>>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>>>>
>>>>> Well ... I must say that I completely disagree w/ dropping index
>>>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>>>> of TBs of content because of version upgrades. Such a decision is key to
>>>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>>>> Lucene is a content store or not - content is stored on other systems, I
>>>>> agree. But that doesn't mean reindexing it is tolerable.
>>>>>
>>>>
>>>> I don't understand how its helpful to do a MAJOR version upgrade without
>>>> reindexing... what in the world do you stand to gain from that?
>>>> The idea here, is that development can be free of such hassles.
>>>> Development should be this way.
>>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>>>> reindex, and are willing to do the work to port it back to Version 3 in a
>>>> completely backwards compatible way, then under this new scheme it can
>>>> happen.
>>>>
>>>> --
>>>> Robert Muir
>>>> rcmuir@gmail.com
>>>
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

All I ask is a way to migrate existing indexes to a newer format.


On Thu, Apr 15, 2010 at 15:21, Robert Muir <rc...@gmail.com> wrote:

> its open source, if you feel this way, you can put the work to add features
> to some version branch from trunk in a backwards compatible way.
>
> Then this branch can have a backwards-compatible minor release with new
> features, but nothing ground-breaking.
>
> but this kinda stuff shouldnt hinder development on trunk.
>
>
> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:
>
>> Sometimes it's REALLY impossible to reindex, or has absolutely prohibitive
>> cost to do in a running production system (i can't shut it down for
>> maintainance, so i need a lot of hardware to reindex ~5 billion documents, i
>> have no idea what are the costs to retrieve that data all over again, but i
>> estimate it to be quite a lot)
>>
>> And providing a way to migrate existing indexes to new lucene is crucial
>> from my point of view.
>>
>> I don't care what this way is: calling optimize() with newer lucene or
>> running some tool that takes 5 days, it's ok with me.
>>
>> Just don't put me through full reindexing as I really don't have all that
>> data anymore.
>> It's not my data, i just receive it from clients, and provide a search
>> interface.
>>
>> It took years to build those indexes, rebuilding is not an option, and
>> staying with old lucene forever just sucks.
>>
>> Danil.
>>
>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>>
>>>> Well ... I must say that I completely disagree w/ dropping index
>>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>>> of TBs of content because of version upgrades. Such a decision is key to
>>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>>> Lucene is a content store or not - content is stored on other systems, I
>>>> agree. But that doesn't mean reindexing it is tolerable.
>>>>
>>>>
>>> I don't understand how its helpful to do a MAJOR version upgrade without
>>> reindexing... what in the world do you stand to gain from that?
>>>
>>> The idea here, is that development can be free of such hassles.
>>> Development should be this way.
>>>
>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>>> reindex, and are willing to do the work to port it back to Version 3 in a
>>> completely backwards compatible way, then under this new scheme it can
>>> happen.
>>>
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>>
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

I can live w/ that Earwin ... I prefer the ongoing upgrades still, but I
won't hold off the back-compat policy change vote because of that.

Shai

On Thu, Apr 15, 2010 at 3:30 PM, Earwin Burrfoot <ea...@gmail.com> wrote:

> I think an index upgrade tool is okay?
> While you still definetly have to code it, things like "if idxVer==m
> doOneStuff elseif idxVer==n doOtherStuff else blowUp" are kept away
> from lucene innards and we all profit?
>
> On Thu, Apr 15, 2010 at 16:21, Robert Muir <rc...@gmail.com> wrote:
> > its open source, if you feel this way, you can put the work to add
> features
> > to some version branch from trunk in a backwards compatible way.
> > Then this branch can have a backwards-compatible minor release with new
> > features, but nothing ground-breaking.
> > but this kinda stuff shouldnt hinder development on trunk.
> >
> > On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:
> >>
> >> Sometimes it's REALLY impossible to reindex, or has absolutely
> prohibitive
> >> cost to do in a running production system (i can't shut it down for
> >> maintainance, so i need a lot of hardware to reindex ~5 billion
> documents, i
> >> have no idea what are the costs to retrieve that data all over again,
> but i
> >> estimate it to be quite a lot)
> >> And providing a way to migrate existing indexes to new lucene is crucial
> >> from my point of view.
> >> I don't care what this way is: calling optimize() with newer lucene or
> >> running some tool that takes 5 days, it's ok with me.
> >> Just don't put me through full reindexing as I really don't have all
> that
> >> data anymore.
> >> It's not my data, i just receive it from clients, and provide a search
> >> interface.
> >> It took years to build those indexes, rebuilding is not an option, and
> >> staying with old lucene forever just sucks.
> >>
> >> Danil.
> >> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
> >>>
> >>>
> >>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
> >>>>
> >>>> Well ... I must say that I completely disagree w/ dropping index
> >>>> structure back-support. Our customers will simply not hear of
> reindexing 10s
> >>>> of TBs of content because of version upgrades. Such a decision is key
> to
> >>>> Lucene adoption in large-scale projects. It's entirely not about
> whether
> >>>> Lucene is a content store or not - content is stored on other systems,
> I
> >>>> agree. But that doesn't mean reindexing it is tolerable.
> >>>>
> >>>
> >>> I don't understand how its helpful to do a MAJOR version upgrade
> without
> >>> reindexing... what in the world do you stand to gain from that?
> >>> The idea here, is that development can be free of such hassles.
> >>> Development should be this way.
> >>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
> >>> reindex, and are willing to do the work to port it back to Version 3 in
> a
> >>> completely backwards compatible way, then under this new scheme it can
> >>> happen.
> >>>
> >>> --
> >>> Robert Muir
> >>> rcmuir@gmail.com
> >>
> >
> >
> >
> > --
> > Robert Muir
> > rcmuir@gmail.com
> >
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

I think an index upgrade tool is okay?
While you still definitely have to code it, things like "if idxVer==m
doOneStuff elseif idxVer==n doOtherStuff else blowUp" are kept away
from lucene innards and we all profit?
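
A minimal sketch of the dispatch described above, living in a standalone tool rather than in the core. The format numbers 2 through 4 and the names IndexUpgradeTool and planUpgrade are assumptions made for illustration; none of this is an existing Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone upgrader; not part of Lucene.
final class IndexUpgradeTool {
    // Assumed newest on-disk format for this sketch.
    static final int CURRENT_FORMAT = 4;

    /** Returns the single-step conversions needed to reach CURRENT_FORMAT. */
    static List<String> planUpgrade(int idxVer) {
        List<String> steps = new ArrayList<>();
        int v = idxVer;
        while (v != CURRENT_FORMAT) {
            switch (v) {
                case 2: // "doOneStuff"
                    steps.add("2->3");
                    v = 3;
                    break;
                case 3: // "doOtherStuff"
                    steps.add("3->4");
                    v = 4;
                    break;
                default: // "blowUp": a format this tool knows nothing about
                    throw new IllegalArgumentException("unsupported index format: " + v);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(planUpgrade(2));
    }
}
```

The version checks stay in one place, so the reader and writer code only ever has to handle the current format.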

On Thu, Apr 15, 2010 at 16:21, Robert Muir <rc...@gmail.com> wrote:
> its open source, if you feel this way, you can put the work to add features
> to some version branch from trunk in a backwards compatible way.
> Then this branch can have a backwards-compatible minor release with new
> features, but nothing ground-breaking.
> but this kinda stuff shouldnt hinder development on trunk.
>
> On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:
>>
>> Sometimes it's REALLY impossible to reindex, or has absolutely prohibitive
>> cost to do in a running production system (i can't shut it down for
>> maintainance, so i need a lot of hardware to reindex ~5 billion documents, i
>> have no idea what are the costs to retrieve that data all over again, but i
>> estimate it to be quite a lot)
>> And providing a way to migrate existing indexes to new lucene is crucial
>> from my point of view.
>> I don't care what this way is: calling optimize() with newer lucene or
>> running some tool that takes 5 days, it's ok with me.
>> Just don't put me through full reindexing as I really don't have all that
>> data anymore.
>> It's not my data, i just receive it from clients, and provide a search
>> interface.
>> It took years to build those indexes, rebuilding is not an option, and
>> staying with old lucene forever just sucks.
>>
>> Danil.
>> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>>>
>>>
>>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>>>
>>>> Well ... I must say that I completely disagree w/ dropping index
>>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>>> of TBs of content because of version upgrades. Such a decision is key to
>>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>>> Lucene is a content store or not - content is stored on other systems, I
>>>> agree. But that doesn't mean reindexing it is tolerable.
>>>>
>>>
>>> I don't understand how its helpful to do a MAJOR version upgrade without
>>> reindexing... what in the world do you stand to gain from that?
>>> The idea here, is that development can be free of such hassles.
>>> Development should be this way.
>>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>>> reindex, and are willing to do the work to port it back to Version 3 in a
>>> completely backwards compatible way, then under this new scheme it can
>>> happen.
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

It's open source; if you feel this way, you can put in the work to add features
to some version branch from trunk in a backwards-compatible way.

Then this branch can have a backwards-compatible minor release with new
features, but nothing ground-breaking.

But this kind of stuff shouldn't hinder development on trunk.


On Thu, Apr 15, 2010 at 8:17 AM, Danil ŢORIN <to...@gmail.com> wrote:

> Sometimes it's REALLY impossible to reindex, or has absolutely prohibitive
> cost to do in a running production system (i can't shut it down for
> maintainance, so i need a lot of hardware to reindex ~5 billion documents, i
> have no idea what are the costs to retrieve that data all over again, but i
> estimate it to be quite a lot)
>
> And providing a way to migrate existing indexes to new lucene is crucial
> from my point of view.
>
> I don't care what this way is: calling optimize() with newer lucene or
> running some tool that takes 5 days, it's ok with me.
>
> Just don't put me through full reindexing as I really don't have all that
> data anymore.
> It's not my data, i just receive it from clients, and provide a search
> interface.
>
> It took years to build those indexes, rebuilding is not an option, and
> staying with old lucene forever just sucks.
>
> Danil.
>
> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>
>>
>>
>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Well ... I must say that I completely disagree w/ dropping index
>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>> of TBs of content because of version upgrades. Such a decision is key to
>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>> Lucene is a content store or not - content is stored on other systems, I
>>> agree. But that doesn't mean reindexing it is tolerable.
>>>
>>>
>> I don't understand how its helpful to do a MAJOR version upgrade without
>> reindexing... what in the world do you stand to gain from that?
>>
>> The idea here, is that development can be free of such hassles.
>> Development should be this way.
>>
>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>> reindex, and are willing to do the work to port it back to Version 3 in a
>> completely backwards compatible way, then under this new scheme it can
>> happen.
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Thanks Danil - you reminded me of another reason why reindexing is
impossible - fetching the data, even if it's available is too damn costly.

Robert, I think you're driven by Analyzers changes ... been too much around
them I'm afraid :).

A major version upgrade is a move to Java 1.5 for example. I can do that,
and I don't see why I need to reindex my data because of that. And I simply
don't buy that "do this work on your own" ... people can take a snapshot of
the code, maintain it separately and you'll never hear back from them. Who
benefits - neither !
It's open source - true, but it's way past the "Hey look, I'm a new open
source project w/ a dozen users, I can do whatever I want". Lucene is a
respected open source project, w/ serious adoption and deployments. People
trust the select few committers here to do it right for them, so they
don't need to invest the time and resources in developing core IR stuff. And
now you're pushing a "do it yourself" approach? I simply don't get or buy
it.

When were you stuck w/ maintaining backwards compatibility because the index
structure changed? I bet not so many of us, or shall I say just the few Mikes
out there? So how hard is it to require such back-compat support? I
wholeheartedly agree that we shouldn't keep back-compat on Analyzer changes,
nor on bugs such as the one which changed the position of the field from -1 to
0 (a while ago - don't remember the exact details).

Shai

On Thu, Apr 15, 2010 at 3:17 PM, Danil ŢORIN <to...@gmail.com> wrote:

> Sometimes it's REALLY impossible to reindex, or has absolutely prohibitive
> cost to do in a running production system (i can't shut it down for
> maintainance, so i need a lot of hardware to reindex ~5 billion documents, i
> have no idea what are the costs to retrieve that data all over again, but i
> estimate it to be quite a lot)
>
> And providing a way to migrate existing indexes to new lucene is crucial
> from my point of view.
>
> I don't care what this way is: calling optimize() with newer lucene or
> running some tool that takes 5 days, it's ok with me.
>
> Just don't put me through full reindexing as I really don't have all that
> data anymore.
> It's not my data, i just receive it from clients, and provide a search
> interface.
>
> It took years to build those indexes, rebuilding is not an option, and
> staying with old lucene forever just sucks.
>
> Danil.
>
> On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:
>
>>
>>
>> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Well ... I must say that I completely disagree w/ dropping index
>>> structure back-support. Our customers will simply not hear of reindexing 10s
>>> of TBs of content because of version upgrades. Such a decision is key to
>>> Lucene adoption in large-scale projects. It's entirely not about whether
>>> Lucene is a content store or not - content is stored on other systems, I
>>> agree. But that doesn't mean reindexing it is tolerable.
>>>
>>>
>> I don't understand how its helpful to do a MAJOR version upgrade without
>> reindexing... what in the world do you stand to gain from that?
>>
>> The idea here, is that development can be free of such hassles.
>> Development should be this way.
>>
>> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
>> reindex, and are willing to do the work to port it back to Version 3 in a
>> completely backwards compatible way, then under this new scheme it can
>> happen.
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by Danil ŢORIN <to...@gmail.com>.

Sometimes it's REALLY impossible to reindex, or it has an absolutely prohibitive
cost in a running production system (I can't shut it down for
maintenance, so I need a lot of hardware to reindex ~5 billion documents; I
have no idea what it would cost to retrieve that data all over again, but I
estimate it to be quite a lot).

And providing a way to migrate existing indexes to new lucene is crucial
from my point of view.

I don't care what this way is: calling optimize() with newer lucene or
running some tool that takes 5 days, it's ok with me.

Just don't put me through full reindexing as I really don't have all that
data anymore.
It's not my data, i just receive it from clients, and provide a search
interface.

It took years to build those indexes, rebuilding is not an option, and
staying with old lucene forever just sucks.

Danil.

On Thu, Apr 15, 2010 at 14:57, Robert Muir <rc...@gmail.com> wrote:

>
>
> On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:
>
>> Well ... I must say that I completely disagree w/ dropping index structure
>> back-support. Our customers will simply not hear of reindexing 10s of TBs of
>> content because of version upgrades. Such a decision is key to Lucene
>> adoption in large-scale projects. It's entirely not about whether Lucene is
>> a content store or not - content is stored on other systems, I agree. But
>> that doesn't mean reindexing it is tolerable.
>>
>>
> I don't understand how its helpful to do a MAJOR version upgrade without
> reindexing... what in the world do you stand to gain from that?
>
> The idea here, is that development can be free of such hassles. Development
> should be this way.
>
> If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
> reindex, and are willing to do the work to port it back to Version 3 in a
> completely backwards compatible way, then under this new scheme it can
> happen.
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <se...@gmail.com> wrote:

> Well ... I must say that I completely disagree w/ dropping index structure
> back-support. Our customers will simply not hear of reindexing 10s of TBs of
> content because of version upgrades. Such a decision is key to Lucene
> adoption in large-scale projects. It's entirely not about whether Lucene is
> a content store or not - content is stored on other systems, I agree. But
> that doesn't mean reindexing it is tolerable.
>
>
I don't understand how it's helpful to do a MAJOR version upgrade without
reindexing... what in the world do you stand to gain from that?

The idea here is that development can be free of such hassles. Development
should be this way.

If you, Shai, need some feature X.Y.Z from Version 4 and don't want to
reindex, and are willing to do the work to port it back to Version 3 in a
completely backwards compatible way, then under this new scheme it can
happen.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Well ... I must say that I completely disagree w/ dropping index structure
back-support. Our customers will simply not hear of reindexing 10s of TBs of
content because of version upgrades. Such a decision is key to Lucene
adoption in large-scale projects. It's entirely not about whether Lucene is
a content store or not - content is stored on other systems, I agree. But
that doesn't mean reindexing it is tolerable.

Up until now, Lucene migrated my segments gradually, and before I upgraded
from X+1 to X+2 I could run optimize() to ensure my index will be readable
by X+2. I don't think I can myself agree to it, let alone convince all the
stakeholders in my company who adopt Lucene today in numerous projects, to
let go of such capability. We've been there before (requiring reindexing on
version upgrades) w/ some offerings and customers simply didn't like it and
were forced to use an enterprise-class search engine which offered less (and
didn't use Lucene, up until recently !). Until we moved to Lucene ...

What's Solr's take on it?

I differentiate between structural changes and runtime changes. I, myself,
don't mind if we let go of back-compat support for runtime changes, such as
those generated by analyzers. For a couple of reasons, the most important
ones are (1) these are not so frequent (but so is index structural change)
and (2) that's a decision I, as the application developer, make - using or
not a newer version of an Analyzer. I don't mind working hard to make a 2.x
Analyzer version work in the 3.x world, but I cannot make a 2.x index
readable by a 3.x Lucene jar, if the latter doesn't support it. That's the
key difference, in my mind, between the two. I can choose not to upgrade at
all to a newer analyzer version ... but I don't want to be forced to stay w/
older Lucene versions and features because of that ... well people might say
that it's not Lucene's problem, but I beg to differ. Lucene benefits from
wider and faster adoption and we rely on new features to be adopted quickly.
That might be jeopardized if we let go of that strong capability, IMO.

What we can do is provide an index migration tool ... but personally I don't
know what's the difference between that and gradually migrating segments as
they are merged, code-wise. I mean - it has to be the same code. Only an
index migration tool may take days to complete on a very large index, while
the ongoing migration takes ~0 time when you come to upgrade to a newer
Lucene release.
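
To illustrate the "it has to be the same code" point, here is a rough sketch in which both the ongoing merge path and a one-shot migration tool call the same rewrite routine. Segment, rewrite, merge and migrateAll are hypothetical names and the format numbers are made up; this is a toy model, not how Lucene's merging is actually implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of segments tagged with an on-disk format number (an assumption).
final class SegmentMigrationSketch {
    static final int CURRENT_FORMAT = 3;

    static final class Segment {
        final int format;
        final int docCount;

        Segment(int format, int docCount) {
            this.format = format;
            this.docCount = docCount;
        }
    }

    // Shared conversion: read inputs in whatever format, write one current-format segment.
    static Segment rewrite(List<Segment> inputs) {
        int docs = 0;
        for (Segment s : inputs) {
            docs += s.docCount;
        }
        return new Segment(CURRENT_FORMAT, docs);
    }

    // Ongoing path: a normal merge upgrades old segments as a side effect.
    static Segment merge(List<Segment> toMerge) {
        return rewrite(toMerge);
    }

    // Tool path: walk the whole index up front and rewrite every old segment.
    static List<Segment> migrateAll(List<Segment> index) {
        List<Segment> out = new ArrayList<>();
        for (Segment s : index) {
            out.add(s.format == CURRENT_FORMAT ? s : rewrite(Collections.singletonList(s)));
        }
        return out;
    }
}
```

In this model the only difference between the two paths is when the cost is paid: spread across normal merges, or all at once before the upgrade.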

And the note about Terrier requiring reindexing ... well I can't say it's a
strength of it but a damn big weakness IMO.

About the release pace, I don't think we can suddenly release every 2 years
... makes people think the project is stuck. And some out there are not so
fond of using a 'trunk' version and release it w/ their products because
trunk is perceived as ongoing development (which it is) and thus less
stable, or is likely to change and most importantly harder to maintain (as
the consumer). So I still think we should release more often than not.

That's why I wanted to differentiate X and Y, but I don't mind if we release
just X ... if that's so important to people. BTW Mike, Eclipse's releases
are like Lucene, and in fact I don't know of so many projects that just
release X ... many of them seem to release X.Y.

I don't understand why we're treating this as an "all or nothing" thing. We
can let go of API back-compat, that clearly has no effect on index structure
and content. We can even let go of index runtime changes for all I care. But
I simply don't think we can let go of index structure back-support.

Shai

On Thu, Apr 15, 2010 at 1:12 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> 2010/4/15 Shai Erera <se...@gmail.com>:
>
> > One way is to define 'major' as X and minor X.Y, and another is to define
> major as 'X.Y' and minor as 'X.Y.Z'. I prefer the latter but don't have any
> strong feelings against the former.
>
> I prefer X.Y, ie, changes to Y only is a minor release (mostly bug
> fixes but maybe small features); changes to X is a major release.  I
> think that's more "standard", ie, people will generally grok that 3.3
> -> 4.0 is a major change but 3.3 -> 3.4 isn't.
>
> So this proposal would change how Lucene releases are numbered.  Ie,
> the next release would be 4.0.  Bug fixes / small features would then
> be 4.1.
>
> > Index back compat should be maintained between major releases, like it is
> today, STRUCTURE-wise.
>
> No... in the proposal, you must re-index on upgrading to the next
> major release (3.x -> 4.0).
>
> I think supporting old indexes, badly (what we do today) is not a
> great solution.  EG on upgrading to 3.1 you'll immediately see a
> search perf hit since the flex emulation layer is running.  It's a
> trap.
>
> It's this freedom, I think, that'd let us drop Version entirely.  It's
> the back-compat of the index that is the major driver for having
> Version today (eg so that the analyzers can produce tokens matching
> your old index).
>
> EG Terrier seems to have the same requirement -- note the bold "All
> indexes must be rebuilt":
>
>  http://terrier.org/docs/current/whats_new.html
>
> Also, Lucene isn't a primary store (like a filesytem or a database).
> We expect that your "true" content still lives somewhere else.  So why
> do we go to such great lengths to keep the index format for so
> long...?
>
> > BTW, w/ all that - does it mean 'backwards' can be dropped, or at least
> test-backwards activated only on a branch which we decide needs it? That'll
> be really great.
>
> I think the stable branches (2.x, 3.x) would have backwards tests
> created the moment they are branched, to make sure as we fix bugs /
> backport minor features we don't break back compat, along that branch.
>
> I don't think we need the .Z part of a release numbering -- our
> numbers would look like most other software projects.  3.0 is a major
> release, 3.1, 3.2, 3.3 fix bugs / add minor features, etc.
>
> If flex were done in this world I would've finished it alot faster!  A
> huge amount of time went into the cross back compat emulation layers
> (pre-flex APIs and pre-flex index).
>
> > Also, we will still need to maintain the Backwards section in CHANGES (or
> move it to API Changes), to help people upgrade from release to release.
>
> I think we'd create a migration guide to explain how apps migrate to
> the next major release (this is what other projects do), eg like this:
>
>  http://community.jboss.org/wiki/Hibernate3MigrationGuides#A42
>
> > Unless you're telling me we'll start releasing major releases more often?
>
> I think this is mostly orthogonal?  We could still do major releases
> frequently or rarely with this model... however, it would give us more
> freedom to do major releases frequently (vs today where every major
> release sets a scary back-compat-burden stake in the ground).
>
> > I don't see why would anyone releases a 3.x after 4.0 is out unless
> someone really wants to work hard on maintaining back-compat of some
> features
>
> I think the minor releases on the stable branch (3.1, 3.2, 3.3) would
> be mostly bug fixes, but maybe also minor features if
> contributor's/developer's had the itch to make them available on the
> stable (3.x) branch.  How much dev happens on the stable branch can be
> largely determined by itch...
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

2010/4/15 Shai Erera <se...@gmail.com>:

> One way is to define 'major' as X and minor X.Y, and another is to define major as 'X.Y' and minor as 'X.Y.Z'. I prefer the latter but don't have any strong feelings against the former.

I prefer X.Y, ie, changes to Y only is a minor release (mostly bug
fixes but maybe small features); changes to X is a major release. I
think that's more "standard", ie, people will generally grok that 3.3
-> 4.0 is a major change but 3.3 -> 3.4 isn't.

So this proposal would change how Lucene releases are numbered. Ie,
the next release would be 4.0. Bug fixes / small features would then
be 4.1.

> Index back compat should be maintained between major releases, like it is today, STRUCTURE-wise.

No... in the proposal, you must re-index on upgrading to the next
major release (3.x -> 4.0).
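
A small sketch of the X.Y rule as stated so far: a change in X is a major release and, under the proposal, requires re-indexing; a change in Y only is a minor release. ReleaseNumbering, classify and mustReindex are illustrative names chosen for this example, not anything in Lucene.

```java
// Hypothetical helper that classifies an upgrade between two "X.Y" release numbers.
final class ReleaseNumbering {
    enum Change { NONE, MINOR, MAJOR }

    // Parses "X.Y" into {X, Y}; anything else is rejected.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected X.Y, got: " + version);
        }
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    static Change classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) {
            return Change.MAJOR; // X changed, e.g. 3.3 -> 4.0
        }
        if (a[1] != b[1]) {
            return Change.MINOR; // only Y changed, e.g. 3.3 -> 3.4
        }
        return Change.NONE;
    }

    // Under the proposal, only a major upgrade forces a re-index.
    static boolean mustReindex(String from, String to) {
        return classify(from, to) == Change.MAJOR;
    }

    public static void main(String[] args) {
        System.out.println(classify("3.3", "4.0") + " " + mustReindex("3.3", "4.0"));
        System.out.println(classify("3.3", "3.4") + " " + mustReindex("3.3", "3.4"));
    }
}
```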

I think supporting old indexes, badly (what we do today) is not a
great solution. EG on upgrading to 3.1 you'll immediately see a
search perf hit since the flex emulation layer is running. It's a
trap.

It's this freedom, I think, that'd let us drop Version entirely. It's
the back-compat of the index that is the major driver for having
Version today (eg so that the analyzers can produce tokens matching
your old index).

EG Terrier seems to have the same requirement -- note the bold "All
indexes must be rebuilt":

http://terrier.org/docs/current/whats_new.html

Also, Lucene isn't a primary store (like a filesystem or a database).
We expect that your "true" content still lives somewhere else. So why
do we go to such great lengths to keep the index format for so
long...?

> BTW, w/ all that - does it mean 'backwards' can be dropped, or at least test-backwards activated only on a branch which we decide needs it? That'll be really great.

I think the stable branches (2.x, 3.x) would have backwards tests
created the moment they are branched, to make sure as we fix bugs /
backport minor features we don't break back compat, along that branch.

I don't think we need the .Z part of a release numbering -- our
numbers would look like most other software projects. 3.0 is a major
release, 3.1, 3.2, 3.3 fix bugs / add minor features, etc.

If flex were done in this world I would've finished it a lot faster! A
huge amount of time went into the cross back compat emulation layers
(pre-flex APIs and pre-flex index).

> Also, we will still need to maintain the Backwards section in CHANGES (or move it to API Changes), to help people upgrade from release to release.

I think we'd create a migration guide to explain how apps migrate to
the next major release (this is what other projects do), eg like this:

http://community.jboss.org/wiki/Hibernate3MigrationGuides#A42

> Unless you're telling me we'll start releasing major releases more often?

I think this is mostly orthogonal? We could still do major releases
frequently or rarely with this model... however, it would give us more
freedom to do major releases frequently (vs today where every major
release sets a scary back-compat-burden stake in the ground).

> I don't see why would anyone releases a 3.x after 4.0 is out unless someone really wants to work hard on maintaining back-compat of some features

I think the minor releases on the stable branch (3.1, 3.2, 3.3) would
be mostly bug fixes, but maybe also minor features if
contributors/developers had the itch to make them available on the
stable (3.x) branch. How much dev happens on the stable branch can be
largely determined by itch...

Mike


Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Well ... I think that version numbers mean more than we'd like them to mean,
as people perceive them. Let's discuss the format X.Y.Z:

When X is changed, it should mean something 'big' happened - index structure
has changed (e.g. the flexible scoring work), new Java version supported
(Java 1.6) and even stuff like 'flex' which includes statements like "if you
don't want your app to slow down, consider reindexing". Such things signal a
major change in Lucene, sometimes even just policy changes (Java version
supported) and therefore I think we should reserve the ability to bump X
when such things happen.

Another thing is the index structure back-compat policy - today Lucene
supports X-1 index structure, but during upgrades of X.Y versions, your
segments are gradually migrated. Eventually, when you upgrade to 4.0 you
should know whether you still have a 2.x index, and call optimize() if you're
not sure it has been fully migrated yet (after you've upgraded to 3.x).
If we start bumping up 'X' too often, we'll either need to change the X-1
policy to X-N, which will just complicate matters for users. Or we'll keep
the X-1 policy, but people will need to call optimize more frequently.
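
As a rough illustration of the X-1 vs. X-N trade-off, the check below models an index as a list of per-segment major versions and asks whether a reader of a given major version could open it. BackCompatPolicy and canOpen are hypothetical names, and the per-segment version model is a simplifying assumption.

```java
// Toy model of the "reader X supports segments back to X-N" policy.
final class BackCompatPolicy {
    /**
     * @param readerMajor   major version of the Lucene reading the index
     * @param segmentMajors major version that wrote each segment
     * @param back          how many major versions back are supported (1 for X-1)
     */
    static boolean canOpen(int readerMajor, int[] segmentMajors, int back) {
        for (int m : segmentMajors) {
            if (m > readerMajor || m < readerMajor - back) {
                return false; // one unreadable segment makes the whole index unreadable
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A 2.x segment left behind blocks a 4.0 reader under X-1 ...
        System.out.println(canOpen(4, new int[] {2, 3}, 1));
        // ... but not once every segment has been rewritten under 3.x.
        System.out.println(canOpen(4, new int[] {3, 3}, 1));
    }
}
```

With back = 1, the only way from the first case to the second is to rewrite the old segment before upgrading, which is the role optimize() plays in the paragraph above.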

Y should change on a regular basis, and no back-compat API-wise or index
runtime-wise is guaranteed. So the Collector and per-segment searches in 2.9
could go w/o deprecating tons of API, so is the TokenStream work. Changes to
Analyzer's runtime capabilities will also be allowed between Y revisions.

Z should change when bugs are fixed, or when features are backported.
Really ... we rarely fix bugs on a released Y branch, and I don't expect
tons of features will be backported to a Y branch (to create a Z+1 release).
Therefore this should not confuse anyone.

So all I'm saying is that instead of increasing X whenever the API, index
structure or runtime behavior has changed, I'm simply proposing to
differentiate really "major" changes from those that just say 'we're
not back-compat compliant'.

But above all, I'd like to see this change happening, so if I need to
surrender to the X vs. X+Y approach, I will. Just think it will create some
confusion.

BTW, w/ all that - does it mean 'backwards' can be dropped, or at least
test-backwards activated only on a branch which we decide needs it? That'll
be really great.

Shai

On Thu, Apr 15, 2010 at 10:24 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> We can remove Version, because all incompatible changes go straight to
> a new major release, which we release more often, yes.
> 3.x is going to be released after 4.0 if bugs are found and fixed, or
> if people ask to backport some (minor?) features, and some dev has
> time for this.
>
> The question of what to call major release in X.Y.Z scheme - X or Y,
> is there, but immaterial :) I think it's okay to settle with X.Y, we
> have major releases and bugfixes, what that third number can be used
> for?
>
> On Thu, Apr 15, 2010 at 09:29, Shai Erera <se...@gmail.com> wrote:
> > So then I don't understand this:
> >
> > {quote}
> > * A major release always bumps the major release number (2.x ->
> >    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
> >    releases along that branch
> >
> > * There is no back compat across major releases (index nor APIs),
> >    but full back compat within branches.
> >
> > {quote}
> >
> > What's different than what's done today? How can we remove Version in
> that
> > world, if we need to maintain full back-compat between 3.1 and 3.2, index
> > and API-wise? We'll still need to deprecate and come up w/ new classes
> every
> > time, and we'll still need to maintain runtime changes back-compat.
> >
> > Unless you're telling me we'll start releasing major releases more often?
> > Well ... then we're saying the same thing, only I think that instead of
> > releasing 4, 5, 6, 7, 8 every 6 months, we can release 3.1, 3.2, 3.5 ...
> > because if you look back, every minor release included API deprecations
> as
> > well as back-compat breaks. That means that every minor release should
> have
> > been a major release right?
> >
> > Point is, if I understand correctly and you agree w/ my statement above -
> I
> > don't see why would anyone releases a 3.x after 4.0 is out unless someone
> > really wants to work hard on maintaining back-compat of some features.
> >
> > If it's just a numbering thing, then I don't think it matters what is
> > defined as 'major' vs. 'minor'. One way is to define 'major' as X and
> minor
> > X.Y, and another is to define major as 'X.Y' and minor as 'X.Y.Z'. I
> prefer
> > the latter but don't have any strong feelings against the former. Just
> > pointing out that X will grow more rapidly than today. That's all.
> >
> > So did I get it right?
> >
> > Shai
> >
> > On Thu, Apr 15, 2010 at 8:19 AM, Mark Miller <ma...@gmail.com>
> wrote:
> >>
> >> I don't read what you wrote and what Mike wrote as even close to the
> >> same.
> >>
> >> - Mark
> >> http://www.lucidimagination.com (mobile)
> >> On Apr 15, 2010, at 12:05 AM, Shai Erera <se...@gmail.com> wrote:
> >>
> >> Ahh ... a dream finally comes true ... what a great way to start a day
> :).
> >> +1 !!!
> >>
> >> I have some questions/comments though:
> >>
> >> * Index back compat should be maintained between major releases, like it
> >> is today, STRUCTURE-wise. So apps get a chance to incrementally upgrade
> >> their segments when they move from 2.x to 3.x before 4.0 lands and
> they'll
> >> need to call optimize() to ensure 4.0 still works on their index. I hope
> >> that will still be the case? Otherwise I don't see how we can prevent
> >> reindexing by apps.
> >> ** Index behavioral/runtime changes, like those of Analyzers, are ok to
> >> require a reindex, as proposed.
> >>
> >> So after 3.1 is out, trunk can break the API and 3.2 will have a new set
> >> of API? Cool and convenient. For how long do we keep the 3.1 branch
> around?
> >> Also, it used to only fix bugs, but from now on it'll be allowed to
> >> introduce new features, if they maintain back-compat? So 3.1.1 can have
> >> 'flex' (going for the extreme on purpose) if someone maintains
> back-compat?
> >>
> >> I think the back-compat on branches should be only for index runtime
> >> changes. There's no point, in my opinion, to maintain API back-compat
> >> anymore for jars drop-in, if apps will need to upgrade from 3.1 to 3.1.1
> >> just to get a new feature but get it API back-supported? As soon as they
> >> upgrade to 3.2, that means a new set of API right?
> >>
> >> Major releases will just change the index structure format then? Or move
> >> to Java 1.6? Well ... not even that because as I understand it, 3.2 can
> move
> >> to Java 1.6 ... no API back-compat right :).
> >>
> >> That's definitely a great step forward !
> >>
> >> Shai
> >>
> >> On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda <va...@osafoundation.org>
> >> wrote:
> >>>
> >>> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
> >>>
> >>>> Can't believe my eyes.
> >>>>
> >>>> +1
> >>>
> >>> Likewise. +1 !
> >>>
> >>> Andi..
> >>>
> >>>>
> >>>> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
> >>>> <lu...@mikemccandless.com> wrote:
> >>>>>
> >>>>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> >>>>> <ma...@rectangular.com> wrote:
> >>>>>
> >>>>>> Essentially, we're free to break back compat within "Lucy" at any
> >>>>>> time, but
> >>>>>> we're not able to break back compat within a stable fork like
> "Lucy1",
> >>>>>> "Lucy2", etc.  So what we'll probably do during normal development
> >>>>>> with
> >>>>>> Analyzers is just change them and note the break in the Changes
> file.
> >>>>>
> >>>>> So... what if we change up how we develop and release Lucene:
> >>>>>
> >>>>>  * A major release always bumps the major release number (2.x ->
> >>>>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
> >>>>>    releases along that branch
> >>>>>
> >>>>>  * There is no back compat across major releases (index nor APIs),
> >>>>>    but full back compat within branches.
> >>>>>
> >>>>> This would match how many other projects work (KS/Lucy, as Marvin
> >>>>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
> >>>>>
> >>>>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes,
> and,
> >>>>> if any devs have the itch, they could freely back-port improvements
> >>>>> from trunk as long as they kept back-compat within the branch.
> >>>>>
> >>>>> I think in such a future world, we could:
> >>>>>
> >>>>>  * Remove Version entirely!
> >>>>>
> >>>>>  * Not worry at all about back-compat when developing on trunk
> >>>>>
> >>>>>  * Give proper names to new improved classes instead of
> >>>>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
> >>>>>    today; rename existing classes.
> >>>>>
> >>>>>  * Let analyzers freely, incrementally improve
> >>>>>
> >>>>>  * Use interfaces without fear
> >>>>>
> >>>>>  * Stop spending the truly substantial time (look @ Uwe's awesome
> >>>>>    back-compat layer for analyzers!) that we now must spend when
> >>>>>    adding new features, for back-compat
> >>>>>
> >>>>>  * Be more free to introduce very new not-fully-baked features/APIs,
> >>>>>    marked as experimental, on the expectation that once they are used
> >>>>>    (in trunk) they will iterate/change/improve vs trying so hard to
> >>>>>    get things right on the first go for fear of future back compat
> >>>>>    horrors.
> >>>>>
> >>>>> Thoughts...?
> >>>>>
> >>>>> Mike
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> >>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> >>>> ICQ: 104465785
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
>
>

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

We can remove Version, because all incompatible changes go straight to
a new major release, which we release more often, yes.
3.x is going to be released after 4.0 if bugs are found and fixed, or
if people ask to backport some (minor?) features, and some dev has
time for this.

The question of what to call a major release in an X.Y.Z scheme, X or Y,
is there, but immaterial :) I think it's okay to settle on X.Y: we
have major releases and bugfixes, so what would that third number be used
for?

On Thu, Apr 15, 2010 at 09:29, Shai Erera <se...@gmail.com> wrote:
> So then I don't understand this:
>
> {quote}
> * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
>
> * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> {quote}
>
> What's different than what's done today? How can we remove Version in that
> world, if we need to maintain full back-compat between 3.1 and 3.2, index
> and API-wise? We'll still need to deprecate and come up w/ new classes every
> time, and we'll still need to maintain runtime changes back-compat.
>
> Unless you're telling me we'll start releasing major releases more often?
> Well ... then we're saying the same thing, only I think that instead of
> releasing 4, 5, 6, 7, 8 every 6 months, we can release 3.1, 3.2, 3.5 ...
> because if you look back, every minor release included API deprecations as
> well as back-compat breaks. That means that every minor release should have
> been a major release right?
>
> Point is, if I understand correctly and you agree w/ my statement above - I
> don't see why would anyone releases a 3.x after 4.0 is out unless someone
> really wants to work hard on maintaining back-compat of some features.
>
> If it's just a numbering thing, then I don't think it matters what is
> defined as 'major' vs. 'minor'. One way is to define 'major' as X and minor
> X.Y, and another is to define major as 'X.Y' and minor as 'X.Y.Z'. I prefer
> the latter but don't have any strong feelings against the former. Just
> pointing out that X will grow more rapidly than today. That's all.
>
> So did I get it right?
>
> Shai
>
> On Thu, Apr 15, 2010 at 8:19 AM, Mark Miller <ma...@gmail.com> wrote:
>>
>> I don't read what you wrote and what Mike wrote as even close to the
>> same.
>>
>> - Mark
>> http://www.lucidimagination.com (mobile)
>> On Apr 15, 2010, at 12:05 AM, Shai Erera <se...@gmail.com> wrote:
>>
>> Ahh ... a dream finally comes true ... what a great way to start a day :).
>> +1 !!!
>>
>> I have some questions/comments though:
>>
>> * Index back compat should be maintained between major releases, like it
>> is today, STRUCTURE-wise. So apps get a chance to incrementally upgrade
>> their segments when they move from 2.x to 3.x before 4.0 lands and they'll
>> need to call optimize() to ensure 4.0 still works on their index. I hope
>> that will still be the case? Otherwise I don't see how we can prevent
>> reindexing by apps.
>> ** Index behavioral/runtime changes, like those of Analyzers, are ok to
>> require a reindex, as proposed.
>>
>> So after 3.1 is out, trunk can break the API and 3.2 will have a new set
>> of API? Cool and convenient. For how long do we keep the 3.1 branch around?
>> Also, it used to only fix bugs, but from now on it'll be allowed to
>> introduce new features, if they maintain back-compat? So 3.1.1 can have
>> 'flex' (going for the extreme on purpose) if someone maintains back-compat?
>>
>> I think the back-compat on branches should be only for index runtime
>> changes. There's no point, in my opinion, to maintain API back-compat
>> anymore for jars drop-in, if apps will need to upgrade from 3.1 to 3.1.1
>> just to get a new feature but get it API back-supported? As soon as they
>> upgrade to 3.2, that means a new set of API right?
>>
>> Major releases will just change the index structure format then? Or move
>> to Java 1.6? Well ... not even that because as I understand it, 3.2 can move
>> to Java 1.6 ... no API back-compat right :).
>>
>> That's definitely a great step forward !
>>
>> Shai
>>
>> On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda <va...@osafoundation.org>
>> wrote:
>>>
>>> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
>>>
>>>> Can't believe my eyes.
>>>>
>>>> +1
>>>
>>> Likewise. +1 !
>>>
>>> Andi..
>>>
>>>>
>>>> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
>>>> <lu...@mikemccandless.com> wrote:
>>>>>
>>>>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>>>>> <ma...@rectangular.com> wrote:
>>>>>
>>>>>> Essentially, we're free to break back compat within "Lucy" at any
>>>>>> time, but
>>>>>> we're not able to break back compat within a stable fork like "Lucy1",
>>>>>> "Lucy2", etc.  So what we'll probably do during normal development
>>>>>> with
>>>>>> Analyzers is just change them and note the break in the Changes file.
>>>>>
>>>>> So... what if we change up how we develop and release Lucene:
>>>>>
>>>>>  * A major release always bumps the major release number (2.x ->
>>>>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>>>>>    releases along that branch
>>>>>
>>>>>  * There is no back compat across major releases (index nor APIs),
>>>>>    but full back compat within branches.
>>>>>
>>>>> This would match how many other projects work (KS/Lucy, as Marvin
>>>>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>>>>>
>>>>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
>>>>> if any devs have the itch, they could freely back-port improvements
>>>>> from trunk as long as they kept back-compat within the branch.
>>>>>
>>>>> I think in such a future world, we could:
>>>>>
>>>>>  * Remove Version entirely!
>>>>>
>>>>>  * Not worry at all about back-compat when developing on trunk
>>>>>
>>>>>  * Give proper names to new improved classes instead of
>>>>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>>>>>    today; rename existing classes.
>>>>>
>>>>>  * Let analyzers freely, incrementally improve
>>>>>
>>>>>  * Use interfaces without fear
>>>>>
>>>>>  * Stop spending the truly substantial time (look @ Uwe's awesome
>>>>>    back-compat layer for analyzers!) that we now must spend when
>>>>>    adding new features, for back-compat
>>>>>
>>>>>  * Be more free to introduce very new not-fully-baked features/APIs,
>>>>>    marked as experimental, on the expectation that once they are used
>>>>>    (in trunk) they will iterate/change/improve vs trying so hard to
>>>>>    get things right on the first go for fear of future back compat
>>>>>    horrors.
>>>>>
>>>>> Thoughts...?
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>>> ICQ: 104465785
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Also, we will still need to maintain the Backwards section in CHANGES (or
move it to API Changes), to help people upgrade from release to release.
Just pointing that out as well.

Shai

On Thu, Apr 15, 2010 at 7:05 AM, Shai Erera <se...@gmail.com> wrote:

> Ahh ... a dream finally comes true ... what a great way to start a day :).
> +1 !!!
>
> I have some questions/comments though:
>
> * Index back compat should be maintained between major releases, like it is
> today, STRUCTURE-wise. So apps get a chance to incrementally upgrade their
> segments when they move from 2.x to 3.x before 4.0 lands and they'll need to
> call optimize() to ensure 4.0 still works on their index. I hope that will
> still be the case? Otherwise I don't see how we can prevent reindexing by
> apps.
> ** Index behavioral/runtime changes, like those of Analyzers, are ok to
> require a reindex, as proposed.
>
> So after 3.1 is out, trunk can break the API and 3.2 will have a new set of
> API? Cool and convenient. For how long do we keep the 3.1 branch around?
> Also, it used to only fix bugs, but from now on it'll be allowed to
> introduce new features, if they maintain back-compat? So 3.1.1 can have
> 'flex' (going for the extreme on purpose) if someone maintains back-compat?
>
> I think the back-compat on branches should be only for index runtime
> changes. There's no point, in my opinion, to maintain API back-compat
> anymore for jars drop-in, if apps will need to upgrade from 3.1 to 3.1.1
> just to get a new feature but get it API back-supported? As soon as they
> upgrade to 3.2, that means a new set of API right?
>
> Major releases will just change the index structure format then? Or move to
> Java 1.6? Well ... not even that because as I understand it, 3.2 can move to
> Java 1.6 ... no API back-compat right :).
>
> That's definitely a great step forward !
>
> Shai
>
>
> On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda <va...@osafoundation.org> wrote:
>
>>
>> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
>>
>>  Can't believe my eyes.
>>>
>>> +1
>>>
>>
>> Likewise. +1 !
>>
>> Andi..
>>
>>
>>> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
>>> <lu...@mikemccandless.com> wrote:
>>>
>>>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>>>> <ma...@rectangular.com> wrote:
>>>>
>>>>  Essentially, we're free to break back compat within "Lucy" at any time,
>>>>> but
>>>>> we're not able to break back compat within a stable fork like "Lucy1",
>>>>> "Lucy2", etc.  So what we'll probably do during normal development with
>>>>> Analyzers is just change them and note the break in the Changes file.
>>>>>
>>>>
>>>> So... what if we change up how we develop and release Lucene:
>>>>
>>>>  * A major release always bumps the major release number (2.x ->
>>>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>>>>    releases along that branch
>>>>
>>>>  * There is no back compat across major releases (index nor APIs),
>>>>    but full back compat within branches.
>>>>
>>>> This would match how many other projects work (KS/Lucy, as Marvin
>>>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>>>>
>>>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
>>>> if any devs have the itch, they could freely back-port improvements
>>>> from trunk as long as they kept back-compat within the branch.
>>>>
>>>> I think in such a future world, we could:
>>>>
>>>>  * Remove Version entirely!
>>>>
>>>>  * Not worry at all about back-compat when developing on trunk
>>>>
>>>>  * Give proper names to new improved classes instead of
>>>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>>>>    today; rename existing classes.
>>>>
>>>>  * Let analyzers freely, incrementally improve
>>>>
>>>>  * Use interfaces without fear
>>>>
>>>>  * Stop spending the truly substantial time (look @ Uwe's awesome
>>>>    back-compat layer for analyzers!) that we now must spend when
>>>>    adding new features, for back-compat
>>>>
>>>>  * Be more free to introduce very new not-fully-baked features/APIs,
>>>>    marked as experimental, on the expectation that once they are used
>>>>    (in trunk) they will iterate/change/improve vs trying so hard to
>>>>    get things right on the first go for fear of future back compat
>>>>    horrors.
>>>>
>>>> Thoughts...?
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>>
>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>> ICQ: 104465785
>>>
>>>
>>>
>>>
>>
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

So then I don't understand this:

{quote}
* A major release always bumps the major release number (2.x ->
   3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
   releases along that branch

* There is no back compat across major releases (index nor APIs),
   but full back compat within branches.

{quote}

What's different from what's done today? How can we remove Version in that
world if we need to maintain full back-compat between 3.1 and 3.2, both
index- and API-wise? We'll still need to deprecate and come up w/ new classes
every time, and we'll still need to maintain back-compat for runtime changes.
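
To make the concern concrete, here is a minimal sketch of what
"maintaining runtime changes back-compat" tends to look like in code.
MatchVersion and ToyLowercaser are hypothetical names invented for this
illustration (they are not Lucene classes), and the "trimming" change in 3.1
is made up. The point is only that every behavior change within a branch
needs a version check like this one.

```java
import java.util.Locale;

/** Hypothetical stand-in for a set of version constants; not Lucene's class. */
enum MatchVersion {
    V_3_0, V_3_1;

    /** True if this version is the same as or newer than the other one. */
    boolean onOrAfter(MatchVersion other) {
        return compareTo(other) >= 0;
    }
}

/** Toy "analyzer" whose runtime behavior changed in 3.1 (assumed change). */
final class ToyLowercaser {
    private final MatchVersion matchVersion;

    ToyLowercaser(MatchVersion matchVersion) {
        this.matchVersion = matchVersion;
    }

    String normalize(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        // Pretend 3.1 started trimming whitespace. Callers that ask for 3.0
        // keep the old output, so terms already in their index still match.
        return matchVersion.onOrAfter(MatchVersion.V_3_1) ? lower.trim() : lower;
    }
}
```

If incompatible changes only ever land in a new major release, the
matchVersion field and the conditional could both go away, which is the
simplification being discussed.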

Unless you're telling me we'll start releasing major releases more often?
Well ... then we're saying the same thing, only I think that instead of
releasing 4, 5, 6, 7, 8 every 6 months, we can release 3.1, 3.2, 3.5 ...
because if you look back, every minor release included API deprecations as
well as back-compat breaks. That means that every minor release should have
been a major release, right?

Point is, if I understand correctly and you agree w/ my statement above - I
don't see why anyone would release a 3.x after 4.0 is out unless someone
really wants to work hard on maintaining back-compat of some features.

If it's just a numbering thing, then I don't think it matters what is
defined as 'major' vs. 'minor'. One way is to define 'major' as X and 'minor'
as X.Y, and another is to define 'major' as X.Y and 'minor' as X.Y.Z. I prefer
the latter but don't have any strong feelings against the former. Just
pointing out that X will grow more rapidly than today. That's all.
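
The two naming schemes can be compared with a small sketch. Release and
both method names are hypothetical, written only to show where the
"full back compat within branches" guarantee would start and stop under
each scheme; nothing here is an agreed policy.

```java
/** Hypothetical release number X.Y.Z, used only to compare the two schemes. */
final class Release {
    final int x;
    final int y;
    final int z;

    Release(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    /** Scheme A: 'major' is X, so back-compat holds whenever X matches. */
    boolean sameBranchMajorIsX(Release other) {
        return x == other.x;
    }

    /** Scheme B: 'major' is X.Y, so back-compat holds only if X and Y match. */
    boolean sameBranchMajorIsXY(Release other) {
        return x == other.x && y == other.y;
    }
}
```

Under scheme A, 3.1 and 3.2 sit on the same back-compat branch; under
scheme B they do not, and only 3.1.0 and 3.1.1 do. The rules are the same in
both cases, only the digit that moves is different.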

So did I get it right?

Shai

On Thu, Apr 15, 2010 at 8:19 AM, Mark Miller <ma...@gmail.com> wrote:

> I don't read what you wrote and what Mike wrote as even close to the same.
>
> - Mark
>
> http://www.lucidimagination.com (mobile)
>
> On Apr 15, 2010, at 12:05 AM, Shai Erera <se...@gmail.com> wrote:
>
> Ahh ... a dream finally comes true ... what a great way to start a day :).
> +1 !!!
>
> I have some questions/comments though:
>
> * Index back compat should be maintained between major releases, like it is
> today, STRUCTURE-wise. So apps get a chance to incrementally upgrade their
> segments when they move from 2.x to 3.x before 4.0 lands and they'll need to
> call optimize() to ensure 4.0 still works on their index. I hope that will
> still be the case? Otherwise I don't see how we can prevent reindexing by
> apps.
> ** Index behavioral/runtime changes, like those of Analyzers, are ok to
> require a reindex, as proposed.
>
> So after 3.1 is out, trunk can break the API and 3.2 will have a new set of
> API? Cool and convenient. For how long do we keep the 3.1 branch around?
> Also, it used to only fix bugs, but from now on it'll be allowed to
> introduce new features, if they maintain back-compat? So 3.1.1 can have
> 'flex' (going for the extreme on purpose) if someone maintains back-compat?
>
> I think the back-compat on branches should be only for index runtime
> changes. There's no point, in my opinion, to maintain API back-compat
> anymore for jars drop-in, if apps will need to upgrade from 3.1 to 3.1.1
> just to get a new feature but get it API back-supported? As soon as they
> upgrade to 3.2, that means a new set of API right?
>
> Major releases will just change the index structure format then? Or move to
> Java 1.6? Well ... not even that because as I understand it, 3.2 can move to
> Java 1.6 ... no API back-compat right :).
>
> That's definitely a great step forward !
>
> Shai
>
> On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda <va...@osafoundation.org> wrote:
>
>>
>> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
>>
>>  Can't believe my eyes.
>>>
>>> +1
>>>
>>
>> Likewise. +1 !
>>
>> Andi..
>>
>>
>>> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
>>> <lu...@mikemccandless.com> wrote:
>>>
>>>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>>>> <ma...@rectangular.com> wrote:
>>>>
>>>>  Essentially, we're free to break back compat within "Lucy" at any time,
>>>>> but
>>>>> we're not able to break back compat within a stable fork like "Lucy1",
>>>>> "Lucy2", etc.  So what we'll probably do during normal development with
>>>>> Analyzers is just change them and note the break in the Changes file.
>>>>>
>>>>
>>>> So... what if we change up how we develop and release Lucene:
>>>>
>>>>  * A major release always bumps the major release number (2.x ->
>>>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>>>>    releases along that branch
>>>>
>>>>  * There is no back compat across major releases (index nor APIs),
>>>>    but full back compat within branches.
>>>>
>>>> This would match how many other projects work (KS/Lucy, as Marvin
>>>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>>>>
>>>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
>>>> if any devs have the itch, they could freely back-port improvements
>>>> from trunk as long as they kept back-compat within the branch.
>>>>
>>>> I think in such a future world, we could:
>>>>
>>>>  * Remove Version entirely!
>>>>
>>>>  * Not worry at all about back-compat when developing on trunk
>>>>
>>>>  * Give proper names to new improved classes instead of
>>>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>>>>    today; rename existing classes.
>>>>
>>>>  * Let analyzers freely, incrementally improve
>>>>
>>>>  * Use interfaces without fear
>>>>
>>>>  * Stop spending the truly substantial time (look @ Uwe's awesome
>>>>    back-compat layer for analyzers!) that we now must spend when
>>>>    adding new features, for back-compat
>>>>
>>>>  * Be more free to introduce very new not-fully-baked features/APIs,
>>>>    marked as experimental, on the expectation that once they are used
>>>>    (in trunk) they will iterate/change/improve vs trying so hard to
>>>>    get things right on the first go for fear of future back compat
>>>>    horrors.
>>>>
>>>> Thoughts...?
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>>
>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>> ICQ: 104465785
>>>
>>>
>>>
>>>
>>
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by Mark Miller <ma...@gmail.com>.

I don't read what you wrote and what Mike wrote as even close to the  
same.

- Mark

http://www.lucidimagination.com (mobile)

On Apr 15, 2010, at 12:05 AM, Shai Erera <se...@gmail.com> wrote:

> Ahh ... a dream finally comes true ... what a great way to start a  
> day :). +1 !!!
>
> I have some questions/comments though:
>
> * Index back compat should be maintained between major releases,  
> like it is today, STRUCTURE-wise. So apps get a chance to  
> incrementally upgrade their segments when they move from 2.x to 3.x  
> before 4.0 lands and they'll need to call optimize() to ensure 4.0  
> still works on their index. I hope that will still be the case?  
> Otherwise I don't see how we can prevent reindexing by apps.
> ** Index behavioral/runtime changes, like those of Analyzers, are ok  
> to require a reindex, as proposed.
>
> So after 3.1 is out, trunk can break the API and 3.2 will have a new  
> set of API? Cool and convenient. For how long do we keep the 3.1  
> branch around? Also, it used to only fix bugs, but from now on it'll  
> be allowed to introduce new features, if they maintain back-compat?  
> So 3.1.1 can have 'flex' (going for the extreme on purpose) if  
> someone maintains back-compat?
>
> I think the back-compat on branches should be only for index runtime  
> changes. There's no point, in my opinion, to maintain API back- 
> compat anymore for jars drop-in, if apps will need to upgrade from  
> 3.1 to 3.1.1 just to get a new feature but get it API back- 
> supported? As soon as they upgrade to 3.2, that means a new set of  
> API right?
>
> Major releases will just change the index structure format then? Or  
> move to Java 1.6? Well ... not even that because as I understand it,  
> 3.2 can move to Java 1.6 ... no API back-compat right :).
>
> That's definitely a great step forward !
>
> Shai
>
> On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda  
> <va...@osafoundation.org> wrote:
>
> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
>
> Can't believe my eyes.
>
> +1
>
> Likewise. +1 !
>
> Andi..
>
>
> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
> <lu...@mikemccandless.com> wrote:
> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> <ma...@rectangular.com> wrote:
>
> Essentially, we're free to break back compat within "Lucy" at any  
> time, but
> we're not able to break back compat within a stable fork like "Lucy1",
> "Lucy2", etc.  So what we'll probably do during normal development  
> with
> Analyzers is just change them and note the break in the Changes file.
>
> So... what if we change up how we develop and release Lucene:
>
>  * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
>
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>
> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
> if any devs have the itch, they could freely back-port improvements
> from trunk as long as they kept back-compat within the branch.
>
> I think in such a future world, we could:
>
>  * Remove Version entirely!
>
>  * Not worry at all about back-compat when developing on trunk
>
>  * Give proper names to new improved classes instead of
>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>    today; rename existing classes.
>
>  * Let analyzers freely, incrementally improve
>
>  * Use interfaces without fear
>
>  * Stop spending the truly substantial time (look @ Uwe's awesome
>    back-compat layer for analyzers!) that we now must spend when
>    adding new features, for back-compat
>
>  * Be more free to introduce very new not-fully-baked features/APIs,
>    marked as experimental, on the expectation that once they are used
>    (in trunk) they will iterate/change/improve vs trying so hard to
>    get things right on the first go for fear of future back compat
>    horrors.
>
> Thoughts...?
>
> Mike
>
>
>
>
>
>
> -- 
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Ahh ... a dream finally comes true ... what a great way to start a day :).
+1 !!!

I have some questions/comments though:

* Index back compat should be maintained between major releases, like it is
today, STRUCTURE-wise. That way apps get a chance to incrementally upgrade
their segments when they move from 2.x to 3.x, before 4.0 lands and they need
to call optimize() to ensure 4.0 still works on their index. I hope that will
still be the case? Otherwise I don't see how apps can avoid reindexing.
** Index behavioral/runtime changes, like those of Analyzers, are ok to
require a reindex, as proposed.

So after 3.1 is out, trunk can break the API and 3.2 will have a new set of
APIs? Cool and convenient. For how long do we keep the 3.1 branch around?
Also, a branch used to get only bug fixes, but from now on it'll be allowed
to get new features, if they maintain back-compat? So 3.1.1 can have
'flex' (going for the extreme on purpose) if someone maintains back-compat?

I think the back-compat on branches should be only for index runtime
changes. There's no point, in my opinion, in maintaining API back-compat
anymore for drop-in jar replacement, if apps upgrade from 3.1 to 3.1.1
just to get a new feature with a back-compatible API. As soon as they
upgrade to 3.2, that means a new set of APIs, right?

Major releases will just change the index structure format then? Or move to
Java 1.6? Well ... not even that because as I understand it, 3.2 can move to
Java 1.6 ... no API back-compat right :).

That's definitely a great step forward !

Shai

On Thu, Apr 15, 2010 at 1:34 AM, Andi Vajda <va...@osafoundation.org> wrote:

>
> On Thu, 15 Apr 2010, Earwin Burrfoot wrote:
>
>> Can't believe my eyes.
>>
>> +1
>>
>
> Likewise. +1 !
>
> Andi..
>
>
>> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
>> <lu...@mikemccandless.com> wrote:
>>
>>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>>> <ma...@rectangular.com> wrote:
>>>
>>>> Essentially, we're free to break back compat within "Lucy" at any time, but
>>>> we're not able to break back compat within a stable fork like "Lucy1",
>>>> "Lucy2", etc.  So what we'll probably do during normal development with
>>>> Analyzers is just change them and note the break in the Changes file.
>>>>
>>>
>>> So... what if we change up how we develop and release Lucene:
>>>
>>>  * A major release always bumps the major release number (2.x ->
>>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>>>    releases along that branch
>>>
>>>  * There is no back compat across major releases (index nor APIs),
>>>    but full back compat within branches.
>>>
>>> This would match how many other projects work (KS/Lucy, as Marvin
>>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>>>
>>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
>>> if any devs have the itch, they could freely back-port improvements
>>> from trunk as long as they kept back-compat within the branch.
>>>
>>> I think in such a future world, we could:
>>>
>>>  * Remove Version entirely!
>>>
>>>  * Not worry at all about back-compat when developing on trunk
>>>
>>>  * Give proper names to new improved classes instead of
>>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>>>    today; rename existing classes.
>>>
>>>  * Let analyzers freely, incrementally improve
>>>
>>>  * Use interfaces without fear
>>>
>>>  * Stop spending the truly substantial time (look @ Uwe's awesome
>>>    back-compat layer for analyzers!) that we now must spend when
>>>    adding new features, for back-compat
>>>
>>>  * Be more free to introduce very new not-fully-baked features/APIs,
>>>    marked as experimental, on the expectation that once they are used
>>>    (in trunk) they will iterate/change/improve vs trying so hard to
>>>    get things right on the first go for fear of future back compat
>>>    horrors.
>>>
>>> Thoughts...?
>>>
>>> Mike
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>>
>>
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by Andi Vajda <va...@osafoundation.org>.

On Thu, 15 Apr 2010, Earwin Burrfoot wrote:

> Can't believe my eyes.
>
> +1

Likewise. +1 !

Andi..

>
> On Thu, Apr 15, 2010 at 01:22, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
>> <ma...@rectangular.com> wrote:
>>
>>> Essentially, we're free to break back compat within "Lucy" at any time, but
>>> we're not able to break back compat within a stable fork like "Lucy1",
>>> "Lucy2", etc.  So what we'll probably do during normal development with
>>> Analyzers is just change them and note the break in the Changes file.
>>
>> So... what if we change up how we develop and release Lucene:
>>
>>  * A major release always bumps the major release number (2.x ->
>>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>>    releases along that branch
>>
>>  * There is no back compat across major releases (index nor APIs),
>>    but full back compat within branches.
>>
>> This would match how many other projects work (KS/Lucy, as Marvin
>> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>>
>> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
>> if any devs have the itch, they could freely back-port improvements
>> from trunk as long as they kept back-compat within the branch.
>>
>> I think in such a future world, we could:
>>
>>  * Remove Version entirely!
>>
>>  * Not worry at all about back-compat when developing on trunk
>>
>>  * Give proper names to new improved classes instead of
>>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>>    today; rename existing classes.
>>
>>  * Let analyzers freely, incrementally improve
>>
>>  * Use interfaces without fear
>>
>>  * Stop spending the truly substantial time (look @ Uwe's awesome
>>    back-compat layer for analyzers!) that we now must spend when
>>    adding new features, for back-compat
>>
>>  * Be more free to introduce very new not-fully-baked features/APIs,
>>    marked as experimental, on the expectation that once they are used
>>    (in trunk) they will iterate/change/improve vs trying so hard to
>>    get things right on the first go for fear of future back compat
>>    horrors.
>>
>> Thoughts...?
>>
>> Mike
>>
>>
>>
>
>
>
> -- 
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
>
>

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

Can't believe my eyes.

+1

On Thu, Apr 15, 2010 at 01:22, Michael McCandless
<lu...@mikemccandless.com> wrote:
> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> <ma...@rectangular.com> wrote:
>
>> Essentially, we're free to break back compat within "Lucy" at any time, but
>> we're not able to break back compat within a stable fork like "Lucy1",
>> "Lucy2", etc.  So what we'll probably do during normal development with
>> Analyzers is just change them and note the break in the Changes file.
>
> So... what if we change up how we develop and release Lucene:
>
>  * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
>
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>
> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
> if any devs have the itch, they could freely back-port improvements
> from trunk as long as they kept back-compat within the branch.
>
> I think in such a future world, we could:
>
>  * Remove Version entirely!
>
>  * Not worry at all about back-compat when developing on trunk
>
>  * Give proper names to new improved classes instead of
>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>    today; rename existing classes.
>
>  * Let analyzers freely, incrementally improve
>
>  * Use interfaces without fear
>
>  * Stop spending the truly substantial time (look @ Uwe's awesome
>    back-compat layer for analyzers!) that we now must spend when
>    adding new features, for back-compat
>
>  * Be more free to introduce very new not-fully-baked features/APIs,
>    marked as experimental, on the expectation that once they are used
>    (in trunk) they will iterate/change/improve vs trying so hard to
>    get things right on the first go for fear of future back compat
>    horrors.
>
> Thoughts...?
>
> Mike
>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

+1

On Apr 14, 2010, at 5:22 PM, Michael McCandless wrote:

> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> <ma...@rectangular.com> wrote:
> 
>> Essentially, we're free to break back compat within "Lucy" at any time, but
>> we're not able to break back compat within a stable fork like "Lucy1",
>> "Lucy2", etc.  So what we'll probably do during normal development with
>> Analyzers is just change them and note the break in the Changes file.
> 
> So... what if we change up how we develop and release Lucene:
> 
>  * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
> 
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
> 
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
> 
> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
> if any devs have the itch, they could freely back-port improvements
> from trunk as long as they kept back-compat within the branch.
> 
> I think in such a future world, we could:
> 
>  * Remove Version entirely!
> 
>  * Not worry at all about back-compat when developing on trunk
> 
>  * Give proper names to new improved classes instead of
>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>    today; rename existing classes.
> 
>  * Let analyzers freely, incrementally improve
> 
>  * Use interfaces without fear
> 
>  * Stop spending the truly substantial time (look @ Uwe's awesome
>    back-compat layer for analyzers!) that we now must spend when
>    adding new features, for back-compat
> 
>  * Be more free to introduce very new not-fully-baked features/APIs,
>    marked as experimental, on the expectation that once they are used
>    (in trunk) they will iterate/change/improve vs trying so hard to
>    get things right on the first go for fear of future back compat
>    horrors.
> 
> Thoughts...?
> 
> Mike
> 
> 



Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

+1

On Wed, Apr 14, 2010 at 5:22 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
> <ma...@rectangular.com> wrote:
>
> > Essentially, we're free to break back compat within "Lucy" at any time, but
> > we're not able to break back compat within a stable fork like "Lucy1",
> > "Lucy2", etc.  So what we'll probably do during normal development with
> > Analyzers is just change them and note the break in the Changes file.
>
> So... what if we change up how we develop and release Lucene:
>
>  * A major release always bumps the major release number (2.x ->
>    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
>    releases along that branch
>
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).
>
> The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
> if any devs have the itch, they could freely back-port improvements
> from trunk as long as they kept back-compat within the branch.
>
> I think in such a future world, we could:
>
>  * Remove Version entirely!
>
>  * Not worry at all about back-compat when developing on trunk
>
>  * Give proper names to new improved classes instead of
>    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
>    today; rename existing classes.
>
>  * Let analyzers freely, incrementally improve
>
>  * Use interfaces without fear
>
>  * Stop spending the truly substantial time (look @ Uwe's awesome
>    back-compat layer for analyzers!) that we now must spend when
>    adding new features, for back-compat
>
>  * Be more free to introduce very new not-fully-baked features/APIs,
>    marked as experimental, on the expectation that once they are used
>    (in trunk) they will iterate/change/improve vs trying so hard to
>    get things right on the first go for fear of future back compat
>    horrors.
>
> Thoughts...?
>
> Mike
>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Earwin Burrfoot <ea...@gmail.com>.

> reasonable, but changing APIs around when there's not a good reason
> behind it (other than someone liked the name a little better) should
> still be approached with caution.

Changing names is a good enough reason :)
They make a darn difference between having to read a book to be able
to use some library, or just playing around with it for a bit.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


Re: Proposal about Version API "relaxation"

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Wed, Apr 14, 2010 at 5:22 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
>  * There is no back compat across major releases (index nor APIs),
>    but full back compat within branches.
>
> This would match how many other projects work (KS/Lucy, as Marvin
> describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).

Sort of... except many of these projects listed above care a lot about
back compat, even between major releases.  So while we could always
break back compat, we shouldn't do so unless it's necessary.  It's not
an all-or-nothing scenario though... requiring re-indexing seems
reasonable, but changing APIs around when there's not a good reason
behind it (other than someone liked the name a little better) should
still be approached with caution.

-Yonik
Apache Lucene Eurocon 2010
18-21 May 2010 | Prague


Re: Proposal about Version API "relaxation"

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey
<ma...@rectangular.com> wrote:

> Essentially, we're free to break back compat within "Lucy" at any time, but
> we're not able to break back compat within a stable fork like "Lucy1",
> "Lucy2", etc.  So what we'll probably do during normal development with
> Analyzers is just change them and note the break in the Changes file.

So... what if we change up how we develop and release Lucene:

  * A major release always bumps the major release number (2.x ->
    3.0), and, starts a new branch for all minor (3.1, 3.2, 3.3)
    releases along that branch

  * There is no back compat across major releases (index nor APIs),
    but full back compat within branches.

This would match how many other projects work (KS/Lucy, as Marvin
describes above; Apache Tomcat; Hibernate; log4J; FreeBSD; etc.).

The 'stable' branch (say 3.x now for Lucene) would get bug fixes, and,
if any devs have the itch, they could freely back-port improvements
from trunk as long as they kept back-compat within the branch.

I think in such a future world, we could:

  * Remove Version entirely!

  * Not worry at all about back-compat when developing on trunk

  * Give proper names to new improved classes instead of
    StandardAnalyzer2, or SmartStandardAnalyzer, that we end up doing
    today; rename existing classes.

  * Let analyzers freely, incrementally improve

  * Use interfaces without fear

  * Stop spending the truly substantial time (look @ Uwe's awesome
    back-compat layer for analyzers!) that we now must spend when
    adding new features, for back-compat

  * Be more free to introduce very new not-fully-baked features/APIs,
    marked as experimental, on the expectation that once they are used
    (in trunk) they will iterate/change/improve vs trying so hard to
    get things right on the first go for fear of future back compat
    horrors.

Thoughts...?

Mike


Re: Proposal about Version API "relaxation"

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Tue, Apr 13, 2010 at 02:46:56PM -0400, Robert Muir wrote:

> Unlike other components in Lucene, Analyzers get hit the worst because any
> change is basically a break, and there's really not any other option besides
> Version to implement any backwards compatibility for them.

New class names would work, too.  

I only mention that for the sake of completeness, though -- it's not a
suggestion.

> But things like index back compat seem kinda useless for analyzers anyway.
> If we improve them in nearly any way, you have to reindex with them to get
> the benefits.

I'm a little concerned about the issue DM Smith brought up: what happens when
you have separate applications within the same JVM which have built indexes
using separate versions of an Analyzer?

That use case is supported under the current regime, but I'm not sure whether
it would be with aggressively versioned Analyzer packages.  If it's not, under
what circumstances does that matter?

> I'd love to hear elaborations of any thoughts you have on how this could
> work.

Well, for Lucy, I think we may have addressed this problem with the new back
compat policy we're auditioning with KS:

    KinoSearch spins off stable forks into new namespaces periodically. As of
    this release, the latest is "KinoSearch1", forked from version 0.165.
    Users who require strong backwards compatibility should use a stable fork.

    The main namespace, "KinoSearch", is an unstable development branch (as
    hinted at by its version number). Superficial API changes are frequent.
    Hard file format compatibility breaks which require reindexing are rare,
    as we generally try to provide continuity across multiple releases, but
    they happen every once in a while.

Essentially, we're free to break back compat within "Lucy" at any time, but
we're not able to break back compat within a stable fork like "Lucy1",
"Lucy2", etc.  So what we'll probably do during normal development with
Analyzers is just change them and note the break in the Changes file.

I doubt such a policy would be an option for Lucene, though.  I think you'd
have to go with separate jars for lucene-core and lucene-analyzers, possibly
on independent release schedules.  You'd have to bundle the broken ones with
lucene-core until a major version break for bug compatibility, but the fixed
ones could be distributed via lucene-analyzers concurrently.

Hmm, I suppose that doesn't work with the convention that the only difference
between Lucene X.9 and Lucene Y.0 is the removal of deprecations.  But if
anything is crying out for a rethink in the Lucene back compat policy, IMO
that's it: make major version breaks act like major version breaks and change
stuff that needs changin'.

Marvin Humphrey


Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Tue, Apr 13, 2010 at 2:39 PM, Marvin Humphrey <ma...@rectangular.com> wrote:
>
>
> I wonder if it's possible to solve this problem for Analyzers by decoupling
> their distribution from the Lucene core and versioning them separately.  I.e.
> remove MatchVersion and increment individual Analyzer version numbers instead.
>
> This wouldn't solve the problem for good defaults elsewhere in the library.
> For that, I see no remedy other than more frequent major version increments.
>

Marvin, I too have been trying to imagine a scheme for this. Unlike other
components in Lucene, Analyzers get hit the worst because any change is
basically a break, and there's really not any other option besides Version
to implement any backwards compatibility for them.

But things like index back compat seem kinda useless for analyzers anyway.
If we improve them in nearly any way, you have to reindex with them to get
the benefits.

I'd love to hear elaborations of any thoughts you have on how this could
work.

-- 
Robert Muir
rcmuir@gmail.com
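
Editor's sketch of the point made above, that any analyzer change is effectively a break and that a Version argument is the usual way to keep the old behavior available. This is only an illustration under stated assumptions, not Lucene code: the Version enum, its constants, the tokenize method and the "strip trailing 's" fix are all hypothetical stand-ins invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class VersionGatedAnalyzerSketch {

    // Hypothetical stand-in for a library version enum (not Lucene's Version class).
    enum Version {
        V_30, V_31;

        boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    // Hypothetical analyzer: whitespace split plus lowercasing.
    // Assumed "improvement" in V_31: a trailing possessive 's is stripped.
    static List<String> tokenize(Version matchVersion, String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            if (matchVersion.onOrAfter(Version.V_31) && raw.endsWith("'s")) {
                raw = raw.substring(0, raw.length() - 2);
            }
            tokens.add(raw.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Tokens that went into an index built with the old behavior.
        List<String> indexed = tokenize(Version.V_30, "Lucene's Analyzers");

        // Query analyzed with the old behavior still matches the old index.
        System.out.println(indexed.containsAll(tokenize(Version.V_30, "Lucene's")));

        // Query analyzed with the new behavior produces a token the old index never saw,
        // which is why the improvement only pays off after reindexing.
        System.out.println(indexed.containsAll(tokenize(Version.V_31, "Lucene's")));
    }
}
```

The sketch shows why the improved behavior cannot simply replace the old one in place: an index written with V_30 tokens and a query analyzed with V_31 tokens no longer agree, so either the caller pins the old version or reindexes.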

Re: Proposal about Version API "relaxation"

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Tue, Apr 13, 2010 at 11:17:56AM -0700, Andi Vajda wrote:

> Using global statics is flawed.

+1.

I wonder if it's possible to solve this problem for Analyzers by decoupling
their distribution from the Lucene core and versioning them separately.  I.e.
remove MatchVersion and increment individual Analyzer version numbers instead.

This wouldn't solve the problem for good defaults elsewhere in the library.
For that, I see no remedy other than more frequent major version increments.

Marvin Humphrey



Re: Proposal about Version API "relaxation"

Posted by Andi Vajda <va...@osafoundation.org>.

On Apr 13, 2010, at 11:09, Shai Erera <se...@gmail.com> wrote:

> > That is a static default!
>
> Yes Uwe ... I'm aware of that :)
> But that's not a static default for Lucene ... only for the  
> application, if it chooses to use it ...

So you have two apps on the same vm and both choose to use this global  
and pick different values. Now what ?

The static default is for the entire classloader, not just Lucene or  
the app.

Using global statics is flawed.

Andi..

>
> > so there are no plans to reimplement such a thing again
>
> Well ... that's not exactly what I'm proposing here. I'm not for re- 
> implementing any sort of staticness, unless the app chooses to use  
> it. And please don't give me that 'there are no plans ...' answer -  
> it kind of kills the discussion, which is not healthy for a community.
>
> I agree that static variables might cause troubles to some  
> deployments, BUT:
>
> 1) Not all apps are deployed on a Web Server together with other  
> apps who happen to use Lucene.
> 2) Those that are deployed on web servers usually include lucene.jar  
> in their classpath and are loaded by a different class loader than  
> the rest ...
>
> So we're really talking about deployments where Lucene is a common,  
> shared library between all apps ...
>
> And I guess that what bothers me the most is that it feels to me  
> like we're trying to protect people from stuff we haven't yet  
> received complaints on (at least none that I'm aware of), while  
> we're hurting the programming experience of others ... almost  
> recklessly. I'd hope we can find a way around that, because today I  
> pass the same Version value around everywhere, and it's simply  
> inconvenient. Just yesterday people complained about the need to  
> call writer.commit() after new IW() if they want to open a  
> reader ... one-liner inconvenience vs. dozen of lines here -- point  
> is, what's perceived as unnecessary code DOES bother people ... only  
> here it's just a setting thing, and my proposal is not to make it  
> generically static. So let's not get caught on that 'static-ness'.  
> And besides, if you ask me - variables like Version, that are needed  
> in so many places, are usually made static ... but not in Lucene ...
>
> So if possible ... I'd like to think how we can fix/improve the use  
> of Version, in ways that won't break apps. Because the fact of the  
> matter is - we invented Version to allow for changes w/o breaking  
> back-compat, while the backwards section in CHANGES seems to grow  
> from release to release (I know - I'm partly to blame for it :)),  
> and another fact is that I don't remember even one complaint about a  
> change which broke back-compat. People have raised this issue  
> numerous times in the past, even proposed all sorts of contracts and  
> definitions on how we can be 'allowed' to break back-compat ... but  
> nothing came out of it.
>
> The fact that we are not able to reach consensus doesn't mean the  
> problem doesn't bother many out there. And ignoring the fact that  
> currently the API looks cluttered is not doing any good. There must  
> be a way to allow some apps out there (IMO the majority) to set that  
> Version thing once, and let Lucene use that value everywhere  
> else ... while for others to pass it along as much as they want.
>
> Shai
>
> On Tue, Apr 13, 2010 at 7:41 PM, Uwe Schindler <uw...@thetaphi.de>  
> wrote:
> Hi Shai,
>
>
>
> one of the problems I have is: that is a static default! We want to  
> get rid of them (and mostly did; only some relics remain), so  
> there are no plans to reimplement such a thing again. The worst one  
> is BooleanQuery.maxClauseCount. The same applies to all types of  
> sysprops. As Lucene and Solr mostly run in servlet  
> containers, this type of thing makes web applications no longer  
> isolated. This is also a general contract for libraries: never ever  
> rely on sysprops or statics.
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
> http://www.thetaphi.de
>
> eMail: uwe@thetaphi.de
>
>
>
> From: Shai Erera [mailto:serera@gmail.com]
> Sent: Tuesday, April 13, 2010 5:27 PM
> To: java-dev@lucene.apache.org
> Subject: Proposal about Version API "relaxation"
>
>
>
> Hi
>
> I'd like to propose a relaxation on the Version API. Uwe, please  
> read the entire email before you reply :).
>
> I was thinking, following a question on the user list, that the  
> Version-based API may not be very intuitive to users, especially  
> those who don't care about versioning, as well as very inconvenient.  
> So there are two issues here:
> 1) How should one use Version smartly so that he keeps backwards  
> compatibility. I think we all know the answer, but a Wiki page with  
> some "best practices" tips would really help users use it.
> 2) How can one write sane code, which doesn't pass versions all over  
> the place if: (1) he doesn't care about versions, or (2) he cares,  
> and sets the Version to the same value in his app, in all places.
>
> Also, I think that today we offer a flexibility to users, to set  
> different Versions on different objects in the life span of their  
> application - which is a good flexibility but can also lead people  
> to shoot themselves in the foot if they're not careful -- e.g.  
> upgrading Version across their app, but failing to do so for one or  
> two places ...
>
> So the change I'd like to propose is to mostly alleviate (2) and  
> better protect users - I DO NOT PROPOSE TO GET RID OF Version :).
>
> I was thinking that we can add on Version a DEFAULT version, which  
> the caller can set. So Version.setDefault and Version.getDefault  
> will be added, as static members (more on the static-ness of it  
> later). We then change the API which requires Version to also expose  
> an API which doesn't require it, and that API will call  
> Version.getDefault(). People can use it if they want to ...
>
> Few points:
> 1) As a default DEFAULT Version is controversial, I don't want to  
> propose it, even though I think Lucene can define the DEFAULT to be  
> the latest. Instead, I propose that Version.getDefault throw a  
> DefaultVersionNotSetException if it wasn't set, while an API which  
> relies on the default Version is called (I don't want to return  
> null, not sure how safe it is).
> 2) That DEFAULT Version is static, which means it will affect all  
> indexing code running inside the JVM. Which is fine:
> 2.1) Perhaps all the indexing code should use the same Version
> 2.2) If you know that's not the case, then pass Version to the API  
> which requires it - you cannot use the 'default Version' API --  
> nothing changes for you.
> One case is missing -- you might not know if your code is the only  
> indexing code which runs in the JVM ... I don't have a solution to  
> that, but I think it'll be revealed pretty quickly, and you can  
> change your code then ...
>
> So to summarize - the current Version API will remain and people can  
> still use it. The DEFAULT Version API is meant for convenience for  
> those who don't want to pass Version everywhere, for the reasons I  
> outlined above. This will also clean our test code significantly, as  
> the tests will set the DEFAULT version to TEST_VERSION_CURRENT at  
> start ...
>
> The changes to the Version class will be very simple.
>
> If people think that's acceptable, I can open an issue and work on it.
>
> Shai
>
>

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Tue, Apr 13, 2010 at 2:09 PM, Shai Erera <se...@gmail.com> wrote:

>  Because the fact of the matter is - we invented Version to allow for
> changes w/o breaking back-compat, while the backwards section in CHANGES
> seems to grow from release to release (I know - I'm partly to blame for it
> :)), and another fact is that I don't remember even one complaint about a
> change which broke back-compat. People have raised this issue numerous times
> in the past, even proposed all sorts of contracts and definitions on how we
> can be 'allowed' to break back-compat ... but nothing came out of it.
>

Let's not dance around the real issue then.

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Grant Ingersoll <gs...@apache.org>.

On Apr 13, 2010, at 2:09 PM, Shai Erera wrote:
> 
> And I guess that what bothers me the most is that it feels to me like we're trying to protect people from stuff we haven't yet received complaints on (at least none that I'm aware of),

I think we have, they just aren't explicitly stated b/c most users don't necessarily grasp the subtleties of tokens at this level.  We have long talked and had questions about how to handle bug fixes in Analyzers.

> while we're hurting the programming experience of others ... almost recklessly.

Wearing my trainer hat as someone who has to explain this stuff to newbies on a regular basis, I agree: I think Version does hurt the programming experience, at least initially (although I don't agree it is even close to reckless), and the issue was more easily dealt with through CHANGES.txt and letting people know there is a bug fix that will require re-indexing. That being said, I'm not sure I like the static proposal. Separating out the analyzers more, and somehow versioning individual ones, seems good on paper.

-Grant

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

> That is a static default!

Yes Uwe ... I'm aware of that :)
But that's not a static default for Lucene ... only for the application, if
it chooses to use it ...

> so there are no plans to reimplement such a thing again

Well ... that's not exactly what I'm proposing here. I'm not for
re-implementing any sort of staticness, unless the app chooses to use it.
And please don't give me that 'there are no plans ...' answer - it kind of
kills the discussion, which is not healthy for a community.

I agree that static variables might cause troubles to some deployments, BUT:

1) Not all apps are deployed on a Web Server together with other apps that
happen to use Lucene.
2) Those that are deployed on web servers usually include lucene.jar in
their classpath and are loaded by a different class loader than the rest ...

So we're really talking about deployments where Lucene is a common, shared
library between all apps ...

And I guess that what bothers me the most is that it feels to me like we're
trying to protect people from stuff we haven't yet received complaints on
(at least none that I'm aware of), while we're hurting the programming
experience of others ... almost recklessly. I'd hope we can find a way
around that, because today I pass the same Version value around everywhere,
and it's simply inconvenient. Just yesterday people complained about the
need to call writer.commit() after new IW() if they want to open a reader
... one-liner inconvenience vs. dozens of lines here -- point is, what's
perceived as unnecessary code DOES bother people ... only here it's just a
setting thing, and my proposal is not to make it generically static. So
let's not get caught on that 'static-ness'. And besides, if you ask me -
variables like Version, that are needed in so many places, are usually made
static ... but not in Lucene ...

So if possible ... I'd like to think how we can fix/improve the use of
Version, in ways that won't break apps. Because the fact of the matter is -
we invented Version to allow for changes w/o breaking back-compat, while the
backwards section in CHANGES seems to grow from release to release (I know -
I'm partly to blame for it :)), and another fact is that I don't remember
even one complaint about a change which broke back-compat. People have
raised this issue numerous times in the past, even proposed all sorts of
contracts and definitions on how we can be 'allowed' to break back-compat
... but nothing came out of it.

The fact that we are not able to reach consensus doesn't mean the problem
doesn't bother many out there. And ignoring the fact that currently the API
looks cluttered is not doing any good. There must be a way to allow some apps
out there (IMO the majority) to set that Version thing once, and let Lucene
use that value everywhere else ... while for others to pass it along as much
as they want.

Shai

On Tue, Apr 13, 2010 at 7:41 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

>  Hi Shai,
>
>
>
> One of the problems I have is: That is a static default! We want to get rid
> of them (and mostly did; only some relics remain), so there are no plans
> to reimplement such a thing again. The worst one is
> BooleanQuery.maxClauseCount. The same applies to all types of sysprops. As
> Lucene and Solr mostly run in servlet containers, this type of thing
> makes web applications no longer isolated. This is also a general contract
> for libraries: never ever rely on sysprops or statics.
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
> http://www.thetaphi.de
>
> eMail: uwe@thetaphi.de
>
>
>
> *From:* Shai Erera [mailto:serera@gmail.com]
> *Sent:* Tuesday, April 13, 2010 5:27 PM
> *To:* java-dev@lucene.apache.org
> *Subject:* Proposal about Version API "relaxation"
>
>
>
> Hi
>
> I'd like to propose a relaxation on the Version API. Uwe, please read the
> entire email before you reply :).
>
> I was thinking, following a question on the user list, that the
> Version-based API may not be very intuitive to users, especially those who
> don't care about versioning, as well as very inconvenient. So there are two
> issues here:
> 1) How should one use Version smartly so that he keeps backwards
> compatibility. I think we all know the answer, but a Wiki page with some
> "best practices" tips would really help users use it.
> 2) How can one write sane code, which doesn't pass versions all over the
> place if: (1) he doesn't care about versions, or (2) he cares, and sets the
> Version to the same value in his app, in all places.
>
> Also, I think that today we offer a flexibility to users, to set different
> Versions on different objects in the life span of their application - which
> is a good flexibility but can also lead people to shoot themselves in the
> legs if they're not careful -- e.g. upgrading Version across their app, but
> failing to do so for one or two places ...
>
> So the change I'd like to propose is to mostly alleviate (2) and better
> protect users - I DO NOT PROPOSE TO GET RID OF Version :).
>
> I was thinking that we can add on Version a DEFAULT version, which the
> caller can set. So Version.setDefault and Version.getDefault will be added,
> as static members (more on the static-ness of it later). We then change the
> API which requires Version to also expose an API which doesn't require it,
> and that API will call Version.getDefault(). People can use it if they want
> to ...
>
> Few points:
> 1) As a default DEFAULT Version is controversial, I don't want to propose
> it, even though I think Lucene can define the DEFAULT to be the latest.
> Instead, I propose that Version.getDefault throw a
> DefaultVersionNotSetException if it wasn't set, while an API which relies on
> the default Version is called (I don't want to return null, not sure how
> safe it is).
> 2) That DEFAULT Version is static, which means it will affect all indexing
> code running inside the JVM. Which is fine:
> 2.1) Perhaps all the indexing code should use the same Version
> 2.2) If you know that's not the case, then pass Version to the API which
> requires it - you cannot use the 'default Version' API -- nothing changes
> for you.
> One case is missing -- you might not know if your code is the only indexing
> code which runs in the JVM ... I don't have a solution to that, but I think
> it'll be revealed pretty quickly, and you can change your code then ...
>
> So to summarize - the current Version API will remain and people can still
> use it. The DEFAULT Version API is meant for convenience for those who don't
> want to pass Version everywhere, for the reasons I outlined above. This will
> also clean our test code significantly, as the tests will set the DEFAULT
> version to TEST_VERSION_CURRENT at start ...
>
> The changes to the Version class will be very simple.
>
> If people think that's acceptable, I can open an issue and work on it.
>
> Shai
>

RE: Proposal about Version API "relaxation"

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi Shai,

 

One of the problems I have is: That is a static default! We want to get rid of them (and mostly did; only some relics remain), so there are no plans to reimplement such a thing again. The worst one is BooleanQuery.maxClauseCount. The same applies to all types of sysprops. As Lucene and Solr mostly run in servlet containers, this type of thing makes web applications no longer isolated. This is also a general contract for libraries: never ever rely on sysprops or statics.
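
To make the isolation concern concrete, here is a minimal sketch, assuming two applications that share one class loader (for example, a library jar placed in a container's shared library directory). All names (SharedDefault, WebApp, StaticIsolationSketch) are hypothetical and are not Lucene classes; the version strings are placeholders.

```java
// Hypothetical sketch; SharedDefault and WebApp are illustration-only names.
final class SharedDefault {
    // One slot per class loader: every caller in that loader sees the same value.
    private static volatile String version;

    static void set(String v) { version = v; }
    static String get() { return version; }
}

final class WebApp {
    final String name;
    final String wanted;

    WebApp(String name, String wanted) {
        this.name = name;
        this.wanted = wanted;
    }

    // Each application "configures" the shared library at startup.
    void start() { SharedDefault.set(wanted); }

    // What the library would actually use on this application's behalf.
    String effectiveVersion() { return SharedDefault.get(); }
}

class StaticIsolationSketch {
    public static void main(String[] args) {
        WebApp a = new WebApp("A", "3.0");
        WebApp b = new WebApp("B", "3.1");
        a.start();
        b.start(); // silently overwrites A's setting
        // prints: A asked for 3.0, sees 3.1
        System.out.println(a.name + " asked for " + a.wanted + ", sees " + a.effectiveVersion());
    }
}
```

If each application bundles its own copy of the library and is loaded by its own class loader, each gets its own static slot and the overwrite shown here does not occur.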

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Shai Erera [mailto:serera@gmail.com] 
Sent: Tuesday, April 13, 2010 5:27 PM
To: java-dev@lucene.apache.org
Subject: Proposal about Version API "relaxation"

 

Hi

I'd like to propose a relaxation on the Version API. Uwe, please read the entire email before you reply :).

I was thinking, following a question on the user list, that the Version-based API may not be very intuitive to users, especially those who don't care about versioning, as well as very inconvenient. So there are two issues here:
1) How should one use Version smartly so that he keeps backwards compatibility. I think we all know the answer, but a Wiki page with some "best practices" tips would really help users use it.
2) How can one write sane code, which doesn't pass versions all over the place if: (1) he doesn't care about versions, or (2) he cares, and sets the Version to the same value in his app, in all places.

Also, I think that today we offer a flexibility to users, to set different Versions on different objects in the life span of their application - which is a good flexibility but can also lead people to shoot themselves in the legs if they're not careful -- e.g. upgrading Version across their app, but failing to do so for one or two places ...

So the change I'd like to propose is to mostly alleviate (2) and better protect users - I DO NOT PROPOSE TO GET RID OF Version :).

I was thinking that we can add on Version a DEFAULT version, which the caller can set. So Version.setDefault and Version.getDefault will be added, as static members (more on the static-ness of it later). We then change the API which requires Version to also expose an API which doesn't require it, and that API will call Version.getDefault(). People can use it if they want to ...

Few points:
1) As a default DEFAULT Version is controversial, I don't want to propose it, even though I think Lucene can define the DEFAULT to be the latest. Instead, I propose that Version.getDefault throw a DefaultVersionNotSetException if it wasn't set, while an API which relies on the default Version is called (I don't want to return null, not sure how safe it is).
2) That DEFAULT Version is static, which means it will affect all indexing code running inside the JVM. Which is fine:
2.1) Perhaps all the indexing code should use the same Version
2.2) If you know that's not the case, then pass Version to the API which requires it - you cannot use the 'default Version' API -- nothing changes for you.
One case is missing -- you might not know if your code is the only indexing code which runs in the JVM ... I don't have a solution to that, but I think it'll be revealed pretty quickly, and you can change your code then ...

So to summarize - the current Version API will remain and people can still use it. The DEFAULT Version API is meant for convenience for those who don't want to pass Version everywhere, for the reasons I outlined above. This will also clean our test code significantly, as the tests will set the DEFAULT version to TEST_VERSION_CURRENT at start ...

The changes to the Version class will be very simple.

If people think that's acceptable, I can open an issue and work on it.

Shai

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

Well the no-arg ctor will be using Version.getDefault() which will
throw an exception if not set, and delegate the call to the
Version-aware ctor.
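
A minimal sketch of that delegation, under assumptions: SketchVersion and SketchAnalyzer are hypothetical stand-ins, and only the names setDefault, getDefault and DefaultVersionNotSetException come from the proposal. Their exact signatures, and the choice of an unchecked exception, are guesses.

```java
// Hypothetical sketch of the proposal; none of these types are Lucene's real classes.
enum SketchVersion {
    V_30, V_31;

    // Application-wide default; deliberately unset until the application sets it.
    private static volatile SketchVersion defaultVersion;

    static void setDefault(SketchVersion v) {
        if (v == null) {
            throw new IllegalArgumentException("default version must not be null");
        }
        defaultVersion = v;
    }

    static SketchVersion getDefault() {
        SketchVersion v = defaultVersion;
        if (v == null) {
            throw new DefaultVersionNotSetException();
        }
        return v;
    }
}

// Assumed to be unchecked so that no-arg constructors need no throws clause.
class DefaultVersionNotSetException extends IllegalStateException {
    DefaultVersionNotSetException() {
        super("call SketchVersion.setDefault(...) before using a default-version API");
    }
}

class SketchAnalyzer {
    final SketchVersion matchVersion;

    // Convenience constructor: looks up the default, then delegates.
    SketchAnalyzer() {
        this(SketchVersion.getDefault());
    }

    // Version-aware constructor: unchanged by the proposal.
    SketchAnalyzer(SketchVersion matchVersion) {
        this.matchVersion = matchVersion;
    }
}
```

With this shape, new SketchAnalyzer() fails fast until the application has called SketchVersion.setDefault(...), while new SketchAnalyzer(SketchVersion.V_30) keeps working regardless of the default.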

On Tuesday, April 13, 2010, Robert Muir <rc...@gmail.com> wrote:
> On Tue, Apr 13, 2010 at 11:27 AM, Shai Erera <se...@gmail.com> wrote:
>
>
> I was thinking that we can add on Version a DEFAULT version, which the caller can set. So Version.setDefault and Version.getDefault will be added, as static members (more on the static-ness of it later). We then change the API which requires Version to also expose an API which doesn't require it, and that API will call Version.getDefault(). People can use it if they want to ...
>
> I don't understand how this works... if Something has a no-arg ctor today, and I want to improve it in a backwards-compatible way, how will this work?
> The way this works today, let's say while working with 3.1, is:
>
> Something() is deprecated, and invokes Something(3.0)
> Something(Version) is added, and emulates the old behavior for < 3.1, and the new behavior for >= 3.1
> I don't see how backwards compatibility will work with this proposal, since the no-arg ctor would then emulate some random behavior depending on a static.
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Proposal about Version API "relaxation"

Posted by Robert Muir <rc...@gmail.com>.

On Tue, Apr 13, 2010 at 11:27 AM, Shai Erera <se...@gmail.com> wrote:

> I was thinking that we can add on Version a DEFAULT version, which the
> caller can set. So Version.setDefault and Version.getDefault will be added,
> as static members (more on the static-ness of it later). We then change the
> API which requires Version to also expose an API which doesn't require it,
> and that API will call Version.getDefault(). People can use it if they want
> to ...
>

I don't understand how this works... if Something has a no-arg ctor today,
and I want to improve it in a backwards-compatible way, how will this work?

The way this works today, let's say while working with 3.1, is:
Something() is deprecated, and invokes Something(3.0)
Something(Version) is added, and emulates the old behavior for < 3.1, and
the new behavior for >= 3.1

I don't see how backwards compatibility will work with this proposal, since
the no-arg ctor would then emulate some random behavior depending on a
static.
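
The current idiom described above can be sketched roughly as follows. This is an illustration under assumptions: MatchVersion, Something and the lowercasing "behavior change" are hypothetical stand-ins, not Lucene code.

```java
import java.util.Locale;

// Hypothetical stand-in for a version enum; constants are ordered oldest to newest.
enum MatchVersion {
    LUCENE_30, LUCENE_31;

    boolean onOrAfter(MatchVersion other) {
        return compareTo(other) >= 0;
    }
}

class Something {
    private final boolean useNewBehavior;

    /** @deprecated pinned to the pre-3.1 behavior so existing callers see no change. */
    @Deprecated
    Something() {
        this(MatchVersion.LUCENE_30);
    }

    // New in "3.1": the caller states which behavior to emulate.
    Something(MatchVersion matchVersion) {
        this.useNewBehavior = matchVersion.onOrAfter(MatchVersion.LUCENE_31);
    }

    String process(String input) {
        // Placeholder behavior change: assume 3.1 started lowercasing its input.
        return useNewBehavior ? input.toLowerCase(Locale.ROOT) : input;
    }
}

class VersionedCtorSketch {
    public static void main(String[] args) {
        System.out.println(new Something().process("ABC"));                       // prints ABC
        System.out.println(new Something(MatchVersion.LUCENE_31).process("ABC")); // prints abc
    }
}
```

The point of the objection is visible here: the deprecated no-arg constructor is stable only because it is hard-wired to a fixed version. If it read a mutable static instead, the same call could yield either result.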

-- 
Robert Muir
rcmuir@gmail.com

Re: Proposal about Version API "relaxation"

Posted by Shai Erera <se...@gmail.com>.

> Because the version mechanism is not a single value for the entire library
> but rather feature by feature, I don't see how a global setter can help.

That's only true if we believe people use different Version values in
different places of their code ... and note that they will still be able to.
I'm not proposing to take out Version from the ctors, just to add an
additional default-version the app can set and use. So if the app doesn't
want to do it .. it doesn't have to.

Shai

On Tue, Apr 13, 2010 at 9:40 PM, DM Smith <dm...@gmail.com> wrote:

> I like the concept of version, but I'm concerned about it too.
>
> The current Version mechanism allows one to use more than one Version in
> their code. Imagine that we are at 3.2 and one was unable to upgrade to the
> most recent version for a particular feature. Let's also suppose that at 3.2 a new
> feature was introduced and was taken advantage of. But at 3.5 that new
> feature is versioned but one is unable to upgrade for it, too. Now what? Use
> 3.0 for the one feature and 3.2 for the other?
>
> What about the interoperability of versioned features? Does a version 3.0
> class play well with a 3.2 versioned class? How do we test that?
>
> A long-term issue is that of bw compat for the version itself. The bw
> compat contract is twofold: API and index. The API has a shorter lifetime
> of compatibility than that of an index. How does one deprecate a particular
> version for the API but not the index? How does one know whether one
> versioned feature impacts the index and another does not?
>
> I'm hoping that I'm imagining a problem that will never actually arise.
>
> Shai, to your suggestion: Because the version mechanism is not a single
> value for the entire library but rather feature by feature, I don't see how
> a global setter can help.
>
> -- DM
>
>
> On 04/13/2010 11:27 AM, Shai Erera wrote:
>
>> Hi
>>
>> I'd like to propose a relaxation on the Version API. Uwe, please read the
>> entire email before you reply :).
>>
>> I was thinking, following a question on the user list, that the
>> Version-based API may not be very intuitive to users, especially those who
>> don't care about versioning, as well as very inconvenient. So there are two
>> issues here:
>> 1) How should one use Version smartly so that he keeps backwards
>> compatibility. I think we all know the answer, but a Wiki page with some
>> "best practices" tips would really help users use it.
>> 2) How can one write sane code, which doesn't pass versions all over the
>> place if: (1) he doesn't care about versions, or (2) he cares, and sets the
>> Version to the same value in his app, in all places.
>>
>> Also, I think that today we offer a flexibility to users, to set different
>> Versions on different objects in the life span of their application - which
>> is a good flexibility but can also lead people to shoot themselves in the
>> legs if they're not careful -- e.g. upgrading Version across their app, but
>> failing to do so for one or two places ...
>>
>> So the change I'd like to propose is to mostly alleviate (2) and better
>> protect users - I DO NOT PROPOSE TO GET RID OF Version :).
>>
>> I was thinking that we can add on Version a DEFAULT version, which the
>> caller can set. So Version.setDefault and Version.getDefault will be added,
>> as static members (more on the static-ness of it later). We then change the
>> API which requires Version to also expose an API which doesn't require it,
>> and that API will call Version.getDefault(). People can use it if they want
>> to ...
>>
>> Few points:
>> 1) As a default DEFAULT Version is controversial, I don't want to propose
>> it, even though I think Lucene can define the DEFAULT to be the latest.
>> Instead, I propose that Version.getDefault throw a
>> DefaultVersionNotSetException if it wasn't set, while an API which relies on
>> the default Version is called (I don't want to return null, not sure how
>> safe it is).
>> 2) That DEFAULT Version is static, which means it will affect all indexing
>> code running inside the JVM. Which is fine:
>> 2.1) Perhaps all the indexing code should use the same Version
>> 2.2) If you know that's not the case, then pass Version to the API which
>> requires it - you cannot use the 'default Version' API -- nothing changes
>> for you.
>> One case is missing -- you might not know if your code is the only
>> indexing code which runs in the JVM ... I don't have a solution to that, but
>> I think it'll be revealed pretty quickly, and you can change your code then
>> ...
>>
>> So to summarize - the current Version API will remain and people can still
>> use it. The DEFAULT Version API is meant for convenience for those who don't
>> want to pass Version everywhere, for the reasons I outlined above. This will
>> also clean our test code significantly, as the tests will set the DEFAULT
>> version to TEST_VERSION_CURRENT at start ...
>>
>> The changes to the Version class will be very simple.
>>
>> If people think that's acceptable, I can open an issue and work on it.
>>
>> Shai
>>
>
>

Re: Proposal about Version API "relaxation"

Posted by DM Smith <dm...@gmail.com>.

I like the concept of version, but I'm concerned about it too.

The current Version mechanism allows one to use more than one Version in 
their code. Imagine that we are at 3.2 and one was unable to upgrade to
the most recent version for a particular feature. Let's also suppose that at 3.2
a new feature was introduced and was taken advantage of. But at 3.5 that 
new feature is versioned but one is unable to upgrade for it, too. Now 
what? Use 3.0 for the one feature and 3.2 for the other?

What about the interoperability of versioned features? Does a version 
3.0 class play well with a 3.2 versioned class? How do we test that?

A long-term issue is that of bw compat for the version itself. The bw
compat contract is twofold: API and index. The API has a shorter
lifetime of compatibility than that of an index. How does one deprecate
a particular version for the API but not the index? How does one know
whether one versioned feature impacts the index and another does not?

I'm hoping that I'm imagining a problem that will never actually arise.

Shai, to your suggestion: Because the version mechanism is not a single
value for the entire library but rather feature by feature, I don't see
how a global setter can help.

-- DM

On 04/13/2010 11:27 AM, Shai Erera wrote:
> Hi
>
> I'd like to propose a relaxation on the Version API. Uwe, please read 
> the entire email before you reply :).
>
> I was thinking, following a question on the user list, that the 
> Version-based API may not be very intuitive to users, especially those 
> who don't care about versioning, as well as very inconvenient. So 
> there are two issues here:
> 1) How should one use Version smartly so that he keeps backwards 
> compatibility. I think we all know the answer, but a Wiki page with 
> some "best practices" tips would really help users use it.
> 2) How can one write sane code, which doesn't pass versions all over 
> the place if: (1) he doesn't care about versions, or (2) he cares, and 
> sets the Version to the same value in his app, in all places.
>
> Also, I think that today we offer a flexibility to users, to set 
> different Versions on different objects in the life span of their 
> application - which is a good flexibility but can also lead people to 
> shoot themselves in the legs if they're not careful -- e.g. upgrading 
> Version across their app, but failing to do so for one or two places ...
>
> So the change I'd like to propose is to mostly alleviate (2) and 
> better protect users - I DO NOT PROPOSE TO GET RID OF Version :).
>
> I was thinking that we can add on Version a DEFAULT version, which the 
> caller can set. So Version.setDefault and Version.getDefault will be 
> added, as static members (more on the static-ness of it later). We 
> then change the API which requires Version to also expose an API which 
> doesn't require it, and that API will call Version.getDefault(). 
> People can use it if they want to ...
>
> Few points:
> 1) As a default DEFAULT Version is controversial, I don't want to 
> propose it, even though I think Lucene can define the DEFAULT to be 
> the latest. Instead, I propose that Version.getDefault throw a 
> DefaultVersionNotSetException if it wasn't set, while an API which 
> relies on the default Version is called (I don't want to return null, 
> not sure how safe it is).
> 2) That DEFAULT Version is static, which means it will affect all 
> indexing code running inside the JVM. Which is fine:
> 2.1) Perhaps all the indexing code should use the same Version
> 2.2) If you know that's not the case, then pass Version to the API 
> which requires it - you cannot use the 'default Version' API -- 
> nothing changes for you.
> One case is missing -- you might not know if your code is the only 
> indexing code which runs in the JVM ... I don't have a solution to 
> that, but I think it'll be revealed pretty quickly, and you can change 
> your code then ...
>
> So to summarize - the current Version API will remain and people can 
> still use it. The DEFAULT Version API is meant for convenience for 
> those who don't want to pass Version everywhere, for the reasons I 
> outlined above. This will also clean our test code significantly, as 
> the tests will set the DEFAULT version to TEST_VERSION_CURRENT at 
> start ...
>
> The changes to the Version class will be very simple.
>
> If people think that's acceptable, I can open an issue and work on it.
>
> Shai



Re: Proposal about Version API "relaxation"

Posted by Tim Williams <wi...@gmail.com>.

On Tue, Apr 13, 2010 at 11:27 AM, Shai Erera <se...@gmail.com> wrote:
> Hi
>
> I'd like to propose a relaxation on the Version API. Uwe, please read the
> entire email before you reply :).
>
> I was thinking, following a question on the user list, that the
> Version-based API may not be very intuitive to users, especially those who
> don't care about versioning, as well as very inconvenient. So there are two
> issues here:
> 1) How should one use Version smartly so that he keeps backwards
> compatibility. I think we all know the answer, but a Wiki page with some
> "best practices" tips would really help users use it.
> 2) How can one write sane code, which doesn't pass versions all over the
> place if: (1) he doesn't care about versions, or (2) he cares, and sets the
> Version to the same value in his app, in all places.
>
> Also, I think that today we offer a flexibility to users, to set different
> Versions on different objects in the life span of their application - which
> is a good flexibility but can also lead people to shoot themselves in the
> legs if they're not careful -- e.g. upgrading Version across their app, but
> failing to do so for one or two places ...
>
> So the change I'd like to propose is to mostly alleviate (2) and better
> protect users - I DO NOT PROPOSE TO GET RID OF Version :).
>
> I was thinking that we can add on Version a DEFAULT version, which the
> caller can set. So Version.setDefault and Version.getDefault will be added,
> as static members (more on the static-ness of it later). We then change the
> API which requires Version to also expose an API which doesn't require it,
> and that API will call Version.getDefault(). People can use it if they want
> to ...
>
> Few points:
> 1) As a default DEFAULT Version is controversial, I don't want to propose
> it, even though I think Lucene can define the DEFAULT to be the latest.
> Instead, I propose that Version.getDefault throw a
> DefaultVersionNotSetException if it wasn't set, while an API which relies on
> the default Version is called (I don't want to return null, not sure how
> safe it is).
> 2) That DEFAULT Version is static, which means it will affect all indexing
> code running inside the JVM. Which is fine:
> 2.1) Perhaps all the indexing code should use the same Version
> 2.2) If you know that's not the case, then pass Version to the API which
> requires it - you cannot use the 'default Version' API -- nothing changes
> for you.
> One case is missing -- you might not know if your code is the only indexing
> code which runs in the JVM ... I don't have a solution to that, but I think
> it'll be revealed pretty quickly, and you can change your code then ...
>
> So to summarize - the current Version API will remain and people can still
> use it. The DEFAULT Version API is meant for convenience for those who don't
> want to pass Version everywhere, for the reasons I outlined above. This will
> also clean our test code significantly, as the tests will set the DEFAULT
> version to TEST_VERSION_CURRENT at start ...
>
> The changes to the Version class will be very simple.
>
> If people think that's acceptable, I can open an issue and work on it.

From the peanut gallery, I say *please*.  This Version idiom has taken
an otherwise beautiful api and nearly boosted it to the likes of DOM
and InputStream;)

--tim
