Posted to dev@lucene.apache.org by Yonik Seeley <yo...@lucidimagination.com> on 2009/06/10 19:00:01 UTC

back compat is good

I'm starting to feel like the lone holdout that thinks back compat for
commonly used interfaces and index formats is important.  So I'll sum
up some of my thoughts and leave it at that:

- I doubt that the number of new users for each release of Lucene
exceeds the sum total of all existing users of Lucene.  Lucene is
already the dominant open source search library, so we're never going
to hit that type of exponential growth going forward.  Existing users
are very important.
- Good back compat makes the lives of all Lucene users easier
- Good back compat makes the lives of Lucene developers easier in some
ways also.  We don't *need* to go back and patch older releases, since
we can say "use a newer release".  If things change too much, that
will no longer be an easy option for many users, and more people will
get stuck in the past because upgrading is too painful.
- The difficulty of change can also be a good thing - it forces people
to really think if changes are worth it and only add them where it
really makes sense.

The last threads on back compat generated so much volume that I
couldn't keep up, and I expect there are many others that couldn't
either.  I'm not personally interested in discussing it in the
abstract further... I'm more interested in actual code
patches/proposals.

-Yonik
http://www.lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: back compat is good

Posted by Simon Willnauer <si...@googlemail.com>.
On Wed, Jun 10, 2009 at 7:00 PM, Yonik Seeley<yo...@lucidimagination.com> wrote:
> I'm starting to feel like the lone holdout that thinks back compat for
> commonly used interfaces and index formats is important.  So I'll sum
> up some of my thoughts and leave it at that:
>
> - I doubt that the number of new users for each release of Lucene
> exceeds the sum total of all existing users of Lucene.  Lucene is
> already the dominant open source search library, so we're never going
> to hit that type of exponential growth going forward.  Existing users
> are very important.
> - Good back compat makes the lives of all Lucene users easier
> - Good back compat makes the lives of Lucene developers easier in some
> ways also.  We don't *need* to go back and patch older releases, since
> we can say "use a newer release".  If things change too much, that
> will no longer be an easy option for many users, and more people will
> get stuck in the past because upgrading is too painful.
> - The difficulty of change can also be a good thing - it forces people
> to really think if changes are worth it and only add them where it
> really makes sense.
I have been around since 1.4 and, looking back from today, I assume it
is/was worth all the pain. Being able to not look at Lucene for 1
1/2 years and then use it again without thinking too much about what has
changed is a huge advantage!

On the other hand, I really appreciate the decision of the Python
community to move forward and get rid of legacy code, functions,
interfaces etc. in P3K. Each time you decide to take such a step you
will be in the same situation with back compatibility. I would not
change the policy, but rather go a similar way to the one the Python
community took with P3K.
A clean cut can have major advantages, but after breaking compatibility,
sticking to the policy again is a must, I guess. The bad thing about
APIs is that you have only one chance to get them right.

I did not follow the thread about back compat at all so if that has
been proposed / discussed just ignore it.


>
> The last threads on back compat generated so much volume that I
> couldn't keep up, and I expect there are many others that couldn't
> either.  I'm not personally interested in discussing it in the
> abstract further... I'm more interested in actual code
> patches/proposals.
>
> -Yonik
> http://www.lucidimagination.com


Re: back compat is good

Posted by Mark Miller <ma...@gmail.com>.
Yonik Seeley wrote:
> I'm starting to feel like the lone holdout that thinks back compat for
> commonly used interfaces and index formats is important.  
I think the fact that you're not the only one is why things got stymied.

I wouldn't personally support anything that didn't try to maintain
stability in commonly used interfaces,
and it appeared that consensus easily favored maintaining strong index
back compat.

The current policy has much stronger hooks than just common interfaces 
and index formats though.

For really important things, we make exceptions anyway, and that will 
probably still be the case.

The win we can probably get, I think, is a policy that makes things
easier where we pay a lot for a little. It's worth a lot of pain to
support common interfaces
and index formats. That doesn't cover all of the ground though.

We have already dealt with a lot of this by making special exceptions, 
using abstract classes, and 'experimental APIs'.

Perhaps it makes sense to just bring our back compat policy up to date 
with the reality of what has been happening anyway.

Or maybe nothing needs to be done after all. But I think we need to 
address the out of the box performance in some manner.

-- 
- Mark

http://www.lucidimagination.com






Re: back compat is good

Posted by Grant Ingersoll <gs...@apache.org>.
I'm not against back compatibility.  In fact, I agree with your  
points, especially the use of the phrase "commonly used interfaces".

My main problem is that our approach seems to be very dogmatic and
detrimental for _less_ commonly used interfaces (more importantly, less
commonly _implemented_ interfaces), and it creates a whole lot of cruft
in the code.  Code that is only released every 6-12 months anyway.

Specific examples include:
1. Fieldable
2. FieldCache and ExtendedFieldCache
3. The five gazillion IndexWriter constructors
4. The Analyzer.tokenStream stuff.

The thing is, we have this false sense about back compatibility  
anyway.  We think we are doing it, but time and again it slips through  
because there is _NO WAY_ we can know all of the myriad of uses of  
Lucene.  My take:  be strict about index compatibility, take API  
changes on a case-by-case basis, favoring _preserving_ back  
compatibility unless it is too expensive.  Communicate any changes  
loudly.

So, yes, back compatibility as part of a pragmatic approach that  
recognizes our release timeframes and the ability for modern IDEs to  
help in refactoring is good.  Back compatibility for the sake of back  
compatibility is harmful and will ultimately be the downfall of  
Lucene, IMO, because it won't keep up simply because it will take  
twice as long to develop new ways of doing things and it will scare  
away new contributors who can't possibly fathom all of the back  
compatibility requirements (heck, us committers who have been around  
for a long time can't even do it).

At any rate, I also am promoting the case by case approach.  And I  
will kick it off by opening an issue that gets rid of the stupid  
ExtendedFieldCache abomination and breaks the FieldCache back compat.  
interface construct.


-Grant

On Jun 10, 2009, at 1:00 PM, Yonik Seeley wrote:

> I'm starting to feel like the lone holdout that thinks back compat for
> commonly used interfaces and index formats is important.  So I'll sum
> up some of my thoughts and leave it at that:
>
> - I doubt that the number of new users for each release of Lucene
> exceeds the sum total of all existing users of Lucene.  Lucene is
> already the dominant open source search library, so we're never going
> to hit that type of exponential growth going forward.  Existing users
> are very important.
> - Good back compat makes the lives of all Lucene users easier
> - Good back compat makes the lives of Lucene developers easier in some
> ways also.  We don't *need* to go back and patch older releases, since
> we can say "use a newer release".  If things change too much, that
> will no longer be an easy option for many users, and more people will
> get stuck in the past because upgrading is too painful.
> - The difficulty of change can also be a good thing - it forces people
> to really think if changes are worth it and only add them where it
> really makes sense.
>
> The last threads on back compat generated so much volume that I
> couldn't keep up, and I expect there are many others that couldn't
> either.  I'm not personally interested in discussing it in the
> abstract further... I'm more interested in actual code
> patches/proposals.
>
> -Yonik
> http://www.lucidimagination.com
>


Re: back compat is good

Posted by Michael McCandless <lu...@mikemccandless.com>.
Well... Lucene still seems to be experiencing strong adoption/growth,
eg combined user+dev email traffic:

  http://lucene.markmail.org/

Net/net, I also think that back-compat is important and we shouldn't
up and abandon it or relax our policy too much.

However, I wish we had better tools for *implementing* our policy.
Really, the programming language should provide facilities... but it
won't (for a looong time), so we discuss our own solutions like
actsAsVersion.

And it pains me when our back compat policy forces us to sacrifice new
users' experience (not being able to change default settings; not being
able to fix bugs in analyzers; etc).  At least we have an OK
workaround for that, and I also think we have softened our stance on
when to make exceptions here.

Mike

On Wed, Jun 10, 2009 at 1:00 PM, Yonik Seeley<yo...@lucidimagination.com> wrote:
> I'm starting to feel like the lone holdout that thinks back compat for
> commonly used interfaces and index formats is important.  So I'll sum
> up some of my thoughts and leave it at that:
>
> - I doubt that the number of new users for each release of Lucene
> exceeds the sum total of all existing users of Lucene.  Lucene is
> already the dominant open source search library, so we're never going
> to hit that type of exponential growth going forward.  Existing users
> are very important.
> - Good back compat makes the lives of all Lucene users easier
> - Good back compat makes the lives of Lucene developers easier in some
> ways also.  We don't *need* to go back and patch older releases, since
> we can say "use a newer release".  If things change too much, that
> will no longer be an easy option for many users, and more people will
> get stuck in the past because upgrading is too painful.
> - The difficulty of change can also be a good thing - it forces people
> to really think if changes are worth it and only add them where it
> really makes sense.
>
> The last threads on back compat generated so much volume that I
> couldn't keep up, and I expect there are many others that couldn't
> either.  I'm not personally interested in discussing it in the
> abstract further... I'm more interested in actual code
> patches/proposals.
>
> -Yonik
> http://www.lucidimagination.com
>


Re: back compat is good

Posted by Mark Miller <ma...@gmail.com>.
Yonik Seeley wrote:
> On Wed, Jun 10, 2009 at 4:11 PM, Mark Miller <ma...@gmail.com> wrote:
>   
>> The computer should handle that
>> for me. It really should be as easy
>> as saying, look I want the best new defaults, or I want the back compat
>> defaults. The computer should figure
>> out the rest for me.
>>     
>
> actsAsVersion ;-)
> nice and back compatible.
> Introduce Settings classes in the future when+where it makes sense.
>
> -Yonik
>   
I liked the idea of something along those lines. I fell out on some of 
the discussion at the end as well though. Hard to keep up.

Even the static thing seemed fine to me - we do enough of that type of 
thing in cases where it would be a lot less clear anyway.

Short of that, even passing a settings class in some form would probably
be fine. It's not like we don't already have a lot of constructors with a
lot of parameters.
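
As a rough illustration of the settings-class idea (a sketch only, not a
patch): every class, field name, and number below is made up for the
example and is not Lucene's actual API or its real defaults. The point is
just that an existing constructor can keep its old behavior while new
callers pass one object that carries either preset.

```java
// Hypothetical settings holder. Field names and numbers are placeholders
// chosen for the example; they are not Lucene's actual options or defaults.
final class SketchSettings {
    final double ramBufferMB;
    final boolean useCompoundFile;
    final boolean readOnlyReader;

    private SketchSettings(double ramBufferMB, boolean useCompoundFile,
                           boolean readOnlyReader) {
        this.ramBufferMB = ramBufferMB;
        this.useCompoundFile = useCompoundFile;
        this.readOnlyReader = readOnlyReader;
    }

    // Preset that behaves the way earlier releases did.
    static SketchSettings backCompatDefaults() {
        return new SketchSettings(16.0, true, false);
    }

    // Preset with whatever the current release considers the best choice.
    static SketchSettings bestDefaults() {
        return new SketchSettings(64.0, false, true);
    }

    // Returns a copy with one value overridden; the original is unchanged.
    SketchSettings withRamBufferMB(double mb) {
        return new SketchSettings(mb, useCompoundFile, readOnlyReader);
    }
}

// Hypothetical writer-like class that consumes the settings.
final class SketchWriter {
    private final SketchSettings settings;

    // The pre-existing constructor keeps its old behavior.
    SketchWriter() {
        this(SketchSettings.backCompatDefaults());
    }

    // New callers opt in by passing settings explicitly.
    SketchWriter(SketchSettings settings) {
        this.settings = settings;
    }

    String describe() {
        return "ramBufferMB=" + settings.ramBufferMB
            + ", compoundFile=" + settings.useCompoundFile
            + ", readOnlyReader=" + settings.readOnlyReader;
    }
}

final class SettingsDemo {
    public static void main(String[] args) {
        System.out.println(new SketchWriter().describe());
        System.out.println(
            new SketchWriter(SketchSettings.bestDefaults()).describe());
    }
}
```

An existing caller of `new SketchWriter()` sees no change after an
upgrade, while a new user gets the tuned preset with a single argument
and can still override individual values via `withRamBufferMB`.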

I didn't like the idea on first blush, but frankly, it's not even all
that bad.

I would certainly rather be able to throw a switch to get great 
performance rather than run through documentation figuring out what I 
have to toggle and change - repeat when new releases come out. You still 
should pay attention of course, but hunting down all of the performance 
'fixes' is a burden many will probably avoid. Especially those that are 
evaluating Lucene or building their first system.

-- 
- Mark

http://www.lucidimagination.com






Re: back compat is good

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Wed, Jun 10, 2009 at 4:21 PM, Yonik Seeley<yo...@lucidimagination.com> wrote:
> On Wed, Jun 10, 2009 at 4:11 PM, Mark Miller <ma...@gmail.com> wrote:
>> The computer should handle that
>> for me. It really should be as easy
>> as saying, look I want the best new defaults, or I want the back compat
>> defaults. The computer should figure
>> out the rest for me.
>
> actsAsVersion ;-)
> nice and back compatible.
> Introduce Settings classes in the future when+where it makes sense.

+1

But, per class (not global), and on an as-needed basis.

I still think this is a good, simple solution to implementing our
back-compat... so I opened LUCENE-1684 to add a "matchVersion" arg to
StandardAnalyzer, and I think it actually works very well.
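
To make the per-class idea concrete, here is a minimal sketch. It is not
the LUCENE-1684 patch: SketchVersion and SketchStopFilter are
hypothetical names, and the version constants and the stop-word behavior
are simplified stand-ins chosen only to show how a "matchVersion"
constructor argument can gate a behavior change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical version marker; the constant names are illustrative only.
enum SketchVersion {
    V2_4, V2_9;

    boolean onOrAfter(SketchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical filter, not Lucene's StopFilter: it drops stop words and
// tags each surviving term with its position.
final class SketchStopFilter {
    private static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("the", "a", "of"));

    private final boolean preservePositions;

    SketchStopFilter(SketchVersion matchVersion) {
        // The newer behavior is switched on only for callers who ask for it.
        this.preservePositions = matchVersion.onOrAfter(SketchVersion.V2_9);
    }

    List<String> filter(String text) {
        List<String> out = new ArrayList<>();
        int position = 0;
        for (String term : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            boolean isStopWord = STOP_WORDS.contains(term);
            if (isStopWord && !preservePositions) {
                continue; // old behavior: the removed word leaves no gap
            }
            if (!isStopWord) {
                out.add(term + "@" + position);
            }
            position++; // every term that reaches this point advances the position
        }
        return out;
    }
}

final class MatchVersionDemo {
    public static void main(String[] args) {
        String text = "the quick fox of doom";
        System.out.println(new SketchStopFilter(SketchVersion.V2_4).filter(text));
        System.out.println(new SketchStopFilter(SketchVersion.V2_9).filter(text));
    }
}
```

With V2_4 the sketch prints [quick@0, fox@1, doom@2]; with V2_9 it prints
[quick@1, fox@2, doom@4]. The caller picks the behavior per instance, so
nothing global changes and existing code that passes the older constant
keeps its old results.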

Mike



Re: back compat is good

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Jun 10, 2009 at 4:11 PM, Mark Miller <ma...@gmail.com> wrote:
> The computer should handle that
> for me. It really should be as easy
> as saying, look I want the best new defaults, or I want the back compat
> defaults. The computer should figure
> out the rest for me.

actsAsVersion ;-)
nice and back compatible.
Introduce Settings classes in the future when+where it makes sense.

-Yonik
http://www.lucidimagination.com



Re: back compat is good

Posted by Mark Miller <ma...@gmail.com>.
>> As far as default settings, it seems like it can be mostly fixed with
>> documentation (i.e. recommended settings for maximum performance).
>> That seems like a very small burden for people writing new
>> applications with Lucene anyway (compare to the cost of writing the
>> whole application).  On the other hand, existing users may be
>> essentially "done" with the Lucene development in their project, and
>> want to upgrade for bug fixes, performance increases, and maybe to
>> incrementally add new features.
>>     
>
> I think we need to do both.  We should doc things like "use a big RAM
> buffer", "turn off CFS", "use an SSD", "use threads", etc.
>
> But for things like "open a readOnly reader", "turn on the acronym fix
> in StandardAnalyzer", "use BooleanScorer not BooleanScorer2", "don't
> discard positions in StopFilter", "use NIOFSDirectory not
> FSDirectory", "turn off scoring when sorting by field", we should fix
> Lucene to do those by default.
>
> I'd like for Lucene to make a good first impression.
>
> Mike
>
I agree that this needs to be fixed with Lucene. For a bit, I also 
thought that documentation was enough.

But on further thought, it's a bit absurd. The computer should handle
that for me. It really should be as easy
as saying, look I want the best new defaults, or I want the back compat
defaults. The computer should figure
out the rest for me. I know it's not as easy as typing it, but that's
still a doable goal I would think. It's got to be, somehow.
I know it comes down to different inconveniences, but I refuse to
believe that there is not a solution.

- Mark



Re: back compat is good

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Wed, Jun 10, 2009 at 2:23 PM, Yonik Seeley<yo...@lucidimagination.com> wrote:

>> Well... Lucene still seems to be experiencing strong adoption/growth,
>> eg combined user+dev email traffic:
>> http://lucene.markmail.org/
>
> I think that includes all Lucene sub-projects (Solr, Tika, Mahout,
> Nutch, Droids, etc).
>
> http://lucene.markmail.org/search/?q=list%3Aorg.apache.lucene.java-user

Whoops, you're right.  java-user alone looks to have flattened out
recently... though usage of eg Solr is also usage of Lucene:

  http://lucene.markmail.org/search/?q=list%3Aorg.apache.lucene.java-user+list%3Aorg.apache.lucene.solr-user

What I'd really love to see is "how many cumulative searches have
been done by Lucene, everywhere" as a function of time...

>> And it pains me when our back compat policy forces us to sacrifice new
>> users' experience (not being able to change default settings; not being
>> able to fix bugs in analyzers; etc).
>
> As far as default settings, it seems like it can be mostly fixed with
> documentation (i.e. recommended settings for maximum performance).
> That seems like a very small burden for people writing new
> applications with Lucene anyway (compare to the cost of writing the
> whole application).  On the other hand, existing users may be
> essentially "done" with the Lucene development in their project, and
> want to upgrade for bug fixes, performance increases, and maybe to
> incrementally add new features.

I think we need to do both.  We should doc things like "use a big RAM
buffer", "turn off CFS", "use an SSD", "use threads", etc.

But for things like "open a readOnly reader", "turn on the acronym fix
in StandardAnalyzer", "use BooleanScorer not BooleanScorer2", "don't
discard positions in StopFilter", "use NIOFSDirectory not
FSDirectory", "turn off scoring when sorting by field", we should fix
Lucene to do those by default.

I'd like for Lucene to make a good first impression.

Mike



Re: back compat is good

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Jun 10, 2009 at 2:01 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> Well... Lucene still seems to be experiencing strong adoption/growth,
> eg combined user+dev email traffic:
> http://lucene.markmail.org/

I think that includes all Lucene sub-projects (Solr, Tika, Mahout,
Nutch, Droids, etc).

http://lucene.markmail.org/search/?q=list%3Aorg.apache.lucene.java-user

> And it pains me when our back compat policy forces us to sacrifice new
> users' experience (not being able to change default settings; not being
> able to fix bugs in analyzers; etc).

As far as default settings, it seems like it can be mostly fixed with
documentation (i.e. recommended settings for maximum performance).
That seems like a very small burden for people writing new
applications with Lucene anyway (compare to the cost of writing the
whole application).  On the other hand, existing users may be
essentially "done" with the Lucene development in their project, and
want to upgrade for bug fixes, performance increases, and maybe to
incrementally add new features.

-Yonik
http://www.lucidimagination.com
