Posted to dev@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2006/07/07 19:17:55 UTC

Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Hi Chuck,

I think bulk update would be good (although I'm not sure how it would be different from batching deletes and adds, but I'm sure there is a difference, or else you wouldn't have done it).
Java 1.5 - no conclusion, but personally I felt:
- no strong arguments for 1.4, only a few people argued for it
- very little interest from 1.4 adversaries in helping with backporting to 1.4 or updating the build system to do the retro thing with 1.5 code

So I think you should contribute your code.  This will give us a real example of something possibly valuable that is written with 1.5 features, so we can finalize the 1.4 vs. 1.5 discussion, probably with a vote on lucene-dev.

Otis

----- Original Message ----
From: Chuck Williams <ch...@manawiz.com>
To: java-dev@lucene.apache.org
Sent: Thursday, July 6, 2006 5:07:41 PM
Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

robert engels wrote on 07/06/2006 12:24 PM:
> I guess we just chose a much simpler way to do this...
>
> Even with your code changes, to see the modifications made using the
> IndexWriter, it must be closed, and a new IndexReader opened.
>
> So a far simpler way is to get the collection of updates first, then
>
> using opened indexreader,
> for each doc in collection
>       delete document using "key"
> endfor
>
> open indexwriter
> for each doc in collection
>       add document
> endfor
>
> open indexreader
>
>
> I don't see how your way is any faster. You must always flush to disk
> and open the indexreader to see the changes.
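
A minimal sketch of the two-pass update quoted above, for illustration only. `ToyIndex`, `openReader`, `deleteAll`, `addAll`, and `applyBatch` are hypothetical names for an in-memory stand-in, not Lucene's API. The sketch assumes only what the quoted message states: deletes are applied first, adds second, and a reader must be reopened to see the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy model of the "delete all, then add all" batch update.
 * ToyIndex is a hypothetical in-memory stand-in, NOT the Lucene API.
 */
public class BatchUpdateSketch {

    /** Hypothetical index: committed documents keyed by a unique "key" field. */
    static class ToyIndex {
        private final Map<String, String> committed = new LinkedHashMap<String, String>();

        /** A reader sees a snapshot taken at the moment it was opened. */
        Map<String, String> openReader() {
            return new LinkedHashMap<String, String>(committed);
        }

        /** Pass 1: delete every document whose key is in the batch. */
        void deleteAll(Iterable<String> keys) {
            for (String key : keys) {
                committed.remove(key);
            }
        }

        /** Pass 2: add every document in the batch. */
        void addAll(Map<String, String> docs) {
            committed.putAll(docs);
        }
    }

    /** Applies a batch in two passes and returns a freshly opened reader snapshot. */
    static Map<String, String> applyBatch(ToyIndex index, Map<String, String> updates) {
        index.deleteAll(updates.keySet()); // pass 1: deletes by key
        index.addAll(updates);             // pass 2: adds
        return index.openReader();         // reopen to see the changes
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> initial = new LinkedHashMap<String, String>();
        initial.put("doc1", "old text");
        initial.put("doc2", "unchanged");
        index.addAll(initial);

        // Opened before the batch, so it keeps the old view.
        Map<String, String> staleReader = index.openReader();

        Map<String, String> updates = new LinkedHashMap<String, String>();
        updates.put("doc1", "new text");
        updates.put("doc3", "brand new");
        Map<String, String> freshReader = applyBatch(index, updates);

        System.out.println("stale reader: " + staleReader);
        System.out.println("fresh reader: " + freshReader);
    }
}
```

The reader opened before the batch keeps its old view, which is the point made in the quoted message: changes become visible only through a newly opened reader.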

....

Bulk updates however require yet another approach.  Sorry to change
topics here, but I'm wondering if there was a final decision on the
question of java 1.5 in the core.  If I submitted a bulk update
capability that required java 1.5, would it be eligible for inclusion in
the core or not?

Chuck


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org







Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Doug Cutting <cu...@apache.org>.
Daniel John Debrunner wrote:
> I'm new to Lucene but not Apache, this is not how Apache projects are
> meant to work. All decisions must be on the mailing lists and decisions
> are made by the community via "consensus gathering", not a sub-set of
> folks off the list. Or am I reading too much into this comment?
> 
> http://www.apache.org/foundation/how-it-works.html

No, you read it correctly.  This is a consensus decision.

Doug



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Daniel John Debrunner <dj...@apache.org>.
DM Smith wrote:

>     However, I think you have identified that the core people need to 
> make a decision and the rest of us need to go with it. So, I suggest 
> that Doug convene such a meeting of the minds and communicate the 
> decision to the rest of us.

I'm new to Lucene but not Apache, this is not how Apache projects are
meant to work. All decisions must be on the mailing lists and decisions
are made by the community via "consensus gathering", not a sub-set of
folks off the list. Or am I reading too much into this comment?

http://www.apache.org/foundation/how-it-works.html

Dan.






Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by DM Smith <dm...@gmail.com>.
On Jul 8, 2006, at 12:56 PM, Chuck Williams wrote:

>
> I prefer to contribute to Lucene, but my workload simply
> does not allow time to be spent on backporting.

I'll stand by my offer to do the backporting when it is possible and  
does not do violence to the implementation.

I'd prefer to wait until the patch that is in Jira is ready to be  
applied. At that point post the request here and I'll see if it is  
doable.




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Chuck Williams <ch...@manawiz.com>.
Doug Cutting wrote on 07/08/2006 09:41 AM:
> Chuck Williams wrote:
>> I only work in 1.5 and use its features extensively.  I don't think
>> about 1.4 at all, and so have no idea how heavily dependent the code in
>> question is on 1.5.
>>
>> Unfortunately, I won't be able to contribute anything substantial to
>> Lucene so long as it has a 1.4 requirement.
>
> The 1.5 decision requires a consensus.  You're making ultimatums, which
> does not help to build consensus.  By stating an inflexible position
> you've become a fact that informs the process.

My statement was not intended as an ultimatum at all.  Rather, it is
simply a fact.  I prefer to contribute to Lucene, but my workload simply
does not allow time to be spent on backporting.

>
> I think we should try to minimize the number of inconvenienced people.
> Both developers and users are people.  Some developers are happy to
> continue in 1.4, adding new features that users who are confined to 1.4
> JVMs will be able to use.  Other developers will only contribute 1.5
> code, perhaps (unless we find a technical workaround) excluding users
> confined to 1.4 JVMs.  But it is difficult to compare the inconvenience
> of a developer who refuses to code back-compatibly to a user who is
> deprived new features.

Doug, respectfully, this issue is inflammatory in its nature.  I've
found a couple of your comments to be inflammatory, although I suspect
you did not intend them that way.  Specifically the term "refuses" above
and your prior comment about considering use of your veto power if the
committers were to vote to move to 1.5.

I'm not "refusing" to do anything.  I am overwhelmed in a crunch for the
next several months and simply informing the community that I have code
that others may find valuable that might be contributed, but that it
requires 1.5 and that I cannot backport it.  I cannot unilaterally
decide to contribute the code, needing the agreement of the company I'm
working for.  They are only interested in the contribution if there is
interest in having it in the core.  These are simply facts.  I suspect
I'm not the only person in this kind of situation.

>
> Since GCJ is effectively available on all platforms, we could say that
> we will start accepting 1.5 features when a GCJ release supports those
> features.  Does that seem reasonable?

Seems like a reasonable compromise to me.  If I had a vote on this it
would be +1.

Chuck




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Daniel John Debrunner <dj...@apache.org>.
Vic Bancroft wrote:


>> On Jul 10, 2006, at 11:17 PM, Daniel John Debrunner wrote:
>>
>>> Doug Cutting wrote:
>>>
>>>> Since GCJ is effectively available on all platforms, we could say  that
>>>> we will start accepting 1.5 features when a GCJ release supports  those
>>>> features.  Does that seem reasonable?
>>>
>>>
>>> Seems potentially a little strange to me. Does this mean Lucene 
>>> would be
>>> limited to the set of 1.5 features actually implemented by GCJ? So if
>>> there is a 1.5 feature that is not supported by GCJ (while others are)
>>> it cannot be used?
>>>
>>> Seems more natural to support the complete 1.5 as defined by Sun/Java,
>>> not the subset implemented by one open source compiler.
>>
>>
> Do you have a different favorite open source java compiler for 1.5 ?

No, I just think the platform for Lucene (or any Java project) should be
defined by the spec (JDK 1.4, 1.5 or 1.6), not a single (possibly
partial) implementation of the spec.

Dan.




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Vic Bancroft <ba...@america.net>.
robert engels wrote:

> Seems  silly to support 1.5 and not do it this way.

Sometimes a little silliness is some serious fun!  Just give me a rubber 
nose, since I am just clowning around trying to build Andi's kewly 
contrib/db using gcj on the slightly stylish db-4.4.20 and je-3.0.12 . . .

> On Jul 10, 2006, at 11:17 PM, Daniel John Debrunner wrote:
>
>> Doug Cutting wrote:
>>
>>> Since GCJ is effectively available on all platforms, we could say  that
>>> we will start accepting 1.5 features when a GCJ release supports  those
>>> features.  Does that seem reasonable?
>>
>> Seems potentially a little strange to me. Does this mean Lucene  
>> would be
>> limited to the set of 1.5 features actually implemented by GCJ? So if
>> there is a 1.5 feature that is not supported by GCJ (while others are)
>> it cannot be used?
>>
>> Seems more natural to support the complete 1.5 as defined by Sun/Java,
>> not the subset implemented by one open source compiler.
>
Do you have a different favorite open source java compiler for 1.5 ?

more,
l8r,
v

-- 
"The future is here. It's just not evenly distributed yet."
 -- William Gibson, quoted by Whitfield Diffie




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by robert engels <re...@ix.netcom.com>.
Agreed. I think those that are reliant on GCJ should plan on  
expending the effort to do whatever backporting is needed to make  
Lucene work on it. It should also be a GCJ branch or version. Seems  
silly to support 1.5 and not do it this way.


On Jul 10, 2006, at 11:17 PM, Daniel John Debrunner wrote:

> Doug Cutting wrote:
>
>> Since GCJ is effectively available on all platforms, we could say  
>> that
>> we will start accepting 1.5 features when a GCJ release supports  
>> those
>> features.  Does that seem reasonable?
>
> Seems potentially a little strange to me. Does this mean Lucene  
> would be
> limited to the set of 1.5 features actually implemented by GCJ? So if
> there is a 1.5 feature that is not supported by GCJ (while others are)
> it cannot be used?
>
> Seems more natural to support the complete 1.5 as defined by Sun/Java,
> not the subset implemented by one open source compiler.
>
> Dan.




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Tue, 11 Jul 2006, robert engels wrote:

> It's been years and GCJ still doesn't have anywhere near full 1.4 classpath 
> libraries.
>
> So now if we want to write code for Lucene we have to know what libraries are 
> available for GCJ?
>
> GCJ is a joke.

It looks like classpath is quite close to 100% 1.4 JRE support.

     http://www.kaffe.org/~stuart/japi/htmlout/h-jdk14-classpath.html

Of course, earlier gcj versions, such as 3.4.x, come with a libgcj based on 
an earlier version of classpath with bigger holes (regex support, for 
example).

Things are moving in the right direction, however...

Andi..



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by robert engels <re...@ix.netcom.com>.
It's been years and GCJ still doesn't have anywhere near full 1.4  
classpath libraries.

So now if we want to write code for Lucene we have to know what  
libraries are available for GCJ?

GCJ is a joke.


On Jul 11, 2006, at 8:54 AM, Andi Vajda wrote:

>
> On Tue, 11 Jul 2006, DM Smith wrote:
>
>> Eclipse has a built in compiler called ecj and it can compile Java  
>> 1.6 code today. However, unless classes are provided at runtime  
>> for linking, one will get build errors.
>
> It looks like ecj is going to replace the gcj java front-end  
> compiler thereby making the 1.5 language features available to gcj.  
> In the meantime, the classpath project is working towards adding  
> support for all JRE classes. I'm quite optimistic that we should  
> see a 1.5 capable gcj this year. This isn't saying much, however,  
> about which platforms, besides Red Hat Linux, this gcj would be  
> producing stable executables for. For example, gcj on Windows is  
> very far behind and is getting very little development time these  
> days.
>
> Andi..




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Tue, 11 Jul 2006, DM Smith wrote:

> Eclipse has a built in compiler called ecj and it can compile Java 1.6 code 
> today. However, unless classes are provided at runtime for linking, one will 
> get build errors.

It looks like ecj is going to replace the gcj java front-end compiler thereby 
making the 1.5 language features available to gcj. In the meantime, the 
classpath project is working towards adding support for all JRE classes. I'm 
quite optimistic that we should see a 1.5 capable gcj this year. This isn't 
saying much, however, about which platforms, besides Red Hat Linux, this gcj 
would be producing stable executables for. For example, gcj on Windows is very 
far behind and is getting very little development time these days.

Andi..




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by DM Smith <dm...@gmail.com>.
On Jul 11, 2006, at 12:17 AM, Daniel John Debrunner wrote:

> Doug Cutting wrote:
>
>> Since GCJ is effectively available on all platforms, we could say  
>> that
>> we will start accepting 1.5 features when a GCJ release supports  
>> those
>> features.  Does that seem reasonable?
>
> Seems potentially a little strange to me. Does this mean Lucene  
> would be
> limited to the set of 1.5 features actually implemented by GCJ? So if
> there is a 1.5 feature that is not supported by GCJ (while others are)
> it cannot be used?
>
> Seems more natural to support the complete 1.5 as defined by Sun/Java,
> not the subset implemented by one open source compiler.
>

Eclipse has a built in compiler called ecj and it can compile Java  
1.6 code today. However, unless classes are provided at runtime for  
linking, one will get build errors.

The same is true with gcj. It still does not fully support the Java 1.4  
classes (almost there...), though it supports all language features.  
However, on Fedora, Eclipse is built with ecj and to me this  
demonstrates that it is close enough for most use cases.

Gcj will have support for the language features before it supports  
all the new classes.

In terms of Lucene, I believe that the most important classes that  
are wanted are the concurrency ones. (At least that is how I have  
read the posts here.)

I think the measure of readiness is not that it compiles today with  
gcj, but that the Java 1.5 classes and features that are likely to be  
used by lucene are implemented and pass all lucene tests.
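
To make the distinction between 1.5 language features and 1.5 library classes concrete, here is a small illustrative sketch. The class and method names (`Java5FeatureSketch`, `sum`, `hasClass`) are made up for this example and are not part of Lucene or GCJ. The first method needs only compiler support (generics, varargs, enhanced for loop, autoboxing); the second probes the runtime class library for a class such as `java.util.concurrent.ConcurrentHashMap`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: 1.5 language features versus 1.5 library classes. */
public class Java5FeatureSketch {

    /**
     * Uses only 1.5 language features (varargs, generics, enhanced for,
     * autoboxing). The classes involved already existed in 1.4.
     */
    static int sum(Integer... values) {
        List<Integer> list = new ArrayList<Integer>();
        for (Integer v : values) {
            list.add(v);
        }
        int total = 0;
        for (int v : list) { // auto-unboxing of each Integer
            total += v;
        }
        return total;
    }

    /** Checks whether a named class is present in the runtime's class library. */
    static boolean hasClass(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sum = " + sum(1, 2, 3));
        System.out.println("java.util.concurrent present: "
                + hasClass("java.util.concurrent.ConcurrentHashMap"));
    }
}
```

A compiler that handles the first method may still be paired with a class library on which the second check fails, which is why passing the test suite on the target runtime is a stronger measure of readiness than compiling alone.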





Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Daniel John Debrunner <dj...@apache.org>.
Doug Cutting wrote:

> Since GCJ is effectively available on all platforms, we could say that
> we will start accepting 1.5 features when a GCJ release supports those
> features.  Does that seem reasonable?

Seems potentially a little strange to me. Does this mean Lucene would be
limited to the set of 1.5 features actually implemented by GCJ? So if
there is a 1.5 feature that is not supported by GCJ (while others are)
it cannot be used?

Seems more natural to support the complete 1.5 as defined by Sun/Java,
not the subset implemented by one open source compiler.

Dan.





Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Tue, 11 Jul 2006, Doug Cutting wrote:

> Probably this would get fixed more quickly if someone contributed a patch to 
> JavaCC.  Even if it were not committed, we could build our own version of 
> JavaCC.  Any intrepid volunteers?

For patches that seem too kludgy to make it into Lucene's sources (for 
example, to work around the lack of proper exception support under Windows 
gcj in the query parser) a compromise could be to keep these patches in a 
separate file and apply them to the Lucene sources before building them with 
gcj. This is how PyLucene is built today.

Some patches have already been incorporated into the Lucene sources (for 
example, in Searcher.java, to work around gcj bug 15411).

Of course, the long term goal should be to no longer have any patches at all.
I've been working on PyLucene for about two and a half years now and the number of 
patches has remained fairly stable.

A nice side effect of trying to support gcj with Java Lucene by including it 
in the Lucene test framework could be that the gcj developers might be more 
inclined to take a look at gcj-related issues that are thus made much easier 
to reproduce.

Andi..



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Doug Cutting <cu...@apache.org>.
Andi Vajda wrote:
> Just last week, a PyLucene user got it to work on Solaris. I have no 
> access to a Solaris machine to validate this. If I had my choice of 
> platform, I'd pick one of (in order of preference):
>   - Mac OS X (Intel or PPC)
>   - a recent Red Hat Linux since this is the one most gcj developers use
>   - Ubuntu 6.06

The Apache machine where we run nightly builds runs Solaris.

My first platform of choice would be Ubuntu.

> Unless junit can be made to run compiled under gcj, I see some more work 
> on the unit tests side. This could be interesting too...

A search for "gcj junit" finds:

http://www.mail-archive.com/user@ant.apache.org/msg19104.html

> Yes, I filed bug 53 almost two years ago, it's not gone very far :(
>     https://javacc.dev.java.net/issues/show_bug.cgi?id=53

Probably this would get fixed more quickly if someone contributed a 
patch to JavaCC.  Even if it were not committed, we could build our own 
version of JavaCC.  Any intrepid volunteers?

Doug



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Tue, 11 Jul 2006, Doug Cutting wrote:

> Andi Vajda wrote:
>> I'd be interested in doing this but what is it that we're after in 
>> 'supporting gcj' actually ?
>
> I think it would be sufficient to:
>
> 1. Compile only .jar and .class with gcj (not .java).
> 2. Pass all unit tests on a single platform.

Just last week, a PyLucene user got it to work on Solaris. I have no access to 
a Solaris machine to validate this. If I had my choice of platform, I'd pick 
one of (in order of preference):
   - Mac OS X (Intel or PPC)
   - a recent Red Hat Linux since this is the one most gcj developers use
   - Ubuntu 6.06

As for the version of gcj I'd suggest using:
   - Mac OS X Intel : gcj 4.0.2 (heavily patched)
   - Mac OS X PPC : gcj 3.4.6
   - Red Hat Linux : I'd try 4.2.0, downgrading until I find one that works,
     probably 4.1.1
   - Ubuntu 6.06: gcj 3.4.6

Unless junit can be made to run compiled under gcj, I see some more work on 
the unit tests side. This could be interesting too...

>> Even when only compiling .jar -> .so with gcj, a number of patches still 
>> need to be applied:
>>         http://svn.osafoundation.org/pylucene/trunk/patches.lucene
>
> The patches to JavaCC-generated code should probably really become JavaCC 
> patches.  Have you looked into that?

Yes, I filed bug 53 almost two years ago, it's not gone very far :(
     https://javacc.dev.java.net/issues/show_bug.cgi?id=53

> Most of the rest look like reasonable 
> changes to Lucene, except perhaps the "native matches", which looks a bit 
> fishy for Lucene's trunk.

The native match patches are required because the libgcj that comes with gcj 
3.4.x doesn't provide a regular expressions implementation. This is solved in 
PyLucene by using python's. I think gcj 4 comes with regex support but gcj 4 
is not yet well supported on most platforms.

For the gcj platform story, see this pylucene-dev post I sent recently:
   http://lists.osafoundation.org/pipermail/pylucene-dev/2006-June/001106.html

Andi..



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by DM Smith <dm...@gmail.com>.
On Jul 11, 2006, at 3:51 AM, Doug Cutting wrote:

> Andi Vajda wrote:
>> I'd be interested in doing this but what is it that we're after in  
>> 'supporting gcj' actually ?
>
> I think it would be sufficient to:
>
> 1. Compile only .jar and .class with gcj (not .java).
> 2. Pass all unit tests on a single platform.
>
> This would provide an existence proof that Lucene can run under  
> GCJ, and doesn't require solving GCJ's porting issues.
>

For me the platform of choice would be MacOS X, since 10.3 will never  
have Java 5. (IIRC, 10.4 has only been out for about a year.)
Most of the other platforms will.



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Doug Cutting <cu...@apache.org>.
Andi Vajda wrote:
> I'd be interested in doing this but what is it that we're after in 
> 'supporting gcj' actually ?

I think it would be sufficient to:

1. Compile only .jar and .class with gcj (not .java).
2. Pass all unit tests on a single platform.

This would provide an existence proof that Lucene can run under GCJ, and 
doesn't require solving GCJ's porting issues.

> Even when only compiling .jar -> .so with gcj, a number of patches 
> still need to be applied:
>         http://svn.osafoundation.org/pylucene/trunk/patches.lucene

The patches to JavaCC-generated code should probably really become 
JavaCC patches.  Have you looked into that?  Most of the rest look like 
reasonable changes to Lucene, except perhaps the "native matches", which 
looks a bit fishy for Lucene's trunk.

Doug



Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Vic Bancroft <ba...@america.net>.
Andi Vajda wrote:

> On Mon, 10 Jul 2006, Doug Cutting wrote:
>
>> Andi Vajda wrote:
>>
>>> On Sat, 8 Jul 2006, Doug Cutting wrote:
>>>
>>>> Since GCJ is effectively available on all platforms, we could say 
>>>> that we will start accepting 1.5 features when a GCJ release 
>>>> supports those features. Does that seem reasonable?
>>>
>>> +1
>>
>> If we use this criterion, then we should probably officially support 
>> GCJ. Ideally we should run nightly unit tests with GCJ. Andi, would 
>> you be interested in helping to set this up?
>
This is interesting to me, is the nightly build environment difficult to 
replicate ?

> I'd be interested in doing this but what is it that we're after in 
> 'supporting gcj' actually ?

There is some advantage in using gcj as a measure of usability in the 
context of a "free (as in beer)" java, such that for a given target 
platform, one can deliver executables and shared libraries without 
requiring virtual machine runtimes. The second advantage is to give a 
simple method to nightly test contributions using new features. The 
third advantage seems to be a reduction in computational load on servers 
running native code.

> - running a fully compiled program linked against a lucene.so ?
> if so, which platforms ? the gcj story is very different on each and 
> every
> platform, including different linuxes and gcj is not well supported on
> some platforms at all.

This seems to be the case, since on an updated fedora core 5 with gcj 
(GCC) 4.1.1 20060525 (Red Hat 4.1.1-1), the Makefile modifications 
required are trivial.

> - running java bytecode with the gcj VM (gij, I believe) ?
> if the .java code needs to be compiled with gcj then a number of patches
> still need to be applied against the Java lucene sources.
> PyLucene is built by compiling .java -> .jar using a regular JDK (Apple's
> or Blackdown) and using gcj to compile from .jar -> .so thereby working
> around all the gcj java front-end bugs
> Even when only compiling .jar -> .so with gcj, a number of patches still
> need to be applied:
> http://svn.osafoundation.org/pylucene/trunk/patches.lucene

The last time I checked for src/gcj/Makefile (revision 420696), all that 
was required was to fix the name of the lucene archive file to match 
what is actually generated, e.g., $(BUILD)/lucene-core-[0-9].*.jar and 
add the FieldCache* to the names to skip . . .

I have not contributed to lucene yet: is it required to generate a 
'patch' to add to jira, or is the following output from a simple `svn 
diff` sufficient for experimentation ?

    Index: src/gcj/Makefile
    ===================================================================
    --- src/gcj/Makefile (revision 420696)
    +++ src/gcj/Makefile (working copy)
    @@ -8,7 +8,7 @@
    CORE=$(BUILD)/classes/java
    SRC=.

    -CORE_OBJ:=$(subst .jar,.a,$(wildcard $(BUILD)/lucene-[0-9]*.jar))
    +CORE_OBJ:=$(subst .jar,.a,$(wildcard $(BUILD)/lucene-core-[0-9]*.jar))
    CORE_JAVA:=$(shell find $(ROOT)/src/java -name '*.java')

    CORE_HEADERS=\
    @@ -55,7 +55,7 @@
    # yet accept from .class files.
    # NOTE: Change when http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15501 is fixed.
    $(CORE_OBJ) : $(CORE_JAVA)
    - $(GCJ) $(GCJFLAGS) -c -I $(CORE) -o $@ `find $(ROOT)/src/java -name '*.java' -not -name '*Sort*' -not -name 'Span*'` `find $(CORE) -name '*.class' -name '*Sort*' -or -name 'Span*'`
    + $(GCJ) $(GCJFLAGS) -c -I $(CORE) -o $@ `find $(ROOT)/src/java -name '*.java' -not -name '*Sort*' -not -name 'Span*' -not -name 'FieldCache*'` `find $(CORE) -name '*.class' -name '*Sort*' -or -name 'Span*' -or -name 'FieldCache*'`

    # generate object code from jar files using gcj
    %.a : %.jar

more,
l8r,
v

-- 
"The future is here. It's just not evenly distributed yet."
 -- William Gibson, quoted by Whitfield Diffie




Re: Java 1.5 (was: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Mon, 10 Jul 2006, Doug Cutting wrote:

> Andi Vajda wrote:
>> On Sat, 8 Jul 2006, Doug Cutting wrote:
>>> Since GCJ is effectively available on all platforms, we could say that we 
>>> will start accepting 1.5 features when a GCJ release supports those 
>>> features. Does that seem reasonable?
>> 
>> +1
>
> If we use this criterion, then we should probably officially support GCJ. 
> Ideally we should run nightly unit tests with GCJ.  Andi, would you be 
> interested in helping to set this up?

I'd be interested in doing this but what is it that we're after in 'supporting 
gcj' actually ?

   - running a fully compiled program linked against a lucene.so ?
     if so, which platforms ? the gcj story is very different on each and every
     platform, including different linuxes and gcj is not well supported on
     some platforms at all.

   - running java bytecode with the gcj VM (gij, I believe) ?
     if the .java code needs to be compiled with gcj then a number of patches
     still need to be applied against the Java lucene sources.
     PyLucene is built by compiling .java -> .jar using a regular JDK (Apple's
     or Blackdown) and using gcj to compile from .jar -> .so, thereby working
     around all the gcj java front-end bugs.
     Even when only compiling .jar -> .so with gcj, a number of patches still
     need to be applied:
         http://svn.osafoundation.org/pylucene/trunk/patches.lucene

Andi..



Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Doug Cutting <cu...@apache.org>.
Andi Vajda wrote:
> On Sat, 8 Jul 2006, Doug Cutting wrote:
>> Since GCJ is effectively available on all platforms, we could say that 
>> we will start accepting 1.5 features when a GCJ release supports those 
>> features. Does that seem reasonable?
> 
> +1

If we use this criterion, then we should probably officially support GCJ. 
Ideally we should run nightly unit tests with GCJ.  Andi, would you be 
interested in helping to set this up?

Our unit test scripts are at:

https://svn.apache.org/repos/asf/lucene/java/nightly/

These are run on lucene.zones.apache.org, a Solaris box.  If you (or 
someone else) are willing, then I can make you an account on this machine 
and you can alter the nightly build process to include testing against 
the most recent GCJ release.

Doug



Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Andi Vajda <va...@osafoundation.org>.
On Sat, 8 Jul 2006, Doug Cutting wrote:

> Since GCJ is effectively available on all platforms, we could say that we 
> will start accepting 1.5 features when a GCJ release supports those features. 
> Does that seem reasonable?

+1

Andi..



Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by DM Smith <dm...@gmail.com>.
On Jul 8, 2006, at 12:41 PM, Doug Cutting wrote:

>
> Since GCJ is effectively available on all platforms, we could say  
> that we will start accepting 1.5 features when a GCJ release  
> supports those features.  Does that seem reasonable?

I have been doing a bit of reading on GCJ compatibility. I think it  
is going to come in 2 parts:
1) It supports all the new language features of Java 1.5.
2) It has an implementation of all the new classes and methods that  
Lucene uses.

For me the test is that it is released for MacOSX.

With these three things, I'd be happy :)

DM Smith, stick in the mud :)



Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Doug Cutting <cu...@apache.org>.
Chuck Williams wrote:
> I doubt any single contribution will change anyone's mind.  I would like
> to have clarity on the 1.5 decision before deciding whether or not to
> contribute this and other things.  My ParallelWriter contribution, which
> also requires 1.5, is already sitting in jira.

Sitting in Jira is better than not sitting in Jira, no?

> I only work in 1.5 and use its features extensively.  I don't think
> about 1.4 at all, and so have no idea how heavily dependent the code in
> question is on 1.5.
> 
> Unfortunately, I won't be able to contribute anything substantial to
> Lucene so long as it has a 1.4 requirement.

The 1.5 decision requires a consensus.  You're making ultimatums, which
does not help to build consensus.  By stating an inflexible position
you've become a fact that informs the process.

I think we should try to minimize the number of inconvenienced people.
Both developers and users are people.  Some developers are happy to
continue in 1.4, adding new features that users who are confined to 1.4
JVMs will be able to use.  Other developers will only contribute 1.5
code, perhaps (unless we find a technical workaround) excluding users
confined to 1.4 JVMs.  But it is difficult to compare the inconvenience
of a developer who refuses to code back-compatibly to a user who is 
deprived of new features.

Since GCJ is effectively available on all platforms, we could say that 
we will start accepting 1.5 features when a GCJ release supports those 
features.  Does that seem reasonable?

Doug





Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by Chuck Williams <ch...@manawiz.com>.
DM Smith wrote on 07/07/2006 07:07 PM:
> Otis,
>     First let me say, I don't want to rehash the arguments for or
> against Java 1.5.

This is an emotional issue for people on both sides.

>     However, I think you have identified that the core people need to
> make a decision and the rest of us need to go with it.

It would be most helpful to have clarity on this issue.

> On Jul 7, 2006, at 1:17 PM, Otis Gospodnetic wrote:
>
>> Hi Chuck,
>>
>> I think bulk update would be good (although I'm not sure how it would
>> be different from batching deletes and adds, but I'm sure there is a
>> difference, or else you wouldn't have done it).

Bulk update works by rewriting all segments that contain a document to
be modified in a single linear pass.  This is orders of magnitude faster
than delete/add if the set of documents to be updated is large,
especially if only a few small fields are mutable on Documents that have
many possibly large immutable fields.  E.g., on a somewhat slow
development machine I updated several fields on 1,000,000 large
documents in 43 seconds.

There is an existing patch in jira that takes this same approach
(LUCENE-382).  However the limitations in that patch are substantial: 
only optimized indexes, stored fields are not updated, updates are
independent of the existing field value, etc.  These limitations make
that implementation not suitable for many use cases.

My implementation eliminates all of those limitations, providing a fast
flexible solution for applying an arbitrary value transformation to
selected documents and fields in the index (doc.field.new_value = f(doc,
field.old_value, doc.other_field_values) for arbitrary f).  It also
works with ParallelReader (and the ParallelWriter I've already
contributed).  This allows the mutable fields to be segregated into a
separate subindex.  Only that subindex need be updated.  This alone is
an enormous advantage over a large number of delete/add's where the same
optimization is not possible due to the doc-id synchronization
requirements of ParallelReader.
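
As a rough illustration of the model described above, here is a minimal,
self-contained Java sketch.  It is not the actual patch (which is not shown
in this thread) and it uses no Lucene classes: BulkUpdateSketch,
FieldTransform and bulkUpdate are hypothetical names, segments are modelled
as plain in-memory lists, and a document is just a map from field name to
value.  It only shows the shape of the idea: segments that contain no
selected document are reused untouched, segments that do are rewritten in
one pass, and doc ids (positions) never change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.Set;

// Hypothetical illustration only; none of these types are Lucene classes.
final class BulkUpdateSketch {

    /** The transformation f: computes a field's new value from the document and its old value. */
    interface FieldTransform {
        String apply(Map<String, String> doc, String field, String oldValue);
    }

    /**
     * Returns a new list of segments in which every selected document has had
     * f applied to each of its mutable fields. Segments holding no selected
     * document are reused as-is; the others are rewritten in one pass.
     * Document order, and therefore the position-based doc id, is preserved.
     */
    static List<List<Map<String, String>>> bulkUpdate(
            List<List<Map<String, String>>> segments,
            NavigableSet<Integer> selectedDocIds,
            Set<String> mutableFields,
            FieldTransform f) {
        List<List<Map<String, String>>> result = new ArrayList<>();
        int docId = 0; // doc id = position in the concatenation of all segments
        for (List<Map<String, String>> segment : segments) {
            // Does any selected doc id fall inside this segment's id range?
            Integer next = selectedDocIds.ceiling(docId);
            boolean touched = next != null && next < docId + segment.size();
            if (!touched) {
                result.add(segment); // untouched segment: no rewrite needed
                docId += segment.size();
                continue;
            }
            List<Map<String, String>> rewritten = new ArrayList<>(segment.size());
            for (Map<String, String> doc : segment) {
                if (selectedDocIds.contains(docId)) {
                    // Copy the document and replace only the mutable fields.
                    Map<String, String> copy = new LinkedHashMap<>(doc);
                    for (String field : mutableFields) {
                        if (doc.containsKey(field)) {
                            copy.put(field, f.apply(doc, field, doc.get(field)));
                        }
                    }
                    rewritten.add(copy);
                } else {
                    rewritten.add(doc); // unselected doc is carried over unchanged
                }
                docId++;
            }
            result.add(rewritten);
        }
        return result;
    }
}
```

A caller would pass the transformation as a lambda, for example
(doc, field, old) -> String.valueOf(Integer.parseInt(old) * 2) to double a
numeric field.  In the ParallelReader arrangement described above, the
segments passed in would be those of the mutable subindex only; because
positions are preserved, its doc ids stay aligned with the untouched
subindex.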

There is a substantial amount of code required to do this, and it is
completely dependent on the index representation.  To simplify merge
issues with ongoing Lucene changes, I had to copy and edit certain
private methods out of the existing index code (and make extensive use
of the package-only APIs).  Beyond the normal benefits of open sourcing
code, my interest in contributing this is to see the index code
refactored to take bulk update into account.  This is increased by the
current focus on a new flexible index representation.  I would like to
see bulk update as one of the operations supported in the new
representation.

>> So I think you should contribute your code.  This will give us a real
>> example of having something possibly valuable, and written with 1.5
>> features, so we can finalize 1.4 vs. 1.5 discussion, probably with a
>> vote on lucene-dev.

I doubt any single contribution will change anyone's mind.  I would like
to have clarity on the 1.5 decision before deciding whether or not to
contribute this and other things.  My ParallelWriter contribution, which
also requires 1.5, is already sitting in jira.

I only work in 1.5 and use its features extensively.  I don't think
about 1.4 at all, and so have no idea how heavily dependent the code in
question is on 1.5.

Unfortunately, I won't be able to contribute anything substantial to
Lucene so long as it has a 1.4 requirement.

Chuck




Re: Java 1.5 (was Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided))

Posted by DM Smith <dm...@gmail.com>.
Otis,
	First let me say, I don't want to rehash the arguments for or  
against Java 1.5. We can all go back and read the last two major  
threads on the issue. I don't think there is anything new to say.

	However, I think statements like:
		"no strong arguments" (I think the arguments were reasonable)
		"only a few people argued for it" (Only a few argued against it)
		"very little interest" (Very few votes are on any Jira issue, so  
what does that say)
		"adversaries" (I am not an adversary, I am a very interested party  
with a personal interest in the outcome)
	are inflammatory.

	I am willing to do the back port if it is possible and if it does  
not do violence to the implementation.

	There are a number of patches sitting in Jira and it is not clear to  
me which are even close to being applied. I am not interested in  
doing work on patches that are old or might sit around for a while  
until they are applied (and therefore become out of sync).

	If the patches are identified as being worthy of being applied and  
are also identified as being Java 1.5, I will port them and their  
tests if it makes sense.

	It has already been granted that contrib allow Java 1.5. So I  
presume that the build has been updated to allow for 1.5 in contrib  
and not in core. If this is not the case I think that the first  
committer (or submitter) of Java 1.5 code to contrib has the  
responsibility to change the build system (or at least ensure that it  
is done.)

	As to the build system, I am not the right person to see that it  
works. I am using Eclipse to do the builds. I maintain 2 workspaces,  
one with core only and that is Java 1.4.2 and the other is core and  
contrib and that is Java 1.5. I have done this so I can help "back  
port" to Java 1.4.

	However, I think you have identified that the core people need to  
make a decision and the rest of us need to go with it. So, I suggest  
that Doug convene such a meeting of the minds and communicate the  
decision to the rest of us.

DM



On Jul 7, 2006, at 1:17 PM, Otis Gospodnetic wrote:

> Hi Chuck,
>
> I think bulk update would be good (although I'm not sure how it  
> would be different from batching deletes and adds, but I'm sure  
> there is a difference, or else you wouldn't have done it).
> Java 1.5 - no conclusion, but personally I felt:
> - no strong arguments for 1.4, only a few people argued for it
> - very little interest from 1.4 adversaries in helping with  
> backporting to 1.4 or updating the build system to do the retro  
> thing with 1.5 code
>
> So I think you should contribute your code.  This will give us a  
> real example of having something possibly valuable, and written  
> with 1.5 features, so we can finalize 1.4 vs. 1.5 discussion,  
> probably with a vote on lucene-dev.
>
> Otis
>
> ----- Original Message ----
> From: Chuck Williams <ch...@manawiz.com>
> To: java-dev@lucene.apache.org
> Sent: Thursday, July 6, 2006 5:07:41 PM
> Subject: Re: [jira] Commented: (LUCENE-565) Supporting  
> deleteDocuments in IndexWriter (Code and Performance Results Provided)
>
> robert engels wrote on 07/06/2006 12:24 PM:
>> I guess we just chose a much simpler way to do this...
>>
>> Even with your code changes, to see the modification made using the
>> IndexWriter, it must be closed, and a new IndexReader opened.
>>
>> So a far simpler way is to get the collection of updates first, then
>>
>> using opened indexreader,
>> for each doc in collection
>>       delete document using "key"
>> endfor
>>
>> open indexwriter
>> for each doc in collection
>>       add document
>> endfor
>>
>> open indexreader
>>
>>
>> I don't see how your way is any faster. You must always flush to disk
>> and open the indexreader to see the changes.
>
> ....
>
> Bulk updates however require yet another approach.  Sorry to change
> topics here, but I'm wondering if there was a final decision on the
> question of java 1.5 in the core.  If I submitted a bulk update
> capability that required java 1.5, would it be eligible for  
> inclusion in
> the core or not?
>
> Chuck

