Posted to dev@lucene.apache.org by "Chuck Williams (JIRA)" <ji...@apache.org> on 2006/06/12 22:47:29 UTC

[jira] Created: (LUCENE-600) ParallelWriter companion to ParallelReader

ParallelWriter companion to ParallelReader
------------------------------------------

         Key: LUCENE-600
         URL: http://issues.apache.org/jira/browse/LUCENE-600
     Project: Lucene - Java
        Type: Improvement

  Components: Index  
    Versions: 2.1    
    Reporter: Chuck Williams


A new class ParallelWriter is provided that serves as a companion to ParallelReader.  ParallelWriter meets all of the doc-id synchronization requirements of ParallelReader, subject to:
    1.  ParallelWriter.addDocument() is synchronized, which might have an adverse effect on performance.  The writes to the sub-indexes are, however, done in parallel.
    2.  The application must ensure that the ParallelReader is never reopened inside ParallelWriter.addDocument(), else it might find the sub-indexes out of sync.
    3.  The application must deal with recovery from ParallelWriter.addDocument() exceptions.  Recovery must restore the synchronization of doc-ids, e.g. by deleting any trailing document(s) in one sub-index that were not successfully added to all sub-indexes, and then optimizing all sub-indexes.

A new interface, Writable, is provided to abstract IndexWriter and ParallelWriter.  This is in the same spirit as the existing Searchable and Fieldable interfaces.

This implementation uses Java 1.5.  The patch applies against today's svn head.  All tests pass, including the new TestParallelWriter.
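
The behaviour described in point 1 can be pictured with a minimal sketch. This is not the code from the attached patch: the class ParallelWriterSketch, its nested SubIndex (an in-memory stand-in for a real sub-index) and the use of an ExecutorService are all assumptions made for illustration, and the sketch uses lambdas, so it needs a newer Java than 1.5. It shows a synchronized addDocument() that hands each part of a logical document to its sub-index concurrently and waits for all of them, so that every sub-index assigns the same doc-id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only; names and structure are assumptions, not the attached patch. */
final class ParallelWriterSketch {

    /** Hypothetical in-memory stand-in for one sub-index; not a Lucene class. */
    static final class SubIndex {
        private final List<String> docs = new ArrayList<>();

        synchronized void add(String fields) { docs.add(fields); }
        synchronized int docCount() { return docs.size(); }
        synchronized String get(int docId) { return docs.get(docId); }
    }

    private final List<SubIndex> subIndexes;
    private final ExecutorService pool;

    ParallelWriterSketch(List<SubIndex> subIndexes) {
        this.subIndexes = subIndexes;
        this.pool = Executors.newFixedThreadPool(subIndexes.size());
    }

    /**
     * Adds one logical document; parts.get(i) holds the fields for sub-index i.
     * The method is synchronized so only one logical document is in flight,
     * while the per-sub-index writes themselves run concurrently.
     */
    synchronized int addDocument(List<String> parts) {
        if (parts.size() != subIndexes.size()) {
            throw new IllegalArgumentException("need one part per sub-index");
        }
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < subIndexes.size(); i++) {
            SubIndex sub = subIndexes.get(i);
            String part = parts.get(i);
            pending.add(pool.submit(() -> sub.add(part)));
        }
        try {
            for (Future<?> f : pending) {
                f.get(); // wait until every sub-index has the document
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted; sub-indexes may be out of sync", e);
        } catch (ExecutionException e) {
            // Point 3 above: the caller has to restore doc-id synchronization.
            throw new IllegalStateException("a sub-index write failed", e.getCause());
        }
        return subIndexes.get(0).docCount() - 1; // doc-id shared by all sub-indexes
    }

    void close() { pool.shutdown(); }

    public static void main(String[] args) {
        List<SubIndex> subs = Arrays.asList(new SubIndex(), new SubIndex());
        ParallelWriterSketch writer = new ParallelWriterSketch(subs);
        writer.addDocument(Arrays.asList("title:Lucene", "body:search library"));
        int id = writer.addDocument(Arrays.asList("title:JIRA", "body:issue tracker"));
        writer.close();
        System.out.println(id + ": " + subs.get(0).get(id) + " | " + subs.get(1).get(id));
    }
}
```

The sketch leaves recovery to the caller, as the issue text does: after a failure the sub-indexes may hold different document counts until the application trims and resynchronizes them.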



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Closed: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch closed LUCENE-600.
--------------------------------

    Resolution: Won't Fix

This solution has the severe drawback that it works only if the IndexWriter is used in flush-by-docCount mode, whereas Lucene's default behavior is now flush-by-size.

See LUCENE-1879 for a more generic approach.
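
One way to picture the concern is a toy simulation. It is a simplified model, not Lucene's actual flush accounting: the class FlushAlignmentSketch, its two methods and the byte figures are hypothetical. Flushing after a fixed number of documents gives every sub-index the same segment boundaries, while flushing each sub-index independently on a size threshold generally does not, because the same logical documents occupy very different amounts of space in each sub-index.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of segment boundaries; not Lucene's real flush logic. */
final class FlushAlignmentSketch {

    /** Docs per segment when a flush happens every maxDocs documents. */
    static List<Integer> flushByDocCount(int[] docBytes, int maxDocs) {
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < docBytes.length; i++) {
            buffered++;
            if (buffered == maxDocs) {
                segments.add(buffered);
                buffered = 0;
            }
        }
        if (buffered > 0) segments.add(buffered);
        return segments;
    }

    /** Docs per segment when a flush happens once maxBytes are buffered. */
    static List<Integer> flushBySize(int[] docBytes, int maxBytes) {
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        int bytes = 0;
        for (int size : docBytes) {
            buffered++;
            bytes += size;
            if (bytes >= maxBytes) {
                segments.add(buffered);
                buffered = 0;
                bytes = 0;
            }
        }
        if (buffered > 0) segments.add(buffered);
        return segments;
    }

    public static void main(String[] args) {
        // Per-document sizes of the same six logical documents in two sub-indexes.
        int[] bodyBytes = {600, 600, 300, 300, 600, 100};
        int[] titleBytes = {10, 10, 10, 10, 10, 10};

        // Same boundaries in both sub-indexes.
        System.out.println(flushByDocCount(bodyBytes, 2) + " " + flushByDocCount(titleBytes, 2));
        // Different boundaries when each sub-index flushes on its own size threshold.
        System.out.println(flushBySize(bodyBytes, 1000) + " " + flushBySize(titleBytes, 1000));
    }
}
```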

> ParallelWriter companion to ParallelReader
> ------------------------------------------
>
>                 Key: LUCENE-600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-600
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.1
>            Reporter: Chuck Williams
>            Priority: Minor
>         Attachments: ParallelWriter.patch



[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749641#action_12749641 ] 

Michael Busch commented on LUCENE-600:
--------------------------------------

{quote}
I contributed the first patch to make flush-by-size possible; see LUCENE-709. There is no incompatibility with ParallelWriter, even in the early version contributed here 3 years ago.
{quote}

I should have been more precise: flush and *merge* by size. Also, the patch provided here doesn't allow deleting by term or query unless the field(s) that the terms or queries are searched on are contained in all parallel indexes, right? And with this approach, what happens if you commit one IndexWriter successfully, but a parallel one fails during commit and needs to be rolled back? How are these consistency issues handled?



[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749678#action_12749678 ] 

Chuck Williams commented on LUCENE-600:
---------------------------------------

A given logical Document must have the same doc-id in each subindex, which is maintained by using a merge policy that guarantees consistency across the subindexes, either merge-by-count or merge-by-size as dictated by the size-dominant subindex.
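
As a rough illustration of merging "as dictated by the size-dominant subindex", the sketch below lets one sub-index choose which adjacent segments to merge and then applies the same segment range to every sub-index. It is neither the MasterMergePolicy from the wiki page nor the implementation discussed in this thread; the class MasterMergeSketch, its methods and the selection rule (merge the first two adjacent segments at or under a byte threshold) are assumptions chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; the selection rule is a made-up stand-in for a real merge policy. */
final class MasterMergeSketch {

    /**
     * Looks at the size-dominant sub-index only and returns {from, toExclusive}
     * for the first two adjacent segments that are both at most smallBytes,
     * or null when there is nothing to merge.
     */
    static int[] pickMerge(long[] masterSegmentBytes, long smallBytes) {
        for (int i = 0; i + 1 < masterSegmentBytes.length; i++) {
            if (masterSegmentBytes[i] <= smallBytes && masterSegmentBytes[i + 1] <= smallBytes) {
                return new int[] {i, i + 2};
            }
        }
        return null;
    }

    /** Applies the chosen range to one sub-index, given its docs-per-segment list. */
    static List<Integer> applyMerge(List<Integer> segmentDocCounts, int from, int toExclusive) {
        List<Integer> merged = new ArrayList<>(segmentDocCounts.subList(0, from));
        int docs = 0;
        for (int i = from; i < toExclusive; i++) {
            docs += segmentDocCounts.get(i);
        }
        merged.add(docs);
        merged.addAll(segmentDocCounts.subList(toExclusive, segmentDocCounts.size()));
        return merged;
    }

    public static void main(String[] args) {
        long[] bodySegmentBytes = {5000, 300, 200, 4000};        // size-dominant sub-index
        List<Integer> bodyDocs = Arrays.asList(100, 10, 10, 80);  // docs per segment
        List<Integer> titleDocs = Arrays.asList(100, 10, 10, 80); // same boundaries

        int[] range = pickMerge(bodySegmentBytes, 500);
        // Every sub-index merges the same segment range, so boundaries stay aligned.
        System.out.println(applyMerge(bodyDocs, range[0], range[1]));
        System.out.println(applyMerge(titleDocs, range[0], range[1]));
    }
}
```

The point of the design is that the merge decision is made once and replayed everywhere, rather than being taken independently per sub-index.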

I just read your wiki page and it looks like your MasterMergePolicy is the same for the merge-by-size case, right?

We've been using parallel incremental indexing in production apps now for a long time, along with the efficient update mechanism described in the patent app.

The original company I did this work for was acquired by a larger company that now owns the IP.  I don't know how they would feel about a contribution of the latest version of ParallelWriter, which works with the current Lucene.  I could inquire if you are truly open to it, but it sounds like you may be on your own path to a quite similar thing.

Your wiki page says, "when you need to reindex this field you can simply create a new generation of this parallel index and fill it with the new values".  That is the rub of the problem, and the one we created an efficient algorithm and implementation for several years ago.  ParallelWriter is the easy part.




[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749656#action_12749656 ] 

Chuck Williams commented on LUCENE-600:
---------------------------------------

The version attached here is from over 3 years ago.  Our version has evolved along with Lucene and the whole apparatus is fully functional with the latest Lucene.

The fields in each subindex are disjoint.  A logical Document is the collection of all fields from each real Document in each real subindex with the same doc-id (i.e., the model Doug started with ParallelReader).  There is no issue with deletion by query or term as it deletes the whole logical Document.  Field updates in our scheme don't use deletion.

Merge-by-size is only an issue if you allow it to be decided independently in each subindex.  In practice that is not very important since one subindex is size-dominant (the one containing the document body field).  One can merge-by-size that subindex and force the others to merge consistently.

The only reason for the corresponding-segment constraint is that deletion changes doc-ids by purging deleted documents.  I know some Lucene apps address this by never purging deleted documents, which is ok in some domains where deletion is rare.  I think there are other ways to resolve it as well.
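
The doc-id shift mentioned above can be shown with plain lists standing in for sub-indexes. The class DocIdShiftSketch and its methods are hypothetical, and real purging happens during segment merges rather than on a flat list, but the renumbering effect is the same: if deleted documents are purged in one sub-index and not in another, equal doc-ids no longer refer to the same logical Document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Toy model: each list is one sub-index, and the list position is the doc-id. */
final class DocIdShiftSketch {

    /** Drops deleted docs; the survivors are renumbered densely from 0. */
    static List<String> purge(List<String> docs, Set<Integer> deletedIds) {
        List<String> kept = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (!deletedIds.contains(id)) {
                kept.add(docs.get(id));
            }
        }
        return kept;
    }

    /** A logical Document is the union of the fields stored under one doc-id. */
    static String logicalDoc(List<String> titleIndex, List<String> bodyIndex, int docId) {
        return titleIndex.get(docId) + " + " + bodyIndex.get(docId);
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("title0", "title1", "title2", "title3");
        List<String> bodies = Arrays.asList("body0", "body1", "body2", "body3");
        Set<Integer> deleted = Collections.singleton(1);

        // Purging only one sub-index pairs fields from different documents.
        List<String> purgedTitles = purge(titles, deleted);
        System.out.println(logicalDoc(purgedTitles, bodies, 1)); // title2 + body1

        // Purging the same documents everywhere keeps the pairing intact.
        System.out.println(logicalDoc(purgedTitles, purge(bodies, deleted), 1)); // title2 + body2
    }
}
```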





[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749497#action_12749497 ] 

Mark Miller commented on LUCENE-600:
------------------------------------

bq. and filed for a patent on the method

I'm the furthest thing from a lawyer, but didn't you publish your own killer prior art right here? Patch posted 06, filing 08?

Rhetorical question to a degree I guess - this patent stuff is really fascinating. Nonetheless, one of the reasons I envy Europe.



Re: Java 1.5

Posted by Doug Cutting <cu...@apache.org>.
Chuck Williams wrote:
> I think the last discussion ended with the main counter-argument being
> lack of support by gcj.  Current top of GCJ News:
> 
>> *June 6, 2006* RMS approved the plan to use the Eclipse compiler as
>> the new gcj front end. Work is being done on the |gcj-eclipse| branch;
>> it can already build libgcj. This project will allow us to ship a 1.5
>> compiler in the relatively near future. The old |gcjx| branch and
>> project is now dead.

Another Java implementation to track is Apache Harmony.  Harmony intends 
to have a 1.5-compatible JVM completed by Q1 2007.

http://incubator.apache.org/harmony/roadmap.html

Doug





Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I'll just send it to java-user in a bit in order to get the answers only from Lucene users (and not peeps just passing by lucene.apache.org).

Otis



Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Grant Ingersoll <gs...@syr.edu>.
+1

Do you want to post it on the user list?  It might also be good to put 
it up on the main website.

Otis Gospodnetic wrote:
> Grant: how to poll users?  How about this: http://www.quimble.com/poll/view/2156 ?  If you think that's ok, we can send that to java-user tomorrow and see.  Hey, how about some bets?  I'll put a $10 for a beer on 1.5.
>
>   
Wow, $10 for a beer?  That must be some pretty good beer.  Either that 
or you live in New York City and that is a cheap beer!  Anyway, I am 
betting it is 1.5 as well.  Maybe we can get together at ApacheCon or 
something for one...




-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 




Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I agree and completely understand Chuck.  I'm waiting for my employer to sign and fax the CCLA for some search benchmarking code I wrote, and it uses Java 1.5 stuff.   It would only be a contrib piece, not core, so it's less of a problem, but...

Grant: how to poll users?  How about this: http://www.quimble.com/poll/view/2156 ?  If you think that's ok, we can send that to java-user tomorrow and see.  Hey, how about some bets?  I'll put a $10 for a beer on 1.5.

Otis

----- Original Message ----
From: Grant Ingersoll <gs...@syr.edu>
To: java-dev@lucene.apache.org
Sent: Tuesday, June 13, 2006 5:01:30 PM
Subject: Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader


> In addition to performance, productivity and functionality benefits, my
> main argument for 1.5 is that it is used by the vast majority of lucene
> community members.  

I am not so sure about this. Perhaps we should take a poll on the user 
list?  Not even sure how that would be managed or counted, but...

> Everything I write is in 1.5 and I don't have time
> to backport.  I have a significant body of code from which to extract
> and contribute patches that others would likely find useful.  How many
> others are in a similar position?
>   
I definitely would prefer to make future contributions in 1.5 (even the 
patch we just contributed (issue 545) could have been better given 1.5, 
but it is fine with 1.4 as well).  I tend to think if people don't want 
the new functionality or if it breaks their app. then they need not 
upgrade, or they can contribute patches against the branches for prior 
releases and we can support that as needed.   To me, this is what major 
releases are about.  I know that when a major release comes out that I 
should expect library changes that may break my code.  If I don't want 
that pain, then I don't upgrade.
> On the side, not leaving valued community members behind is important.
>
> I think the pmc / committers just need to make a decision which will
> impact one group or the other.
>
> Chuck
>
>
> Grant Ingersoll wrote on 06/13/2006 03:35 AM:
>   
>> Well, we have our first Java 1.5 patch...  Now that we have had a week
>> or two to digest the comments, do we want to reopen the discussion?
>>
>> Chuck Williams (JIRA) wrote:
>>     
>>>      [ http://issues.apache.org/jira/browse/LUCENE-600?page=all ]
>>>
>>> Chuck Williams updated LUCENE-600:
>>> ----------------------------------
>>>
>>>     Attachment: ParallelWriter.patch
>>>
>>> Patch to create and integrate ParallelWriter, Writable and
>>> TestParallelWriter -- also modifies build to use java 1.5.
>>>

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org







Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Grant Ingersoll <gs...@syr.edu>.
> In addition to performance, productivity and functionality benefits, my
> main argument for 1.5 is that it is used by the vast majority of lucene
> community members.  

I am not so sure about this. Perhaps we should take a poll on the user 
list?  Not even sure how that would be managed or counted, but...

> Everything I write is in 1.5 and I don't have time
> to backport.  I have a significant body of code from which to extract
> and contribute patches that others would likely find useful.  How many
> others are in a similar position?
>   
I definitely would prefer to make future contributions in 1.5 (even the 
patch we just contributed (issue 545) could have been better given 1.5, 
but it is fine with 1.4 as well).  I tend to think if people don't want 
the new functionality or if it breaks their app, then they need not 
upgrade, or they can contribute patches against the branches for prior 
releases and we can support that as needed.   To me, this is what major 
releases are about.  I know that when a major release comes out that I 
should expect library changes that may break my code.  If I don't want 
that pain, then I don't upgrade.

> On the other side, not leaving valued community members behind is important.
>
> I think the pmc / committers just need to make a decision which will
> impact one group or the other.
>
> Chuck
>
>
> Grant Ingersoll wrote on 06/13/2006 03:35 AM:
>   
>> Well, we have our first Java 1.5 patch...  Now that we have had a week
>> or two to digest the comments, do we want to reopen the discussion?
>>
>> Chuck Williams (JIRA) wrote:
>>     
>>>      [ http://issues.apache.org/jira/browse/LUCENE-600?page=all ]
>>>
>>> Chuck Williams updated LUCENE-600:
>>> ----------------------------------
>>>
>>>     Attachment: ParallelWriter.patch
>>>
>>> Patch to create and integrate ParallelWriter, Writable and
>>> TestParallelWriter -- also modifies build to use java 1.5.
>>>



Re: Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Chuck Williams <ch...@manawiz.com>.
I think the last discussion ended with the main counter-argument being
lack of support by gcj.  Current top of GCJ News:

> *June 6, 2006* RMS approved the plan to use the Eclipse compiler as
> the new gcj front end. Work is being done on the |gcj-eclipse| branch;
> it can already build libgcj. This project will allow us to ship a 1.5
> compiler in the relatively near future. The old |gcjx| branch and
> project is now dead.

In addition to performance, productivity and functionality benefits, my
main argument for 1.5 is that it is used by the vast majority of lucene
community members.  Everything I write is in 1.5 and I don't have time
to backport.  I have a significant body of code from which to extract
and contribute patches that others would likely find useful.  How many
others are in a similar position?

On the other side, not leaving valued community members behind is important.

I think the pmc / committers just need to make a decision which will
impact one group or the other.

Chuck


Grant Ingersoll wrote on 06/13/2006 03:35 AM:
> Well, we have our first Java 1.5 patch...  Now that we have had a week
> or two to digest the comments, do we want to reopen the discussion?
>
> Chuck Williams (JIRA) wrote:
>>      [ http://issues.apache.org/jira/browse/LUCENE-600?page=all ]
>>
>> Chuck Williams updated LUCENE-600:
>> ----------------------------------
>>
>>     Attachment: ParallelWriter.patch
>>
>> Patch to create and integrate ParallelWriter, Writable and
>> TestParallelWriter -- also modifies build to use java 1.5.




Java 1.5 was [jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by Grant Ingersoll <gs...@syr.edu>.
Well, we have our first Java 1.5 patch...  Now that we have had a week 
or two to digest the comments, do we want to reopen the discussion?

Chuck Williams (JIRA) wrote:
>      [ http://issues.apache.org/jira/browse/LUCENE-600?page=all ]
>
> Chuck Williams updated LUCENE-600:
> ----------------------------------
>
>     Attachment: ParallelWriter.patch
>
> Patch to create and integrate ParallelWriter, Writable and TestParallelWriter -- also modifies build to use java 1.5.



[jira] Updated: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/LUCENE-600?page=all ]

Chuck Williams updated LUCENE-600:
----------------------------------

    Attachment: ParallelWriter.patch

Patch to create and integrate ParallelWriter, Writable and TestParallelWriter -- also modifies build to use java 1.5.


> ParallelWriter companion to ParallelReader
> ------------------------------------------
>
>          Key: LUCENE-600
>          URL: http://issues.apache.org/jira/browse/LUCENE-600
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Index
>     Versions: 2.1
>     Reporter: Chuck Williams
>  Attachments: ParallelWriter.patch
>
> A new class ParallelWriter is provided that serves as a companion to ParallelReader.  ParallelWriter meets all of the doc-id synchronization requirements of ParallelReader, subject to:
>     1.  ParallelWriter.addDocument() is synchronized, which might have an adverse effect on performance.  The writes to the sub-indexes are, however, done in parallel.
>     2.  The application must ensure that the ParallelReader is never reopened inside ParallelWriter.addDocument(), else it might find the sub-indexes out of sync.
>     3.  The application must deal with recovery from ParallelWriter.addDocument() exceptions.  Recovery must restore the synchronization of doc-ids, e.g. by deleting any trailing document(s) in one sub-index that were not successfully added to all sub-indexes, and then optimizing all sub-indexes.
> A new interface, Writable, is provided to abstract IndexWriter and ParallelWriter.  This is in the same spirit as the existing Searchable and Fieldable classes.
> This implementation uses java 1.5.  The patch applies against today's svn head.  All tests pass, including the new TestParallelWriter.
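
The constraints listed in the description can be illustrated with a minimal sketch. This is not the code from ParallelWriter.patch: SubIndexSketch, ParallelWriterSketch, truncateTo() and recover() are hypothetical names, the sub-indexes are plain in-memory lists rather than Lucene indexes, and the sketch assumes at least one sub-index. It only shows the shape of a synchronized addDocument() whose per-sub-index writes run in parallel, plus one possible recovery step that trims trailing documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical in-memory stand-in for one sub-index; not a Lucene class. */
class SubIndexSketch {
    private final List<String> docs = new ArrayList<String>();

    /** Appends a document; its doc-id is its position in the list. */
    synchronized int add(String fields) {
        docs.add(fields);
        return docs.size() - 1;
    }

    synchronized int size() {
        return docs.size();
    }

    /** Drops trailing documents so that at most n remain. */
    synchronized void truncateTo(int n) {
        while (docs.size() > n) {
            docs.remove(docs.size() - 1);
        }
    }
}

/** Hypothetical sketch of a writer that keeps doc-ids aligned across sub-indexes. */
class ParallelWriterSketch {
    private final List<SubIndexSketch> subIndexes;
    private final ExecutorService pool;

    /** Assumes at least one sub-index. */
    ParallelWriterSketch(List<SubIndexSketch> subIndexes) {
        this.subIndexes = subIndexes;
        this.pool = Executors.newFixedThreadPool(subIndexes.size());
    }

    /**
     * synchronized: only one logical document is in flight at a time, so every
     * sub-index assigns it the same doc-id. The per-sub-index writes themselves
     * are submitted to a thread pool and run in parallel.
     */
    synchronized int addDocument(List<String> fieldsPerSubIndex) {
        List<Future<Integer>> pending = new ArrayList<Future<Integer>>();
        for (int i = 0; i < subIndexes.size(); i++) {
            final SubIndexSketch sub = subIndexes.get(i);
            final String fields = fieldsPerSubIndex.get(i);
            Callable<Integer> task = () -> sub.add(fields);
            pending.add(pool.submit(task));
        }
        int docId = -1;
        RuntimeException failure = null;
        for (Future<Integer> f : pending) {
            try {
                docId = f.get(); // wait for every write, even after a failure
            } catch (Exception e) {
                failure = new IllegalStateException("write failed; call recover()", e);
            }
        }
        if (failure != null) {
            throw failure;
        }
        return docId;
    }

    /** Recovery: trim every sub-index to the shortest one to realign doc-ids. */
    synchronized void recover() {
        int min = Integer.MAX_VALUE;
        for (SubIndexSketch s : subIndexes) {
            min = Math.min(min, s.size());
        }
        for (SubIndexSketch s : subIndexes) {
            s.truncateTo(min);
        }
    }

    void close() {
        pool.shutdown();
    }
}
```

In this sketch a reader opened while addDocument() is running could see one sub-index ahead of another, which is the situation constraint 2 asks the application to avoid.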



[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749599#action_12749599 ] 

Chuck Williams commented on LUCENE-600:
---------------------------------------

The patent isn't on the parallel writer stuff, it's on the update method.  Using parallel indexes is just the first simple step to an efficient update method, far from the complete solution.

Personally I'm not a supporter of software patents and hope Bilski raises the bar substantially.  However, we live in the world we live in now and companies, including the ones I've worked for who funded this work, want to protect their IP, at least for defensive reasons.  I tried to broker a deal to contribute the update method and code to lucene instead of protecting it, but there was lack of interest in the lucene community at the time and so the deal fell apart.

At this point the patent is filed and published, so if it issues, anybody who infringes risks a charge of doing so knowingly, which substantially increases penalties.

I'm the messenger here, so please don't shoot me.


> ParallelWriter companion to ParallelReader
> ------------------------------------------
>
>                 Key: LUCENE-600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-600
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.1
>            Reporter: Chuck Williams
>            Priority: Minor
>         Attachments: ParallelWriter.patch
>
>
> A new class ParallelWriter is provided that serves as a companion to ParallelReader.  ParallelWriter meets all of the doc-id synchronization requirements of ParallelReader, subject to:
>     1.  ParallelWriter.addDocument() is synchronized, which might have an adverse effect on performance.  The writes to the sub-indexes are, however, done in parallel.
>     2.  The application must ensure that the ParallelReader is never reopened inside ParallelWriter.addDocument(), else it might find the sub-indexes out of sync.
>     3.  The application must deal with recovery from ParallelWriter.addDocument() exceptions.  Recovery must restore the synchronization of doc-ids, e.g. by deleting any trailing document(s) in one sub-index that were not successfully added to all sub-indexes, and then optimizing all sub-indexes.
> A new interface, Writable, is provided to abstract IndexWriter and ParallelWriter.  This is in the same spirit as the existing Searchable and Fieldable classes.
> This implementation uses java 1.5.  The patch applies against today's svn head.  All tests pass, including the new TestParallelWriter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749660#action_12749660 ] 

Chuck Williams commented on LUCENE-600:
---------------------------------------

Erratum:  "deletion changes doc-id's by purging deleted documents" --> "*merging* changes doc-id's by purging deleted documents"
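
To make the corrected statement concrete, here is a small sketch of how purging deleted documents during a merge renumbers the survivors. DocIdShiftSketch and purgeDeleted() are hypothetical names for illustration only; a list index stands in for a doc-id, and nothing here is Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; a list index stands in for a doc-id. */
class DocIdShiftSketch {
    /** Returns the surviving documents in order; a document's new doc-id is its index. */
    static List<String> purgeDeleted(List<String> docs, boolean[] deleted) {
        List<String> kept = new ArrayList<String>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted[i]) {
                kept.add(docs.get(i)); // every later survivor shifts down by one
            }
        }
        return kept;
    }
}
```

With documents a, b, c and b marked deleted, c has doc-id 2 before the purge and doc-id 1 after it. If only one of two parallel sub-indexes were merged this way, the same doc-id would name different documents in each, which appears to be the reason for the corresponding-segment constraint discussed in this thread.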




[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749740#action_12749740 ] 

Michael Busch commented on LUCENE-600:
--------------------------------------

{quote}
I could inquire if you are truly open to it, but it sounds like you may be on your own path to a quite similar thing.
{quote}

Well, my goal is to get the best possible implementation of this feature into Lucene. Nothing is set in stone yet. So you should feel free to suggest improvements. And if you think your implementation is better or has details worth looking at, it would be good if you could submit your code.



[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749663#action_12749663 ] 

Michael Busch commented on LUCENE-600:
--------------------------------------

{quote}
The version attached here is from over 3 years ago. Our version has evolved along with Lucene and the whole apparatus is fully functional with the latest lucene. 
{quote}

Well, this issue hasn't been updated in 3 years, so I didn't know that it was still being worked on. Of course you're more than welcome to help work on LUCENE-1879 - it has the same goals and it's just a different JIRA number after all.

{quote}
The only reason for the corresponding-segment constraint is that deletion changes doc-id's by purging deleted documents. 
{quote}

So does your approach require doc ids to be stable, or can the app using your parallel writer delete docs and purge deleted docs?



[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Chuck Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749450#action_12749450 ] 

Chuck Williams commented on LUCENE-600:
---------------------------------------

I contributed the first patch to make flush-by-size possible; see LUCENE-709.  There is no incompatibility with ParallelWriter, even the early version contributed here 3 years ago.  We've been doing efficient updating of selected mutable fields now for a long time and filed for a patent on the method.  See published patent application 20090193406.




[jira] Commented: (LUCENE-600) ParallelWriter companion to ParallelReader

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749682#action_12749682 ] 

Michael Busch commented on LUCENE-600:
--------------------------------------

{quote}
I just read your wiki page and it looks like your MasterMergePolicy is the same for the merge-by-size case, right?
{quote}

Yep sounds very similar. The MasterMergePolicy can wrap any other MergePolicy.
