Posted to solr-dev@lucene.apache.org by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2007/10/22 22:14:50 UTC

[jira] Updated: (SOLR-342) Add support for Lucene's new Indexing and merge features (excluding Document/Field/Token reuse)

     [ https://issues.apache.org/jira/browse/SOLR-342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-342:
---------------------------------

    Description: 
LUCENE-843 adds support for new indexing capabilities, configured through the setRAMBufferSizeMB() method, that should significantly speed up indexing for many applications.  To support this, we will need the trunk version of Lucene (or we can wait for the next official release of Lucene).

A side effect of this is that Lucene's new, faster StandardTokenizer will also be incorporated.

We also need to think about how we want to incorporate the new merge scheduling functionality (the new default in Lucene is to do merges in a background thread).

  was:
LUCENE-843 adds support for new indexing capabilities using the setRAMBufferSizeMB() method that should significantly speed up indexing for many applications.  To fix this, we will need trunk version of Lucene (or wait for the next official release of Lucene)

Side effect of this is that Lucene's new, faster StandardTokenizer will also be incorporated.

        Summary: Add support for Lucene's new Indexing and merge features (excluding Document/Field/Token reuse)  (was: Add support for Lucene's new setRAMBufferSizeMB() method in IndexWriter)

Updated to cover the broader scope of changes that affect upgrading to Lucene trunk.

Plan to implement:
Add <ramBufferSizeMB> tag to specify the number of megabytes to give Lucene for buffering.  Setting this value will override all other related IndexWriter configuration settings (maxBufferedDocs, etc.).

Add <mergeScheduler> tag that can have two values:  concurrent or serial.   Or would it be better to take in a classname?  Doing the latter would mean we would have to have a no-arg constructor, right?

Add <mergePolicy> tag that selects the merge policy and can have two values: byteSize or docCount.  Or would it be better to take a classname?
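
On the alias-versus-classname question for <mergeScheduler> and <mergePolicy>, one possible middle ground is to accept both: treat a known short value as an alias and anything else as a classname. The sketch below is only an illustration of that idea. MergeComponentLoader, MergeComponent and the two stand-in classes are hypothetical names made up for this example, not Lucene or Solr types. It also shows why the classname route would need a no-arg constructor: the reflective lookup fails with NoSuchMethodException without one.

```java
import java.util.Map;

public class MergeComponentLoader {

    /** Hypothetical stand-in for a pluggable merge component (not a Lucene type). */
    public interface MergeComponent {
        String describe();
    }

    /** Hypothetical stand-in for a scheduler that merges in a background thread. */
    public static class ConcurrentStandIn implements MergeComponent {
        public String describe() { return "concurrent"; }
    }

    /** Hypothetical stand-in for a scheduler that merges in the calling thread. */
    public static class SerialStandIn implements MergeComponent {
        public String describe() { return "serial"; }
    }

    // Short values allowed in the config tag, mapped to the class that implements them.
    private static final Map<String, String> ALIASES = Map.of(
        "concurrent", ConcurrentStandIn.class.getName(),
        "serial", SerialStandIn.class.getName());

    /** Resolves a config value that is either a known alias or a fully qualified classname. */
    public static MergeComponent load(String value) throws ReflectiveOperationException {
        String className = ALIASES.getOrDefault(value, value);
        Class<? extends MergeComponent> clazz =
            Class.forName(className).asSubclass(MergeComponent.class);
        // Throws NoSuchMethodException if the class has no no-arg constructor.
        return clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(load("serial").describe());
        System.out.println(load(ConcurrentStandIn.class.getName()).describe());
    }
}
```

With this shape the two short values stay easy to type, while a custom implementation can still be plugged in by name, at the cost of requiring that it be constructible without arguments.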

NOTE: I am not proposing to handle the new reusable Document/Field/Token mechanism in Lucene, which should also be considered.
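
For the <ramBufferSizeMB> item in the plan above, the "overrides all other related settings" rule could be resolved once, when the configuration is read, before anything is handed to IndexWriter. The following is a minimal sketch under stated assumptions: FlushSettings, the DISABLED sentinel and the default of 1000 documents are all hypothetical and chosen only for illustration.

```java
public class FlushSettings {

    /** Hypothetical sentinel meaning "this flush trigger is turned off". */
    public static final int DISABLED = -1;

    /** Hypothetical fallback used when neither setting is configured. */
    public static final int DEFAULT_MAX_BUFFERED_DOCS = 1000;

    public final double ramBufferSizeMB;
    public final int maxBufferedDocs;

    private FlushSettings(double ramBufferSizeMB, int maxBufferedDocs) {
        this.ramBufferSizeMB = ramBufferSizeMB;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /**
     * Resolves the effective flush settings from optional config values.
     * A configured ramBufferSizeMB wins and turns off the document-count trigger.
     */
    public static FlushSettings resolve(Double ramBufferSizeMB, Integer maxBufferedDocs) {
        if (ramBufferSizeMB != null) {
            if (ramBufferSizeMB <= 0) {
                throw new IllegalArgumentException("ramBufferSizeMB must be positive");
            }
            return new FlushSettings(ramBufferSizeMB, DISABLED);
        }
        int docs = (maxBufferedDocs != null) ? maxBufferedDocs : DEFAULT_MAX_BUFFERED_DOCS;
        return new FlushSettings(DISABLED, docs);
    }

    public static void main(String[] args) {
        FlushSettings s = resolve(32.0, 500);
        // maxBufferedDocs is ignored here because ramBufferSizeMB was set.
        System.out.println(s.ramBufferSizeMB + " MB, maxBufferedDocs=" + s.maxBufferedDocs);
    }
}
```

The resolved ramBufferSizeMB value is what would eventually be passed to setRAMBufferSizeMB(); how the "off" state is communicated to IndexWriter is left open here.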




> Add support for Lucene's new Indexing and merge features (excluding Document/Field/Token reuse)
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-342
>                 URL: https://issues.apache.org/jira/browse/SOLR-342
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>
> LUCENE-843 adds support for new indexing capabilities, configured through the setRAMBufferSizeMB() method, that should significantly speed up indexing for many applications.  To support this, we will need the trunk version of Lucene (or we can wait for the next official release of Lucene).
> A side effect of this is that Lucene's new, faster StandardTokenizer will also be incorporated.
> We also need to think about how we want to incorporate the new merge scheduling functionality (the new default in Lucene is to do merges in a background thread).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.