Posted to dev@lucene.apache.org by "Tom Burton-West (JIRA)" <ji...@apache.org> on 2010/12/18 16:48:01 UTC

[jira] Created: (SOLR-2290) the termsInfosDivisor for readers opened by indexWriter should be configurable in Solr

the termsInfosDivisor for readers opened by indexWriter should be configurable in Solr
--------------------------------------------------------------------------------------

                 Key: SOLR-2290
                 URL: https://issues.apache.org/jira/browse/SOLR-2290
             Project: Solr
          Issue Type: New Feature
            Reporter: Tom Burton-West
            Priority: Minor


Solr allows users to set the termInfosIndexDivisor used by the IndexReader at search time in solrconfig.xml, but not for the IndexReader opened by the IndexWriter when indexing/merging.

When dealing with an index with a large number of unique terms, setting the termInfosIndexDivisor at search time helps reduce memory use. It would also help reduce memory use during indexing/merging if it were made configurable for the IndexReaders opened by the IndexWriter.
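
As a rough illustration of why the divisor matters: Lucene writes every termIndexInterval-th term (128 by default) to the terms index, and a reader opened with divisor N keeps only every N-th of those entries in memory, at the cost of slower term lookups. The sketch below only does that arithmetic. The class name, method name, and the term count are made up for illustration and are not part of Lucene or Solr.

```java
/** Back-of-the-envelope estimate; names and numbers are illustrative only. */
final class TermIndexEstimate {

    /**
     * Approximate number of terms-index entries a reader holds in memory.
     * Every termIndexInterval-th term is written to the index file, and a
     * reader with the given divisor loads every divisor-th of those entries.
     */
    static long loadedIndexEntries(long uniqueTerms, int termIndexInterval, int divisor) {
        if (termIndexInterval < 1 || divisor < 1) {
            throw new IllegalArgumentException("interval and divisor must be >= 1");
        }
        return uniqueTerms / ((long) termIndexInterval * divisor);
    }

    public static void main(String[] args) {
        long uniqueTerms = 2_000_000_000L; // hypothetical number of unique terms
        for (int divisor : new int[] {1, 4, 8}) {
            System.out.println("divisor=" + divisor + " -> ~"
                    + loadedIndexEntries(uniqueTerms, 128, divisor) + " entries in RAM");
        }
    }
}
```

With these assumed numbers, a divisor of 8 cuts the in-memory entries from roughly 15.6 million to roughly 2 million.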

This thread contains some background:
http://www.lucidimagination.com/search/document/b5c756a366e1a0d6/memory_use_during_merges_oom

In the Lucene 3.x branch it looks like this is done in IndexWriterConfig.setReaderTermsIndexDivisor, although IndexWriter.java also has this method signature: IndexReader getReader(int termInfosIndexDivisor).

  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (SOLR-2290) the termsInfosDivisor for readers opened by indexWriter should be configurable in Solr

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974224#action_12974224 ] 

Jason Rutherglen commented on SOLR-2290:
----------------------------------------

I think it'll require creating a new sub-element of mainIndex and indexDefaults, perhaps called indexWriterConfig? The reason is that attributes such as unlockOnStartup and reopenReaders cannot be injected, and we probably don't want to mix injected properties with non-injected ones.




[jira] Commented: (SOLR-2290) the termsInfosDivisor for readers opened by indexWriter should be configurable in Solr

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972847#action_12972847 ] 

Jason Rutherglen commented on SOLR-2290:
----------------------------------------

Tom, I think this can be generified to use SOLR-1447's property injection into IWC.
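
For readers unfamiliar with the idea, property injection in general terms means mapping a named config value onto a setter by reflection. The following is a minimal sketch of that general pattern only, not the SOLR-1447 code itself. `WriterConfigStandIn` is a hypothetical stand-in for Lucene's IndexWriterConfig (IWC), and `inject` is an invented helper.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

final class InjectionSketch {

    /** Hypothetical stand-in for IndexWriterConfig; not the real Lucene class. */
    public static final class WriterConfigStandIn {
        private int readerTermsIndexDivisor = 1;

        public void setReaderTermsIndexDivisor(int divisor) {
            this.readerTermsIndexDivisor = divisor;
        }

        public int getReaderTermsIndexDivisor() {
            return readerTermsIndexDivisor;
        }
    }

    /** For each "xxx" -> value entry, calls the one-argument setter setXxx(value) on target. */
    static void inject(Object target, Map<String, Object> props) {
        for (Map.Entry<String, Object> e : props.entrySet()) {
            String name = e.getKey();
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method match = null;
            for (Method m : target.getClass().getMethods()) {
                if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                    match = m;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("no setter for property: " + name);
            }
            try {
                match.invoke(target, e.getValue()); // an Integer is unboxed for an int parameter
            } catch (ReflectiveOperationException ex) {
                throw new IllegalArgumentException("could not set property: " + name, ex);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("readerTermsIndexDivisor", 8); // value as it might come from a config file
        WriterConfigStandIn config = new WriterConfigStandIn();
        inject(config, props);
        System.out.println(config.getReaderTermsIndexDivisor());
    }
}
```

The appeal of this approach is that new writer settings need no Solr-side code, since any property with a matching setter can be configured by name.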




[jira] Commented: (SOLR-2290) the termsInfosDivisor for readers opened by indexWriter should be configurable in Solr

Posted by "Tom Burton-West (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973990#action_12973990 ] 

Tom Burton-West commented on SOLR-2290:
---------------------------------------

Thanks Jason,

I'm still working my way through the Solr codebase and don't yet really understand the relationship between the Lucene classes, the Solr classes, and the Solr config process.

I'd like to do the property injection. I'm having trouble conceptualizing what the entry would be in the solrconfig.xml file and what the name would be in SolrIndexConfig. How do we distinguish between the termInfosIndexDivisor for the IndexReader used during search and the termInfosIndexDivisor for the IndexReader that is used by the IndexWriter?

Would it be something like this?

<indexWriter name="IndexWriter" class="org.apache.solr.update.SolrIndexWriter">
  <int name="termInfosIndexDivisor">8</int>
</indexWriter>
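
To make the proposal concrete, here is a small sketch of how a fragment shaped like the one above could be read. It uses only the JDK's DOM parser and does not reflect how Solr actually loads solrconfig.xml. The class name `ProposedConfigSketch` and the helper `readInt` are invented for illustration, and the element itself is only the one proposed in this comment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class ProposedConfigSketch {

    /** Returns the value of the <int name="..."> element under the root, or fallback if absent. */
    static int readInt(String xml, String name, int fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList ints = doc.getDocumentElement().getElementsByTagName("int");
            for (int i = 0; i < ints.getLength(); i++) {
                Element el = (Element) ints.item(i);
                if (name.equals(el.getAttribute("name"))) {
                    return Integer.parseInt(el.getTextContent().trim());
                }
            }
            return fallback;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not read config fragment", e);
        }
    }

    public static void main(String[] args) {
        // The element proposed in this comment; it is not an existing Solr setting.
        String xml = "<indexWriter name=\"IndexWriter\" class=\"org.apache.solr.update.SolrIndexWriter\">"
                + "<int name=\"termInfosIndexDivisor\">8</int>"
                + "</indexWriter>";
        System.out.println(readInt(xml, "termInfosIndexDivisor", 1));
    }
}
```

Because the value lives under its own indexWriter element, it would stay distinct from the search-time divisor configured elsewhere in the file.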


Maybe you could point me to the classes I should be looking at?

Tom

