Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2010/11/19 19:50:13 UTC

[jira] Commented: (LUCENE-2771) Remove norms() support from non-atomic IndexReaders

    [ https://issues.apache.org/jira/browse/LUCENE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933911#action_12933911 ] 

Uwe Schindler commented on LUCENE-2771:
---------------------------------------

Here are the relevant comments:

{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*

Here is a hack patch for Uwe's idea about the norms.
We need to change SegmentMerger so that it does not call norms() on the top-level IR but populates its normBuffer from the subs.
In my opinion it seems crazy that we are currently creating these big arrays this way (yes, there is the hairy reopen code that reuses the big merged cache for the NRT case, but still).

Maybe I am missing something.

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

Here's another hacky update, but a few tests still explicitly check these norms and need to be fixed.
Maybe we could add an uncached "MultiNorms" or something similar, at least in src/test for convenience,
just to fill the byte arrays so these tests can assertEquals.

Otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests.

*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*

Here is a better patch for the segment merger. We should apply this even if we do not remove top-level norms!
It saves lots of memory during merging by using ReaderUtil to go down to segment level. Don't be surprised by the BytesRef: we need a reference here because of the anonymous inner class.

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

Here is an updated patch, with core/contrib/solr tests passing.
For ParallelReader I forced it to require non-composite readers only (i.e. wrap them in SlowMultiReaderWrapper if that's not the case).

TODO:
- ParallelReader shouldn't need MultiFields etc. anymore
- there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent
- merge in Uwe's improved SegmentMerger
- clean up code.

{quote}
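
To make the idea in the comments above concrete, here is a minimal sketch of copying norms segment by segment instead of asking the top-level reader for one big merged array. It is not the attached patch and does not use Lucene's real classes: Reader, SegmentLike, CompositeLike, leaves() and copyNorms() are hypothetical names invented for this illustration, and a ByteArrayOutputStream stands in for whatever output the merger actually writes to.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NormsSketch {

    /** Hypothetical stand-in for a reader; NOT Lucene's IndexReader API. */
    public interface Reader {
        byte[] norms(String field);   // one norm byte per document
        List<Reader> leaves();        // atomic (segment-level) readers, in order
    }

    /** Atomic reader: owns a small per-segment norms array. */
    public static final class SegmentLike implements Reader {
        private final byte[] norms;
        public SegmentLike(byte[] norms) { this.norms = norms; }
        @Override public byte[] norms(String field) { return norms; }
        @Override public List<Reader> leaves() {
            return Collections.<Reader>singletonList(this);
        }
    }

    /** Composite reader: refuses norms() rather than building a merged array. */
    public static final class CompositeLike implements Reader {
        private final List<Reader> subs;
        public CompositeLike(List<Reader> subs) { this.subs = subs; }
        @Override public byte[] norms(String field) {
            throw new UnsupportedOperationException(
                "norms() is only supported on atomic readers");
        }
        @Override public List<Reader> leaves() {
            // Flatten nested composites down to segment level.
            List<Reader> out = new ArrayList<>();
            for (Reader sub : subs) {
                out.addAll(sub.leaves());
            }
            return out;
        }
    }

    /**
     * Writes the norms of every segment to the output, one segment at a time.
     * Only the per-segment arrays are touched; no top-level array is allocated.
     * Returns the number of documents written.
     */
    public static int copyNorms(Reader top, String field, ByteArrayOutputStream out) {
        int docs = 0;
        for (Reader leaf : top.leaves()) {
            byte[] segmentNorms = leaf.norms(field);
            out.write(segmentNorms, 0, segmentNorms.length);
            docs += segmentNorms.length;
        }
        return docs;
    }
}
```

In this sketch the extra memory needed during a merge is bounded by what each segment already holds, which is the saving the comments describe. Code that really needs a merged view would have to go through an explicit wrapper, which is the role the comments give to SlowMultiReaderWrapper.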

> Remove norms() support from non-atomic IndexReaders
> ---------------------------------------------------
>
>                 Key: LUCENE-2771
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2771
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Uwe Schindler
>             Fix For: 4.0
>
>
> Spin-off from LUCENE-2769:
> Currently all IndexReaders support norms(), but the core of Lucene never uses it, and it is even dangerous because of memory usage. We should do the same as with MultiFields: factor it out and throw UOE on non-atomic readers.
> The SlowMultiReaderWrapper can then manage the norms. Also ParallelReader needs to be fixed.
