You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2012/06/18 01:39:42 UTC

[jira] [Updated] (LUCENE-3866) Make CompositeReader.getSequentialSubReaders() and the corresponding IndexReaderContext methods return unmodifiable List

     [ https://issues.apache.org/jira/browse/LUCENE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3866:
----------------------------------

    Attachment: LUCENE-3866-step1.patch

Here the first step of this refactoring (affects only few core classes):

- getSequentialSubReaders() returns a List<? extends IndexReader> (unmodifiable)
- Nuked ReaderUtil.Gather completely (this one is ineffective and slow because of useless recursion). We can use ReaderContext.leaves() to get all atomic sub readers including their docStarts
- Moved ReaderUtil.Slice to separate class (it's somehow unrelated), but has nothing to do with ReaderUtil anymore -> Should move to index package + hidden

There is still one (slow) nocommit, because step 2 will refactor the ReaderContexts to use Collections, too.

Yonik:

bq. I think we should allow users to access in the most low-level efficient manner, just as lucene can. Remember, this is expert level stuff!  Seems like at most we would need a javadoc comment saying "don't modify this".

Actually it's more efficient than before. And the collection views are still backed by arrays and are never used in inner loops (it's just when IndexSearcher initializes or executes searches on all segments). Without any real benchmark showing a slowness (please do this *after* step 2 -> nocommit) I don't think we should risk corrumpt readers.
                
> Make CompositeReader.getSequentialSubReaders() and the corresponding IndexReaderContext methods return unmodifiable List<R extends IndexReader>
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3866
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3866
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-3866-step1.patch
>
>
> Since Lucene 2.9 we have the method getSequentialSubReader() returning IndexReader[]. Based on hardly-to-debug errors in user's code, Robert and me realized that returning an array from a public API is an anti-pattern. If the array is intended to be modifiable (like byte[] in terms,...), it is fine to use arrays in public APIs, but not, if the array must be protected from modification. As IndexReaders are 100% unmodifiable in trunk code (no deletions,...), the only possibility to corrumpt the reader is by modifying the array returned by getSequentialSubReaders(). We should prevent this.
> The same theoretically applies to FieldCache, too - but the party that is afraid of performance problems is too big to fight against that :(
> For getSequentialSubReaders there is no performance problem at all. The binary search of reader-ids inside BaseCompositeReader can still be done with the internal protected array, but public APIs should expose only a unmodifiable List. The same applies to leaves() and children() in IndexReaderContext. This change to list would also allow to make CompositeReader and CompositeReaderContext Iterable<R extends IndexReader>, so some loops would look nice.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org