Posted to dev@lucene.apache.org by "Michael Busch (JIRA)" <ji...@apache.org> on 2010/04/14 23:14:52 UTC
[jira] Updated: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Busch updated LUCENE-2324:
----------------------------------
Attachment: lucene-2324.patch
The patch removes all *PerThread classes downstream of DocumentsWriter.
This simplifies a lot of the flushing logic in the different consumers. The patch also removes FreqProxMergeState, because we no longer have to interleave posting lists from different threads. I really like these simplifications!
There is still a lot to do: The changes in DocumentsWriter and IndexWriter are currently just experimental to make everything compile. Next I will introduce DocumentsWriterPerThread and implement the sequenceID logic (which was discussed here in earlier comments) and the new RAM management. I also want to go through the indexing chain once again - there are probably a few more things to clean up or simplify.
The patch compiles, and a surprising number of tests actually pass. Only multi-threaded tests seem to fail, which is not very surprising, considering I removed all thread-handling logic from DocumentsWriter. :)
So this patch isn't working yet - just wanted to post my current progress.
> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
> Key: LUCENE-2324
> URL: https://issues.apache.org/jira/browse/LUCENE-2324
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 3.1
>
> Attachments: lucene-2324.patch, LUCENE-2324.patch
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.
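
To make the idea in the summary above concrete, here is a minimal sketch of per-thread private segments. It is not code from the patch and does not use Lucene's API: PrivateSegmentsSketch, PerThreadWriter and FlushedSegment are hypothetical names, documents are plain strings, and binding one writer to each thread through a ThreadLocal is an assumption made only for illustration. Each thread buffers into its own writer without locking and flushes independently; the flushed segments land in a shared queue where ordinary merging could pick them up later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PrivateSegmentsSketch {

    /** Hypothetical stand-in for a flushed, immutable segment. */
    static final class FlushedSegment {
        final List<String> docs;

        FlushedSegment(List<String> docs) {
            this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
        }
    }

    /** Hypothetical per-thread writer: buffers privately and flushes on its own. */
    static final class PerThreadWriter {
        private final int maxBufferedDocs;
        private final Queue<FlushedSegment> flushed;
        private final List<String> buffer = new ArrayList<>();

        PerThreadWriter(int maxBufferedDocs, Queue<FlushedSegment> flushed) {
            this.maxBufferedDocs = maxBufferedDocs;
            this.flushed = flushed;
        }

        void addDocument(String doc) {
            buffer.add(doc); // only the owning thread touches this buffer
            if (buffer.size() >= maxBufferedDocs) {
                flush();
            }
        }

        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            // The segment holds docs from one thread only, so nothing has to
            // be interleaved with other threads' buffers at flush time.
            flushed.add(new FlushedSegment(buffer));
            buffer.clear();
        }
    }

    // Shared, thread-safe collection of flushed segments (merging not shown).
    private final Queue<FlushedSegment> flushed = new ConcurrentLinkedQueue<>();
    private final ThreadLocal<PerThreadWriter> writers;

    PrivateSegmentsSketch(int maxBufferedDocs) {
        writers = ThreadLocal.withInitial(
                () -> new PerThreadWriter(maxBufferedDocs, flushed));
    }

    void addDocument(String doc) {
        writers.get().addDocument(doc);
    }

    void flushCurrentThread() {
        writers.get().flush();
    }

    int segmentCount() {
        return flushed.size();
    }

    int docCount() {
        int total = 0;
        for (FlushedSegment segment : flushed) {
            total += segment.docs.size();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        PrivateSegmentsSketch index = new PrivateSegmentsSketch(3);
        Runnable job = () -> {
            for (int i = 0; i < 7; i++) {
                index.addDocument(Thread.currentThread().getName() + "-doc" + i);
            }
            index.flushCurrentThread(); // flush whatever is still buffered
        };
        Thread t1 = new Thread(job, "t1");
        Thread t2 = new Thread(job, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(index.segmentCount() + " segments, "
                + index.docCount() + " docs");
    }
}
```

The sketch leaves out everything that makes the real change hard: doc stores, deletes, RAM accounting and the ordering of operations across threads are not modelled.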