Posted to dev@lucene.apache.org by "Jason Rutherglen (JIRA)" <ji...@apache.org> on 2011/06/27 06:46:47 UTC

[jira] [Updated] (LUCENE-3245) Realtime terms dictionary

     [ https://issues.apache.org/jira/browse/LUCENE-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated LUCENE-3245:
-------------------------------------

    Attachment: LUCENE-3245.patch

Here's a basic initial patch implementing a skip list backed by an AtomicIntegerArray, designed for a single writer thread and multiple concurrent readers.
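To make the idea concrete, here is a minimal sketch of such a structure. It is not the code in the attached patch: the class name IntArraySkipList, the node layout, the fixed capacity, and the use of plain int keys in place of terms are all assumptions made for illustration. The point it shows is that the single writer fills in a new node completely and only then links it into its predecessors, so readers never follow a pointer to a half-built node.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicIntegerArray;

/** Hypothetical sketch: a skip list of int keys stored in one AtomicIntegerArray. */
final class IntArraySkipList {
  private static final int MAX_LEVEL = 12;
  private static final int HEAD = 0; // head node lives at offset 0
  private static final int NIL = 0;  // no node ever points at the head, so 0 doubles as "no successor"

  // Node layout (assumed): [key, next0, next1, ...]; the head has MAX_LEVEL next slots.
  private final AtomicIntegerArray slots;
  private final Random random = new Random(42);
  private final int[] preds = new int[MAX_LEVEL]; // scratch space, writer thread only
  private int free = 1 + MAX_LEVEL;               // next unused offset, writer thread only

  IntArraySkipList(int capacityInInts) {
    slots = new AtomicIntegerArray(Math.max(capacityInInts, 1 + MAX_LEVEL));
  }

  private static int nextSlot(int node, int level) {
    return node + 1 + level;
  }

  /** Must only be called from the single writer thread. Returns false if the key exists. */
  boolean add(int key) {
    int x = HEAD;
    for (int level = MAX_LEVEL - 1; level >= 0; level--) {
      int next;
      while ((next = slots.get(nextSlot(x, level))) != NIL && slots.get(next) < key) {
        x = next;
      }
      preds[level] = x; // last node before the insertion point at this level
    }
    int candidate = slots.get(nextSlot(x, 0));
    if (candidate != NIL && slots.get(candidate) == key) {
      return false;
    }

    int height = 1;
    while (height < MAX_LEVEL && random.nextInt(4) == 0) {
      height++; // each extra level with probability 1/4
    }
    int node = free;
    if (node + 1 + height > slots.length()) {
      throw new IllegalStateException("skip list full (this sketch does not grow)");
    }
    free += 1 + height;

    // Step 1: fill in the new node while no reader can reach it.
    slots.set(node, key);
    for (int level = 0; level < height; level++) {
      slots.set(nextSlot(node, level), slots.get(nextSlot(preds[level], level)));
    }
    // Step 2: publish bottom-up; each set() is a volatile write, so a reader that
    // sees the new offset also sees the key and next pointers written in step 1.
    for (int level = 0; level < height; level++) {
      slots.set(nextSlot(preds[level], level), node);
    }
    return true;
  }

  /** Safe to call from any number of reader threads while the writer is adding. */
  boolean contains(int key) {
    int x = HEAD;
    for (int level = MAX_LEVEL - 1; level >= 0; level--) {
      int next;
      while ((next = slots.get(nextSlot(x, level))) != NIL && slots.get(next) < key) {
        x = next;
      }
    }
    int candidate = slots.get(nextSlot(x, 0));
    return candidate != NIL && slots.get(candidate) == key;
  }
}
```

Because nothing is ever deleted, a reader that races with add() simply sees the list either with or without the new key. A real terms dictionary would compare term bytes rather than ints and would need a way to grow the backing array.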

The next step is to tie in the ByteBlockPool to store terms, e.g., by implementing an RTTermsDictAIA class and an RTTermsDictCSLM class.

We can then load the same Wiki-EN terms into each and compare their write speeds.

Then create a set of terms to look up in each terms dict and measure the difference in lookup time.
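As a rough sketch of what that measurement could look like for the CSLM side, something along these lines would do. The class name TermsDictTiming and its methods are hypothetical, the terms are plain Strings rather than bytes in a ByteBlockPool, and a single timed pass with no warm-up is only indicative, so treat the numbers as a first approximation.

```java
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical timing helper for a ConcurrentSkipListMap-backed terms dictionary. */
final class TermsDictTiming {

  /** Inserts each term (mapped to its position) and returns the elapsed nanoseconds. */
  static long timeWrites(ConcurrentSkipListMap<String, Integer> dict, String[] terms) {
    long start = System.nanoTime();
    for (int i = 0; i < terms.length; i++) {
      dict.putIfAbsent(terms[i], i); // only new unique terms are added
    }
    return System.nanoTime() - start;
  }

  /** Looks up each probe term; returns {elapsed nanoseconds, number of hits}. */
  static long[] timeLookups(ConcurrentSkipListMap<String, Integer> dict, String[] probes) {
    int hits = 0; // counted so the lookups cannot be optimized away
    long start = System.nanoTime();
    for (String probe : probes) {
      if (dict.containsKey(probe)) {
        hits++;
      }
    }
    long elapsed = System.nanoTime() - start;
    return new long[] {elapsed, hits};
  }
}
```

The same two methods could be written against the array-backed dictionary so that both are fed identical term and probe arrays.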

I am not yet sure how the speed of AtomicIntegerArray will compare with CSLM's use of AtomicReferenceFieldUpdater.  Note that because of DWPTs, we do not need a skip list that supports concurrent writes, and because we only add new unique terms, we do not need delete functionality.  I.e., AIA could be faster, though we may need to inline code and apply various tuning tricks.

> Realtime terms dictionary
> -------------------------
>
>                 Key: LUCENE-3245
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3245
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-3245.patch
>
>
> For LUCENE-2312 we need a realtime terms dictionary.  While ConcurrentSkipListMap may be used, its high object overhead can increase GC times and heap memory usage.
> If we implement a skip list that uses primitive backing arrays, we can hopefully have a data structure that is both fast and memory efficient.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
