Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/06/17 01:07:47 UTC

[jira] [Created] (LUCENE-3209) Memory codec

Memory codec
------------

                 Key: LUCENE-3209
                 URL: https://issues.apache.org/jira/browse/LUCENE-3209
             Project: Lucene - Java
          Issue Type: Improvement
          Components: core/index
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: 4.0


This codec stores all terms/postings in RAM.  It uses an
FST<BytesRef>.  This is useful on a primary key field to ensure
lookups don't need to hit disk, to keep NRT reopen time fast even
under IO contention.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3209) Memory codec

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051002#comment-13051002 ] 

Michael McCandless commented on LUCENE-3209:
--------------------------------------------

Whoops!  I forgot about LUCENE-3069, but yes, this is very similar.

One difference, I think, is that LUCENE-3069 aims to keep all terms memory-resident while postings would still reside in the Directory.  My patch here instead puts all terms and postings in RAM (in a single FST).  The postings format is similar to what PulsingCodec does, i.e., doc + tf + pos + payload are all serialized into a single byte[] using delta vInts.

So I think we should keep LUCENE-3069 open, as an enhancement to this codec to make it separately controllable whether postings should also be in RAM?
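
To illustrate the delta vInt serialization mentioned above, here is a minimal, self-contained sketch.  It is not taken from the patch: the class name DeltaVIntSketch and its methods are hypothetical, it covers only doc + tf (positions and payloads are left out), and the exact byte layout used by the codec is not shown in this thread.  The vInt scheme assumed here is the usual one: 7 data bits per byte, low-order bits first, with the high bit set when more bytes follow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not the codec's actual code or layout.
final class DeltaVIntSketch {

    // Write an int as a vInt: 7 data bits per byte, high bit = "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Read back one vInt written by writeVInt.
    static int readVInt(ByteArrayInputStream in) {
        int b = in.read();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.read();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Serialize (docID delta, term frequency) pairs into one byte[].
    // docIds must be sorted ascending so that every delta is non-negative.
    static byte[] encode(int[] docIds, int[] freqs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastDoc = 0;
        for (int i = 0; i < docIds.length; i++) {
            writeVInt(out, docIds[i] - lastDoc); // store the gap, not the absolute docID
            writeVInt(out, freqs[i]);
            lastDoc = docIds[i];
        }
        return out.toByteArray();
    }

    // Decode 'count' pairs; returns { docIds, freqs }.
    static int[][] decode(byte[] buf, int count) {
        ByteArrayInputStream in = new ByteArrayInputStream(buf);
        int[] docIds = new int[count];
        int[] freqs = new int[count];
        int lastDoc = 0;
        for (int i = 0; i < count; i++) {
            lastDoc += readVInt(in); // add the gap back to recover the docID
            docIds[i] = lastDoc;
            freqs[i] = readVInt(in);
        }
        return new int[][] { docIds, freqs };
    }
}
```

With docIDs {3, 7, 300} and frequencies {1, 2, 1}, encode produces 7 bytes: every value fits in one byte except the gap 293 (300 - 7), which needs two.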



[jira] [Commented] (LUCENE-3209) Memory codec

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050793#comment-13050793 ] 

Michael McCandless commented on LUCENE-3209:
--------------------------------------------

To clarify: this codec stores postings on disk, but then on read (for searching) it loads the full byte[] from disk into RAM.



[jira] [Resolved] (LUCENE-3209) Memory codec

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-3209.
----------------------------------------

    Resolution: Fixed



[jira] [Updated] (LUCENE-3209) Memory codec

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3209:
---------------------------------------

    Attachment: LUCENE-3209.patch

Patch; I think it's working and ready to commit.  All tests pass with it, and I disabled it for the same tests that avoid the SimpleText codec.



[jira] [Commented] (LUCENE-3209) Memory codec

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050909#comment-13050909 ] 

Simon Willnauer commented on LUCENE-3209:
-----------------------------------------

This seems to be related to LUCENE-3069, right?



[jira] [Commented] (LUCENE-3209) Memory codec

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050912#comment-13050912 ] 

Dawid Weiss commented on LUCENE-3209:
-------------------------------------

Looks like a related thing to me.
