Posted to dev@lucene.apache.org by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org> on 2012/02/01 18:28:58 UTC
[jira] [Updated] (LUCENE-3729) Allow using FST to hold terms data in DocValues.BYTES_*_SORTED
[ https://issues.apache.org/jira/browse/LUCENE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3729:
---------------------------------------
Attachment: LUCENE-3729.patch
New patch; still just prototyping on FieldCache (FC), but all tests now pass.
I enabled packing, and the Wikipedia title data is now ~43.8% smaller than what FC uses today (PagedBytes + PackedInts).
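For readers unfamiliar with the baseline being compared against, here is a rough sketch of the general idea behind a "concatenated bytes plus offsets" layout for sorted terms. It is a simplified stand-in and not Lucene's actual PagedBytes or PackedInts code: the class name SortedTermsSketch and its methods are hypothetical, the offsets are a plain int[] rather than bit-packed, and the input is assumed to be already sorted in UTF-8 byte order.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not a Lucene class. */
final class SortedTermsSketch {
    private final byte[] bytes;  // all terms concatenated in sorted order
    private final int[] offsets; // offsets[ord] = start of term ord; last entry = bytes.length

    /** Assumes sortedTerms is already sorted in UTF-8 byte order. */
    SortedTermsSketch(String[] sortedTerms) {
        offsets = new int[sortedTerms.length + 1];
        byte[][] encoded = new byte[sortedTerms.length][];
        int total = 0;
        for (int i = 0; i < sortedTerms.length; i++) {
            encoded[i] = sortedTerms[i].getBytes(StandardCharsets.UTF_8);
            offsets[i] = total;
            total += encoded[i].length;
        }
        offsets[sortedTerms.length] = total;
        bytes = new byte[total];
        for (int i = 0; i < encoded.length; i++) {
            System.arraycopy(encoded[i], 0, bytes, offsets[i], encoded[i].length);
        }
    }

    int size() {
        return offsets.length - 1;
    }

    /** ord -> term: slice the shared byte array between two offsets. */
    String lookup(int ord) {
        int start = offsets[ord];
        return new String(bytes, start, offsets[ord + 1] - start, StandardCharsets.UTF_8);
    }

    /** term -> ord by binary search; returns -(insertionPoint) - 1 if absent. */
    int ordOf(String term) {
        byte[] target = term.getBytes(StandardCharsets.UTF_8);
        int lo = 0;
        int hi = size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = compare(mid, target);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    /** Unsigned byte-wise comparison of the stored term at ord against target. */
    private int compare(int ord, byte[] target) {
        int start = offsets[ord];
        int len = offsets[ord + 1] - start;
        int n = Math.min(len, target.length);
        for (int i = 0; i < n; i++) {
            int diff = (bytes[start + i] & 0xFF) - (target[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return len - target.length;
    }
}
```

In a layout like this every term is stored in full, so shared prefixes and suffixes are repeated. An FST can share them, which is one plausible source of the size reduction reported above.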
Results are about the same as before:
{noformat}
Task             QPS base  StdDev  QPS patch  StdDev  Pct diff
PKLookup           127.87    2.68     115.22    5.37  -15% - -3%
TermTitleSort       69.77    4.32      64.91    2.62  -15% - 3%
TermBGroup1M        35.85    1.03      34.49    0.92  -8% - 1%
TermGroup1M         26.58    0.72      26.13    0.38  -5% - 2%
Respell             75.04    2.63      74.28    1.33  -6% - 4%
Fuzzy1              86.35    1.93      86.27    1.34  -3% - 3%
Phrase              18.92    0.57      18.94    0.57  -5% - 6%
SpanNear             1.46    0.02       1.46    0.05  -4% - 5%
SloppyPhrase        15.85    0.69      15.93    0.69  -7% - 9%
Fuzzy2              31.37    0.61      31.65    0.53  -2% - 4%
TermBGroup1M1P      44.99    1.32      45.47    0.74  -3% - 5%
AndHighMed          40.22    1.00      41.43    0.32  0% - 6%
Wildcard            26.11    1.15      27.15    0.21  -1% - 9%
OrHighHigh           6.14    0.42       6.40    0.34  -7% - 17%
OrHighMed           10.65    0.72      11.10    0.60  -7% - 17%
AndHighHigh          9.16    0.33       9.56    0.04  0% - 8%
Prefix3             43.07    2.32      45.34    0.55  -1% - 12%
Term                34.11    1.60      36.09    1.19  -2% - 14%
IntNRQ               7.66    0.64       8.22    0.57  -7% - 25%
{noformat}
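To help read the last column: assuming the numeric columns are baseline QPS, baseline StdDev, patched QPS, patched StdDev, and a percent-change range, the printed ranges appear consistent with comparing the two runs at one standard deviation each way and truncating to whole percents. The sketch below is an inference from the numbers above, not the benchmark tool's actual code; the class and method names are hypothetical.

```java
/** Hypothetical illustration of how the "Pct diff" range could be derived. */
final class PctDiffRangeSketch {
    /**
     * Returns {low, high} percent change of patch vs. base, truncated toward zero.
     * low:  patch one StdDev slower against base one StdDev faster.
     * high: patch one StdDev faster against base one StdDev slower.
     */
    static int[] range(double baseQps, double baseStdDev, double patchQps, double patchStdDev) {
        double low = (patchQps - patchStdDev) / (baseQps + baseStdDev) - 1.0;
        double high = (patchQps + patchStdDev) / (baseQps - baseStdDev) - 1.0;
        return new int[] {(int) (low * 100.0), (int) (high * 100.0)};
    }
}
```

For the PKLookup row, range(127.87, 2.68, 115.22, 5.37) gives -15 and -3, and for IntNRQ, range(7.66, 0.64, 8.22, 0.57) gives -7 and 25, matching the table. Rows with very small QPS values can differ by a point because the displayed inputs are rounded to two decimals. A range that straddles zero, as most do here, suggests the difference is within the noise.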
> Allow using FST to hold terms data in DocValues.BYTES_*_SORTED
> --------------------------------------------------------------
>
> Key: LUCENE-3729
> URL: https://issues.apache.org/jira/browse/LUCENE-3729
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Attachments: LUCENE-3729.patch, LUCENE-3729.patch, LUCENE-3729.patch
>
>