
Posted to dev@lucene.apache.org by "Simon Willnauer (Created) (JIRA)" <ji...@apache.org> on 2011/12/14 15:07:30 UTC

[jira] [Created] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Speed up SegementDocsEnum by making it more friendly for JIT optimizations
--------------------------------------------------------------------------

                 Key: LUCENE-3648
                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
             Project: Lucene - Java
          Issue Type: Improvement
          Components: core/codecs, core/search
    Affects Versions: 4.0
            Reporter: Simon Willnauer
             Fix For: 4.0


Since we moved bulk reading into the codec (i.e., made all bulk reading codec-private) in LUCENE-3584, we have seen some performance [regression|http://people.apache.org/~mikemccand/lucenebench/Term.html] on different CPUs. I tried to optimize the implementation to make it more eligible for runtime optimizations: I made the loops JIT-friendly by moving branches out of them where I can, minimized member access in all loops, used final members where possible, and specialized the two common cases, with and without LiveDocs.

I will attach a patch and my benchmark results in a minute.
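
For readers without the patch at hand, the techniques listed in the description can be sketched roughly as follows. This is a simplified, hypothetical illustration and not the code from LUCENE-3648.patch: the names DocsEnumSketch, DocIterator, AllLive, WithLiveDocs and open are made up, and java.util.BitSet stands in for Lucene's live-docs bits. It shows the choice between the two cases being made once outside the hot loop, final members, and members copied into locals before looping.

```java
import java.util.BitSet;

/**
 * Hypothetical, simplified sketch; NOT the code from LUCENE-3648.patch.
 * java.util.BitSet stands in for Lucene's live-docs bits.
 */
final class DocsEnumSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    interface DocIterator {
        int nextDoc();
    }

    /** Picks the specialized implementation once, outside the hot loop. */
    static DocIterator open(int[] docs, BitSet liveDocs) {
        return liveDocs == null ? new AllLive(docs) : new WithLiveDocs(docs, liveDocs);
    }

    /** No deletions: there is no liveDocs branch at all. */
    static final class AllLive implements DocIterator {
        private final int[] docs;
        private int pos;

        AllLive(int[] docs) {
            this.docs = docs;
        }

        @Override
        public int nextDoc() {
            final int p = pos;
            if (p < docs.length) {
                pos = p + 1;
                return docs[p];
            }
            return NO_MORE_DOCS;
        }
    }

    /** Deletions present: liveDocs is known to be non-null, so the loop has no null check. */
    static final class WithLiveDocs implements DocIterator {
        private final int[] docs;
        private final BitSet liveDocs;
        private int pos;

        WithLiveDocs(int[] docs, BitSet liveDocs) {
            this.docs = docs;
            this.liveDocs = liveDocs;
        }

        @Override
        public int nextDoc() {
            // Copy members into locals so the loop body reads no fields.
            final int[] docs = this.docs;
            final BitSet liveDocs = this.liveDocs;
            final int limit = docs.length;
            int p = pos;
            while (p < limit) {
                final int doc = docs[p++];
                if (liveDocs.get(doc)) {
                    pos = p; // write the position back once, on exit
                    return doc;
                }
            }
            pos = p;
            return NO_MORE_DOCS;
        }
    }

    public static void main(String[] args) {
        BitSet live = new BitSet();
        live.set(1);
        live.set(5);
        live.set(8); // doc 3 is treated as deleted
        DocIterator it = open(new int[] {1, 3, 5, 8}, live);
        // Prints 1, 5 and 8, one per line.
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            System.out.println(doc);
        }
    }
}
```

Whether such a rewrite pays off depends on the JVM and the CPU, so it only makes sense together with benchmark numbers.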

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] [Issue Comment Edited] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169383#comment-13169383 ] 

Simon Willnauer edited comment on LUCENE-3648 at 12/14/11 2:24 PM:
-------------------------------------------------------------------

Here are my benchmark results with that patch:


Multi-segment index (20 segments, 10M medium wiki documents), no deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|Respell|41.27|1.79|40.62|1.94|-10% - 7%|
|PKLookup|89.89|5.44|88.82|5.52|-12% - 11%|
|Fuzzy2|27.21|1.31|27.04|1.31|-9% - 9%|
|Wildcard|21.42|1.13|21.45|0.90|-8% - 10%|
|SpanNear|3.39|0.13|3.41|0.16|-7% - 9%|
|Fuzzy1|53.48|2.49|53.77|2.31|-8% - 9%|
|SloppyPhrase|2.67|0.11|2.68|0.13|-8% - 10%|
|Phrase|8.98|0.67|9.07|0.73|-13% - 17%|
|Prefix3|16.52|1.04|16.97|0.90|-8% - 15%|
|TermGroup1M|18.50|0.44|19.13|0.45|-1% - 8%|
|TermBGroup1M|24.51|0.57|25.40|0.63|-1% - 8%|
|TermBGroup1M1P|14.92|0.42|15.57|0.55|-2% - 11%|
|AndHighHigh|5.78|0.37|6.15|0.46|-7% - 22%|
|OrHighMed|9.22|0.37|9.85|0.45|-1% - 16%|
|IntNRQ|6.82|0.75|7.32|0.74|-13% - 32%|
|OrHighHigh|5.88|0.27|6.35|0.32|-1% - 18%|
|Term|68.03|3.73|73.70|5.76|-5% - 23%|
|AndHighMed|20.10|1.64|21.78|2.22|-10% - 30%|


Multi-segment index (20 segments, 10M medium wiki documents), with deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|PKLookup|90.52|4.60|89.71|3.57|-9% - 8%|
|SpanNear|7.65|0.27|7.65|0.27|-6% - 7%|
|SloppyPhrase|10.96|0.47|10.98|0.49|-8% - 9%|
|Respell|48.17|2.39|48.24|2.33|-9% - 10%|
|Wildcard|20.11|0.68|20.16|0.79|-6% - 7%|
|Prefix3|18.27|0.80|18.33|0.89|-8% - 10%|
|Fuzzy1|59.70|2.99|60.08|2.71|-8% - 10%|
|Phrase|3.00|0.19|3.03|0.20|-11% - 14%|
|Fuzzy2|35.06|1.28|35.54|1.10|-5% - 8%|
|TermGroup1M|18.84|0.40|19.21|0.42|-2% - 6%|
|TermBGroup1M|26.25|0.58|26.83|0.58|-2% - 6%|
|AndHighHigh|6.96|0.51|7.21|0.55|-10% - 20%|
|AndHighMed|38.46|2.84|39.83|3.13|-11% - 20%|
|TermBGroup1M1P|8.02|0.23|8.30|0.29|-2% - 10%|
|IntNRQ|5.14|0.36|5.32|0.45|-11% - 20%|
|Term|62.76|2.66|65.05|3.33|-5% - 13%|
|OrHighMed|8.43|0.39|8.75|0.40|-5% - 13%|
|OrHighHigh|5.30|0.23|5.53|0.25|-4% - 14%|


Executed on Linux with Java version "1.6.0_26"; command line: java -XX:BiasedLockingStartupDelay=0 -Xms2g -Xmx2g -server
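
A note on reading the "Pct diff" column: the ranges in these tables are consistent with comparing the patch QPS shifted by one standard deviation against the trunk QPS shifted by one standard deviation in the opposite direction, truncated to a whole percent. That formula is an assumption inferred from the numbers above, not taken from the benchmark tool itself. The class and method names below are made up for illustration.

```java
/**
 * Sketch of how the "Pct diff" range could be derived. The formula is an
 * assumption inferred from the table values, not taken from the benchmark tool.
 */
final class PctDiffSketch {

    /** Pessimistic bound: patch one stddev slower versus trunk one stddev faster. */
    static int worstCase(double qpsTrunk, double sdTrunk, double qpsPatch, double sdPatch) {
        return (int) (100.0 * ((qpsPatch - sdPatch) / (qpsTrunk + sdTrunk) - 1.0));
    }

    /** Optimistic bound: patch one stddev faster versus trunk one stddev slower. */
    static int bestCase(double qpsTrunk, double sdTrunk, double qpsPatch, double sdPatch) {
        return (int) (100.0 * ((qpsPatch + sdPatch) / (qpsTrunk - sdTrunk) - 1.0));
    }

    public static void main(String[] args) {
        // Term row of the "no deletes" table: 68.03 +/- 3.73 (trunk) vs. 73.70 +/- 5.76 (patch).
        int worst = worstCase(68.03, 3.73, 73.70, 5.76);
        int best = bestCase(68.03, 3.73, 73.70, 5.76);
        System.out.println(worst + "% - " + best + "%"); // prints "-5% - 23%"
    }
}
```

Read this way, a range that straddles zero (as most rows here do) means the difference is within the measured noise.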


                

[jira] [Updated] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3648:
------------------------------------

    Attachment: LUCENE-3648.patch

Latest patch. I plan to commit this tomorrow if nobody objects.
                

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169383#comment-13169383 ] 

Simon Willnauer commented on LUCENE-3648:
-----------------------------------------

Here are my benchmark results with that patch:


Multi-segment index (20 segments, 10M medium wiki documents), no deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|Respell|41.27|1.79|40.62|1.94|-10% - 7%|
|PKLookup|89.89|5.44|88.82|5.52|-12% - 11%|
|Fuzzy2|27.21|1.31|27.04|1.31|-9% - 9%|
|Wildcard|21.42|1.13|21.45|0.90|-8% - 10%|
|SpanNear|3.39|0.13|3.41|0.16|-7% - 9%|
|Fuzzy1|53.48|2.49|53.77|2.31|-8% - 9%|
|SloppyPhrase|2.67|0.11|2.68|0.13|-8% - 10%|
|Phrase|8.98|0.67|9.07|0.73|-13% - 17%|
|Prefix3|16.52|1.04|16.97|0.90|-8% - 15%|
|TermGroup1M|18.50|0.44|19.13|0.45|-1% - 8%|
|TermBGroup1M|24.51|0.57|25.40|0.63|-1% - 8%|
|TermBGroup1M1P|14.92|0.42|15.57|0.55|-2% - 11%|
|AndHighHigh|5.78|0.37|6.15|0.46|-7% - 22%|
|OrHighMed|9.22|0.37|9.85|0.45|-1% - 16%|
|IntNRQ|6.82|0.75|7.32|0.74|-13% - 32%|
|OrHighHigh|5.88|0.27|6.35|0.32|-1% - 18%|
|Term|68.03|3.73|73.70|5.76|-5% - 23%|
|AndHighMed|20.10|1.64|21.78|2.22|-10% - 30%|


Multi-segment index (20 segments, 10M medium wiki documents), with deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|Respell|41.27|1.79|40.62|1.94|-10% - 7%|
|PKLookup|89.89|5.44|88.82|5.52|-12% - 11%|
|Fuzzy2|27.21|1.31|27.04|1.31|-9% - 9%|
|Wildcard|21.42|1.13|21.45|0.90|-8% - 10%|
|SpanNear|3.39|0.13|3.41|0.16|-7% - 9%|
|Fuzzy1|53.48|2.49|53.77|2.31|-8% - 9%|
|SloppyPhrase|2.67|0.11|2.68|0.13|-8% - 10%|
|Phrase|8.98|0.67|9.07|0.73|-13% - 17%|
|Prefix3|16.52|1.04|16.97|0.90|-8% - 15%|
|TermGroup1M|18.50|0.44|19.13|0.45|-1% - 8%|
|TermBGroup1M|24.51|0.57|25.40|0.63|-1% - 8%|
|TermBGroup1M1P|14.92|0.42|15.57|0.55|-2% - 11%|
|AndHighHigh|5.78|0.37|6.15|0.46|-7% - 22%|
|OrHighMed|9.22|0.37|9.85|0.45|-1% - 16%|
|IntNRQ|6.82|0.75|7.32|0.74|-13% - 32%|
|OrHighHigh|5.88|0.27|6.35|0.32|-1% - 18%|
|Term|68.03|3.73|73.70|5.76|-5% - 23%|
|AndHighMed|20.10|1.64|21.78|2.22|-10% - 30%|

Executed on Linux with Java version "1.6.0_26"; command line: java -XX:BiasedLockingStartupDelay=0 -Xms2g -Xmx2g -server


                

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170844#comment-13170844 ] 

Uwe Schindler commented on LUCENE-3648:
---------------------------------------

Cool, thanks! I will test this now with Java 6 AggressiveOpts and/or Java 7, just to be sure. LOL :-)
                

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170189#comment-13170189 ] 

Michael McCandless commented on LUCENE-3648:
--------------------------------------------

OK, I ran again with a more modern JVM (Java 1.7.0_01), and the results are better!

Deletes:
{noformat}
                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
             Respell       62.87        2.65       61.37        2.86  -10% -    6%
              Fuzzy2       56.94        2.00       56.50        2.46   -8% -    7%
              Phrase        8.71        0.50        8.67        0.44  -10% -   10%
              Fuzzy1       76.38        3.10       76.07        3.20   -8% -    8%
         TermGroup1M       35.08        0.55       35.05        0.88   -4% -    4%
        SloppyPhrase       11.80        0.57       11.80        0.68  -10% -   11%
          AndHighMed       53.93        3.78       54.00        1.92   -9% -   11%
         AndHighHigh       16.33        0.72       16.37        0.65   -7% -    9%
      TermBGroup1M1P       13.69        0.55       13.76        0.61   -7% -    9%
            SpanNear        4.31        0.13        4.35        0.11   -4% -    6%
        TermBGroup1M        8.95        0.29        9.10        0.26   -4% -    7%
            PKLookup      161.50        5.00      165.99        8.45   -5% -   11%
           OrHighMed       16.34        0.44       17.15        0.70   -1% -   12%
            Wildcard       25.03        0.09       26.38        1.46    0% -   11%
                Term       58.62        2.18       61.85        3.05   -3% -   14%
          OrHighHigh       10.87        0.30       11.48        0.46   -1% -   12%
             Prefix3       47.57        0.42       50.88        2.93    0% -   14%
              IntNRQ        6.92        0.37        7.66        0.85   -6% -   29%
{noformat}


No deletes:
{noformat}
                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
              Phrase        4.69        0.18        4.52        0.18  -10% -    4%
        SloppyPhrase        6.79        0.25        6.62        0.19   -8% -    3%
            SpanNear        2.08        0.02        2.05        0.04   -4% -    1%
              IntNRQ        9.80        0.36        9.69        1.04  -14% -   13%
        TermBGroup1M       42.63        1.09       42.94        2.17   -6% -    8%
             Respell       70.38        0.45       71.29        1.66   -1% -    4%
              Fuzzy1       64.88        0.61       65.77        2.15   -2% -    5%
         AndHighHigh       10.58        0.30       10.74        0.28   -3% -    7%
          AndHighMed       71.25        1.58       72.52        1.73   -2% -    6%
            Wildcard       24.29        0.38       24.72        0.87   -3% -    7%
              Fuzzy2       46.45        0.61       47.29        1.54   -2% -    6%
         TermGroup1M       32.16        0.66       32.75        0.72   -2% -    6%
           OrHighMed       13.65        0.40       13.96        0.34   -3% -    7%
             Prefix3       63.00        1.08       64.41        2.59   -3% -    8%
          OrHighHigh        5.22        0.15        5.34        0.13   -3% -    8%
      TermBGroup1M1P       52.66        0.61       53.99        1.42   -1% -    6%
            PKLookup      159.98        9.36      168.62        4.83   -3% -   15%
                Term       67.51        3.42       71.42        3.13   -3% -   16%
{noformat}

                

[jira] [Resolved] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-3648.
-------------------------------------

    Resolution: Fixed
    

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170182#comment-13170182 ] 

Michael McCandless commented on LUCENE-3648:
--------------------------------------------

This time I ran 20 iterations each for base and patch (40 total).

Deletes:
{noformat}
                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
                Term       61.01        3.12       58.09        2.73  -13% -    5%
         AndHighHigh       14.18        0.73       13.53        0.50  -12% -    4%
             Prefix3       54.40        2.36       52.28        2.28  -11% -    4%
              IntNRQ        7.82        0.51        7.53        0.43  -14% -    8%
            Wildcard       27.19        1.15       26.24        1.07  -11% -    4%
          AndHighMed       67.68        3.96       65.42        3.37  -13% -    7%
           OrHighMed       12.38        0.68       12.01        0.89  -14% -   10%
          OrHighHigh        6.94        0.38        6.73        0.49  -14% -   10%
        TermBGroup1M       39.01        1.02       37.91        0.96   -7% -    2%
      TermBGroup1M1P       35.65        1.34       34.75        1.08   -8% -    4%
         TermGroup1M       31.31        0.69       30.67        0.64   -6% -    2%
            SpanNear        3.36        0.13        3.32        0.13   -8% -    6%
              Phrase        6.64        0.50        6.56        0.43  -14% -   13%
            PKLookup      160.01        5.21      158.48        7.02   -8% -    6%
        SloppyPhrase        6.52        0.29        6.49        0.25   -8% -    8%
              Fuzzy1       52.51        1.90       52.32        2.79   -8% -    8%
              Fuzzy2       41.82        2.28       42.34        2.85  -10% -   14%
             Respell       75.82        5.03       77.18        5.76  -11% -   17%
{noformat}

No deletes:
{noformat}
                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
             Respell       70.35        5.66       69.88        4.87  -14% -   15%
              Phrase        2.10        0.13        2.10        0.13  -11% -   13%
              Fuzzy2       43.46        3.22       43.51        2.68  -12% -   14%
             Prefix3       28.88        2.03       28.94        1.58  -11% -   13%
            Wildcard       42.65        2.26       42.80        1.90   -8% -   10%
            PKLookup      156.39        6.43      157.10        4.92   -6% -    8%
              Fuzzy1       70.51        4.15       71.31        3.16   -8% -   12%
      TermBGroup1M1P       12.29        0.67       12.44        0.57   -8% -   11%
         TermGroup1M       30.92        1.01       31.30        0.56   -3% -    6%
           OrHighMed       17.84        1.17       18.08        0.68   -8% -   12%
              IntNRQ        8.05        0.76        8.16        0.69  -15% -   21%
        SloppyPhrase       19.02        0.87       19.31        1.01   -7% -   11%
        TermBGroup1M       43.87        1.44       44.61        1.00   -3% -    7%
            SpanNear        2.38        0.15        2.43        0.15   -9% -   15%
         AndHighHigh       15.25        1.06       15.57        0.52   -7% -   13%
          AndHighMed       50.55        3.17       51.80        1.77   -6% -   13%
          OrHighHigh        5.99        0.40        6.14        0.25   -7% -   14%
                Term       95.84        6.18       98.48        4.81   -8% -   15%
{noformat}

                

[jira] [Assigned] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Assigned) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-3648:
---------------------------------------

    Assignee: Simon Willnauer
    

[jira] [Updated] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3648:
------------------------------------

    Attachment: LUCENE-3648.patch

Here is a new patch that gives me more stable results.

without deletes:
{code}
                Task   QPS trunk StdDev trunk QPS patch StdDev patch      Pct diff
        SloppyPhrase        3.97        0.20        3.88        0.25  -12% -    9%
              Phrase       13.93        0.73       13.67        0.82  -12% -    9%
             Respell       45.64        1.99       45.08        2.32  -10% -    8%
              Fuzzy2       20.12        0.86       20.05        1.07   -9% -    9%
            SpanNear        5.97        0.27        5.98        0.21   -7% -    8%
              Fuzzy1       54.91        2.14       55.08        2.54   -7% -    9%
            PKLookup       88.94        5.86       90.98        4.66   -8% -   15%
         TermGroup1M       17.59        0.25       18.07        0.25    0% -    5%
            Wildcard       42.64        1.22       43.86        1.08   -2% -    8%
      TermBGroup1M1P       37.90        0.93       39.22        0.35    0% -    7%
        TermBGroup1M       13.14        0.24       13.64        0.18    0% -    7%
             Prefix3       32.60        1.14       34.01        0.90   -1% -   10%
              IntNRQ        5.06        0.45        5.36        0.40   -9% -   24%
          AndHighMed       32.92        1.87       35.00        2.26   -5% -   19%
         AndHighHigh        7.63        0.41        8.14        0.44   -4% -   18%
                Term       70.92        3.59       76.52        4.56   -3% -   20%
           OrHighMed        6.89        0.25        7.44        0.29    0% -   16%
          OrHighHigh        3.41        0.13        3.70        0.15    0% -   17%
{code}

with deletes:
{code}

                Task   QPS trunk StdDev trunk QPS patch StdDev patch      Pct diff
            PKLookup       92.65        5.06       91.04        4.36  -11% -    8%
         AndHighHigh        9.55        0.58        9.54        0.43  -10% -   11%
             Respell       47.56        2.19       47.60        2.22   -8% -    9%
              Fuzzy2       44.42        2.15       44.54        2.09   -8% -   10%
        SloppyPhrase        3.66        0.17        3.68        0.19   -8% -   10%
          AndHighMed       18.79        1.69       18.90        1.04  -12% -   16%
              Fuzzy1       51.89        2.44       52.25        2.30   -8% -   10%
      TermBGroup1M1P       26.23        0.61       26.57        0.63   -3% -    6%
              Phrase        9.25        0.72        9.38        0.69  -12% -   17%
            SpanNear        2.86        0.13        2.90        0.11   -6% -   10%
        TermBGroup1M       27.50        0.62       27.90        0.73   -3% -    6%
         TermGroup1M       25.48        0.58       25.87        0.59   -2% -    6%
            Wildcard       19.02        0.70       19.43        0.67   -4% -    9%
                Term       41.22        1.70       42.15        1.94   -6% -   11%
              IntNRQ        4.52        0.27        4.63        0.33  -10% -   16%
             Prefix3       18.09        0.77       18.59        0.70   -5% -   11%
           OrHighMed        9.09        0.42        9.70        0.34   -1% -   15%
          OrHighHigh        6.56        0.29        7.00        0.23   -1% -   15%
{code}
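
A note on reading the "Pct diff" column: the thread does not state how the range is computed, but the numbers in these tables are consistent with a conservative comparison in which the worst case pits the patch at (mean - StdDev) against trunk at (mean + StdDev), the best case does the reverse, and the result is truncated to a whole percent. The sketch below is an inference from the table values, not the benchmark tool's code; the class and method names are made up for illustration.

```java
/** Hypothetical helper; the formula is inferred from the table values, not taken from the benchmark tool. */
final class PctDiffSketch {
  private PctDiffSketch() {}

  /** Worst case for the patch: patch at (mean - stddev) versus trunk at (mean + stddev). */
  static int worstCase(double qpsBase, double sdBase, double qpsPatch, double sdPatch) {
    final double baseHigh = qpsBase + sdBase;
    // The int cast truncates toward zero, which is what the tables appear to do.
    return (int) (100.0 * ((qpsPatch - sdPatch) - baseHigh) / baseHigh);
  }

  /** Best case for the patch: patch at (mean + stddev) versus trunk at (mean - stddev). */
  static int bestCase(double qpsBase, double sdBase, double qpsPatch, double sdPatch) {
    final double baseLow = qpsBase - sdBase;
    return (int) (100.0 * ((qpsPatch + sdPatch) - baseLow) / baseLow);
  }

  public static void main(String[] args) {
    // SloppyPhrase row of the "without deletes" table: trunk 3.97 +/- 0.20, patch 3.88 +/- 0.25.
    System.out.printf("%4d%% - %4d%%%n",
        worstCase(3.97, 0.20, 3.88, 0.25),
        bestCase(3.97, 0.20, 3.88, 0.25));
  }
}
```

Read this way, a range that straddles zero (as most rows here do) means the difference is within the run-to-run noise, while a range that stays at or above 0% suggests a real gain.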
                
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch, LUCENE-3648.patch

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170194#comment-13170194 ] 

Simon Willnauer commented on LUCENE-3648:
-----------------------------------------

I ran my tests with 1.6.0_21 and I see the same results as Mike. The outcome depends heavily on the JDK, but since the patch improves performance on newer JDKs, I think we should commit it. I will put up my latest cleanups in a second.

                
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch, LUCENE-3648.patch

[jira] [Updated] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3648:
------------------------------------

    Attachment: LUCENE-3648.patch

Here is a first patch applying the optimizations described above. I specialized SegmentDocsEnum into NoDeletesSegmentDocsEnum and DeletesSegmentDocsEnum (now that I think about it, they should probably be named NoLiveDocs / LiveDocs) and slightly changed how we reuse the DocsEnum. This patch only reuses an instance if its startFreqIn matches the codec's freqIn (by identity) AND reuse.liveDocs == liveDocs. Benchmark results follow in a second.
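
To illustrate the idea (this is not the code from the attached patch), here is a minimal self-contained sketch of the specialization. The liveDocs check is decided once when the enum is created, so the no-deletes loop contains no branch at all, and both loops copy members into locals before iterating. Every name below is hypothetical: LiveDocsSketch stands in for a live-docs bit set, and a delta-encoded int[] stands in for the codec's freq input.

```java
/** Hypothetical stand-in for a live-docs bit set; this is not Lucene's Bits interface. */
interface LiveDocsSketch {
  boolean get(int docId);
}

/** Base class; the delta-encoded int[] stands in for the codec's freq input. */
abstract class DocsEnumSketch {
  protected final int[] deltas;            // final member, never reassigned
  protected final LiveDocsSketch liveDocs; // null when the segment has no deletions
  protected int upto;                      // next position in deltas
  protected int doc;                       // last decoded doc ID

  DocsEnumSketch(int[] deltas, LiveDocsSketch liveDocs) {
    this.deltas = deltas;
    this.liveDocs = liveDocs;
  }

  /** Decodes doc IDs into out and returns how many were written. */
  abstract int fill(int[] out);

  /** Reuse only when the input and the live docs are the very same objects (identity). */
  final boolean canReuse(int[] deltas, LiveDocsSketch liveDocs) {
    return this.deltas == deltas && this.liveDocs == liveDocs;
  }

  /** The null check happens once here instead of once per document. */
  static DocsEnumSketch create(int[] deltas, LiveDocsSketch liveDocs) {
    return liveDocs == null
        ? new NoLiveDocsEnumSketch(deltas)
        : new LiveDocsEnumSketch(deltas, liveDocs);
  }
}

/** No deletions: the loop body is branch-free and touches only locals. */
final class NoLiveDocsEnumSketch extends DocsEnumSketch {
  NoLiveDocsEnumSketch(int[] deltas) {
    super(deltas, null);
  }

  @Override
  int fill(int[] out) {
    final int[] deltas = this.deltas; // member access hoisted out of the loop
    int upto = this.upto;
    int doc = this.doc;
    final int count = Math.min(out.length, deltas.length - upto);
    for (int i = 0; i < count; i++) {
      doc += deltas[upto++];
      out[i] = doc;
    }
    this.upto = upto;                 // write state back once, after the loop
    this.doc = doc;
    return count;
  }
}

/** With deletions: same structure, plus one liveDocs test per document. */
final class LiveDocsEnumSketch extends DocsEnumSketch {
  LiveDocsEnumSketch(int[] deltas, LiveDocsSketch liveDocs) {
    super(deltas, liveDocs);
  }

  @Override
  int fill(int[] out) {
    final int[] deltas = this.deltas;
    final LiveDocsSketch liveDocs = this.liveDocs;
    final int limit = deltas.length;
    int upto = this.upto;
    int doc = this.doc;
    int count = 0;
    while (count < out.length && upto < limit) {
      doc += deltas[upto++];
      if (liveDocs.get(doc)) {        // skip deleted documents
        out[count++] = doc;
      }
    }
    this.upto = upto;
    this.doc = doc;
    return count;
  }
}
```

For example, DocsEnumSketch.create(new int[] {2, 3, 1, 4}, null).fill(out) decodes the doc IDs 2, 5, 6, 10, and passing d -> d != 5 as the live docs drops doc 5 from that result. Whether this shape actually helps depends on the JVM, which matches the JDK-dependent results reported in this thread.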
                
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169519#comment-13169519 ] 

Robert Muir commented on LUCENE-3648:
-------------------------------------

Here are mine (remember, though, that I have the crazy CPU/setup where LUCENE-3584 speeds up term/or queries):

No deletes:
{noformat}
                Task   QPS trunk StdDev trunk   QPS patch StdDev patch      Pct diff
        SloppyPhrase        7.28        0.35        7.20        0.32   -9% -    8%
            PKLookup      186.27        5.44      184.56        4.65   -6% -    4%
              Fuzzy2       69.35        3.33       69.21        2.73   -8% -    8%
            SpanNear        7.50        0.63        7.49        0.64  -15% -   18%
              Phrase        9.88        0.70        9.87        0.80  -14% -   16%
              Fuzzy1       86.95        4.40       86.90        3.43   -8% -    9%
             Respell       74.25        4.15       74.28        3.72   -9% -   11%
         TermGroup1M       41.97        0.89       42.01        0.91   -4% -    4%
        TermBGroup1M       49.25        1.21       49.49        1.34   -4% -    5%
      TermBGroup1M1P       70.82        2.56       71.18        3.26   -7% -    9%
          OrHighHigh        7.83        0.47        7.93        0.48  -10% -   14%
           OrHighMed       12.60        0.72       12.77        0.78  -10% -   14%
            Wildcard       41.44        3.12       42.11        2.43  -10% -   16%
             Prefix3       24.57        2.16       25.32        1.76  -11% -   20%
         AndHighHigh       18.67        0.97       19.26        1.11   -7% -   15%
                Term       87.72        5.04       90.58        5.92   -8% -   16%
          AndHighMed       64.62        2.86       68.41        4.52   -5% -   18%
              IntNRQ        8.96        0.98        9.81        1.12  -12% -   37%
{noformat}

Deletes:
{noformat}
                Task   QPS trunk StdDev trunk   QPS patch StdDev patch      Pct diff
                Term       67.75        5.63       66.48        3.95  -14% -   13%
        SloppyPhrase       15.67        0.67       15.41        0.80  -10% -    8%
              Phrase       17.09        1.07       16.81        1.40  -15% -   13%
         AndHighHigh        8.34        0.61        8.21        0.46  -13% -   12%
          AndHighMed       93.05        5.64       91.72        5.23  -12% -   10%
            SpanNear        6.60        0.60        6.55        0.52  -16% -   17%
        TermBGroup1M       46.87        1.79       46.57        1.63   -7% -    6%
           OrHighMed       30.00        2.22       29.82        1.86  -13% -   14%
         TermGroup1M       40.82        1.20       40.71        1.24   -6% -    5%
      TermBGroup1M1P       53.46        2.36       53.35        1.80   -7% -    7%
          OrHighHigh       12.25        0.92       12.24        0.88  -13% -   15%
             Respell       76.30        2.98       76.82        2.53   -6% -    8%
              Fuzzy2       90.90        4.35       91.73        2.60   -6% -    8%
              Fuzzy1       98.02        4.70       99.03        2.67   -6% -    8%
            PKLookup      180.72        6.85      183.44        6.45   -5% -    9%
            Wildcard       21.16        1.22       21.58        1.12   -8% -   13%
             Prefix3       47.11        2.97       48.13        2.89   -9% -   15%
              IntNRQ        5.99        0.66        6.59        0.61  -10% -   34%
{noformat}

Java is 1.6.0_24, running with -Xms1g -Xmx2g -server.
                
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch

[jira] [Issue Comment Edited] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Simon Willnauer (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169383#comment-13169383 ] 

Simon Willnauer edited comment on LUCENE-3648 at 12/14/11 2:20 PM:
-------------------------------------------------------------------

Here are my benchmark results with that patch:


Multi-segment index (20 segments, 10M medium wiki documents), no deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|Respell|41.27|1.79|40.62|1.94|-10% - 7%|
|PKLookup|89.89|5.44|88.82|5.52|-12% - 11%|
|Fuzzy2|27.21|1.31|27.04|1.31|-9% - 9%|
|Wildcard|21.42|1.13|21.45|0.90|-8% - 10%|
|SpanNear|3.39|0.13|3.41|0.16|-7% - 9%|
|Fuzzy1|53.48|2.49|53.77|2.31|-8% - 9%|
|SloppyPhrase|2.67|0.11|2.68|0.13|-8% - 10%|
|Phrase|8.98|0.67|9.07|0.73|-13% - 17%|
|Prefix3|16.52|1.04|16.97|0.90|-8% - 15%|
|TermGroup1M|18.50|0.44|19.13|0.45|-1% - 8%|
|TermBGroup1M|24.51|0.57|25.40|0.63|-1% - 8%|
|TermBGroup1M1P|14.92|0.42|15.57|0.55|-2% - 11%|
|AndHighHigh|5.78|0.37|6.15|0.46|-7% - 22%|
|OrHighMed|9.22|0.37|9.85|0.45|-1% - 16%|
|IntNRQ|6.82|0.75|7.32|0.74|-13% - 32%|
|OrHighHigh|5.88|0.27|6.35|0.32|-1% - 18%|
|Term|68.03|3.73|73.70|5.76|-5% - 23%|
|AndHighMed|20.10|1.64|21.78|2.22|-10% - 30%|


Multi-segment index (20 segments, 10M medium wiki documents), with deletes:

||Task||QPS trunk||StdDev trunk||QPS patch||StdDev patch||Pct diff||
|Respell|41.27|1.79|40.62|1.94|-10% - 7%|
|PKLookup|89.89|5.44|88.82|5.52|-12% - 11%|
|Fuzzy2|27.21|1.31|27.04|1.31|-9% - 9%|
|Wildcard|21.42|1.13|21.45|0.90|-8% - 10%|
|SpanNear|3.39|0.13|3.41|0.16|-7% - 9%|
|Fuzzy1|53.48|2.49|53.77|2.31|-8% - 9%|
|SloppyPhrase|2.67|0.11|2.68|0.13|-8% - 10%|
|Phrase|8.98|0.67|9.07|0.73|-13% - 17%|
|Prefix3|16.52|1.04|16.97|0.90|-8% - 15%|
|TermGroup1M|18.50|0.44|19.13|0.45|-1% - 8%|
|TermBGroup1M|24.51|0.57|25.40|0.63|-1% - 8%|
|TermBGroup1M1P|14.92|0.42|15.57|0.55|-2% - 11%|
|AndHighHigh|5.78|0.37|6.15|0.46|-7% - 22%|
|OrHighMed|9.22|0.37|9.85|0.45|-1% - 16%|
|IntNRQ|6.82|0.75|7.32|0.74|-13% - 32%|
|OrHighHigh|5.88|0.27|6.35|0.32|-1% - 18%|
|Term|68.03|3.73|73.70|5.76|-5% - 23%|
|AndHighMed|20.10|1.64|21.78|2.22|-10% - 30%|

Executed on Linux with Java version "1.6.0_26"; command line: java -XX:BiasedLockingStartupDelay=0 -Xms2g -Xmx2g -server


                
                  
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch

[jira] [Commented] (LUCENE-3648) Speed up SegementDocsEnum by making it more friendly for JIT optimizations

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169515#comment-13169515 ] 

Michael McCandless commented on LUCENE-3648:
--------------------------------------------

Hmm, with the current patch I get mixed results here (though I'm only doing 4 JVM iterations; the results might improve if I did 10):

No deletes:

{noformat}

                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
            PKLookup      160.82       11.26      153.11        4.65  -13% -    5%
          AndHighMed       47.23        2.46       45.37        2.32  -13% -    6%
         AndHighHigh       19.51        0.90       18.82        0.90  -12% -    5%
              Phrase        6.93        0.34        6.70        0.42  -13% -    7%
              Fuzzy2       33.42        0.49       32.82        1.10   -6% -    3%
             Respell       63.56        1.34       62.52        2.32   -7% -    4%
        SloppyPhrase        9.63        0.32        9.52        0.40   -8% -    6%
      TermBGroup1M1P       69.10        0.77       68.95        1.50   -3% -    3%
        TermBGroup1M       25.17        0.46       25.21        0.35   -3% -    3%
              Fuzzy1       83.56        1.30       83.70        2.21   -3% -    4%
            Wildcard       54.04        1.87       54.17        2.59   -7% -    8%
         TermGroup1M        9.80        0.19        9.87        0.06   -1% -    3%
             Prefix3       21.36        0.63       21.53        0.74   -5% -    7%
              IntNRQ        6.25        0.45        6.43        0.64  -13% -   21%
                Term       87.43        1.16       90.09        3.98   -2% -    9%
            SpanNear       19.63        1.21       20.28        0.54   -5% -   13%
           OrHighMed       14.33        0.19       15.44        0.31    4% -   11%
          OrHighHigh        6.75        0.10        7.30        0.15    4% -   11%
{noformat}

Deletes:
{noformat}
                Task    QPS base StdDev base   QPS patch StdDev patch      Pct diff
                Term       80.96        2.07       76.64        2.99  -11% -    0%
             Prefix3       38.28        1.20       36.64        1.61  -11% -    3%
            Wildcard       51.50        1.19       49.71        1.71   -8% -    2%
         AndHighHigh       14.04        0.86       13.56        0.45  -12% -    6%
          AndHighMed       44.42        2.45       42.90        1.57  -11% -    5%
      TermBGroup1M1P       39.19        0.92       38.26        0.96   -6% -    2%
              Phrase        8.32        0.34        8.16        0.13   -7% -    3%
          OrHighHigh        7.82        0.42        7.69        0.20   -9% -    6%
              IntNRQ        5.76        0.43        5.71        0.33  -13% -   13%
            SpanNear        1.60        0.04        1.59        0.03   -4% -    3%
           OrHighMed        5.82        0.41        5.79        0.12   -9% -    9%
        SloppyPhrase        3.28        0.05        3.26        0.07   -4% -    3%
         TermGroup1M       26.76        0.40       26.91        0.45   -2% -    3%
            PKLookup      155.58        9.40      157.09        5.50   -8% -   11%
        TermBGroup1M       42.88        1.04       43.30        1.19   -4% -    6%
              Fuzzy1       77.96        4.29       78.94        3.16   -7% -   11%
              Fuzzy2       45.14        2.58       46.23        1.62   -6% -   12%
             Respell       70.96        4.78       73.57        3.78   -7% -   16%
{noformat}

Java is 1.6.0_21, running java -Xms2g -Xmx2g -server, 10M docs multi-segment (15 segments).
                
> Speed up SegementDocsEnum by making it more friendly for JIT optimizations
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-3648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3648
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/codecs, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3648.patch