Posted to issues@lucene.apache.org by "Gautam Worah (Jira)" <ji...@apache.org> on 2021/08/01 22:13:00 UTC

[jira] [Comment Edited] (LUCENE-9918) Can PForUtil be further auto-vectorized?

    [ https://issues.apache.org/jira/browse/LUCENE-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391245#comment-17391245 ] 

Gautam Worah edited comment on LUCENE-9918 at 8/1/21, 10:12 PM:
----------------------------------------------------------------

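Hey [~gsmiller], I noticed that in the micro-benchmark code in your lucene-pfor-benchmark [repo|https://github.com/gsmiller/lucene-pfor-benchmark/blob/main/src/main/java/gsmiller/DecodeBenchmark.java#L15], the main loop runs only 10 times, I think?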

Some [sources|http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/#Output%20Interpretation] suggest that the JIT compiler usually compiles and optimizes code incrementally, as it sees a particular operation repeated many times: it first optimizes the code a little, and then some more if it sees it run again. So maybe we just need to repeat the experiment with, say, 100k iterations?
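
To make the suggestion concrete, here is a minimal sketch of what a warm-up phase could look like. It is not the code from the lucene-pfor-benchmark repo and not the actual PForUtil implementation: the class name WarmupSketch, the simple prefix-sum kernel, the array size of 128, and the 100k iteration count are all assumptions chosen for illustration.

```java
import java.util.Arrays;

public class WarmupSketch {

    // Hypothetical stand-in for the loop under test (an in-place prefix sum).
    // This is NOT the real PForUtil::prefixSumOf code.
    public static void prefixSum(long[] arr) {
        for (int i = 1; i < arr.length; i++) {
            arr[i] += arr[i - 1];
        }
    }

    // Runs the kernel `iterations` times on a fresh copy of `input` and
    // returns the mean wall-clock time per iteration in nanoseconds.
    // The measured time includes the array copy that resets the input.
    public static double averageNanos(long[] input, int iterations) {
        long[] scratch = new long[input.length];
        long sink = 0; // accumulates results so the work cannot be discarded
        long start = System.nanoTime();
        for (int it = 0; it < iterations; it++) {
            System.arraycopy(input, 0, scratch, 0, input.length);
            prefixSum(scratch);
            if (scratch.length > 0) {
                sink += scratch[scratch.length - 1];
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never taken; keeps `sink` live
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        long[] input = new long[128]; // arbitrary size, chosen for illustration
        Arrays.fill(input, 3L);

        // Few iterations: the JIT may not have fully optimized the loop yet.
        double cold = averageNanos(input, 10);

        // Warm-up pass whose result is ignored, then the measured pass.
        averageNanos(input, 100_000);
        double warm = averageNanos(input, 100_000);

        System.out.printf("10 iterations, no warm-up: %.1f ns/iter%n", cold);
        System.out.printf("100k iterations after warm-up: %.1f ns/iter%n", warm);
    }
}
```

If the two numbers differ a lot, that would hint that the 10-iteration run mostly measures code that has not yet been fully optimized. Hand-rolled timing like this is noisy, so the numbers should be read as a rough indication only.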


was (Author: gworah):
Hey @gmiller, I noticed that in the micro benchmark code in your lucene-pfor-benchmark [repo|https://github.com/gsmiller/lucene-pfor-benchmark/blob/main/src/main/java/gsmiller/DecodeBenchmark.java#L15], the main loop runs 10 times I think?

Some [sources|http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/#Output%20Interpretation] suggest that usually the JIT compiler compiles and optimizes statements as and when it sees that a particular operation is repeated multiple times. So it first optimizes them a little and then some more iff it sees them again. So maybe we just need to repeat the experiment with say 100k iterations?

> Can PForUtil be further auto-vectorized?
> ----------------------------------------
>
>                 Key: LUCENE-9918
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9918
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>
> While working on LUCENE-9850, we discovered the loop in PForUtil::prefixSumOf is not getting auto-vectorized by the HotSpot compiler. We tried a few different tweaks to see if we could change this, but came up empty. There are some additional suggestions in the related [PR|https://github.com/apache/lucene/pull/69#discussion_r608412309] that could still be experimented with, and it may be worth doing so to see if further improvements could be squeezed out.



