Posted to issues@lucene.apache.org by "Gautam Worah (Jira)" <ji...@apache.org> on 2021/08/01 22:13:00 UTC
[jira] [Comment Edited] (LUCENE-9918) Can PForUtil be further auto-vectorized?
[ https://issues.apache.org/jira/browse/LUCENE-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391245#comment-17391245 ]
Gautam Worah edited comment on LUCENE-9918 at 8/1/21, 10:12 PM:
----------------------------------------------------------------
Hey [~gsmiller], I noticed that in the micro benchmark code in your lucene-pfor-benchmark [repo|https://github.com/gsmiller/lucene-pfor-benchmark/blob/main/src/main/java/gsmiller/DecodeBenchmark.java#L15], the main loop runs 10 times, I think?
Some [sources|http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/#Output%20Interpretation] suggest that the JIT compiler usually compiles and optimizes code as and when it sees a particular operation repeated multiple times: it first optimizes a little, and then some more if it sees the code again. So maybe we just need to repeat the experiment with, say, 100k iterations?
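To make the warm-up point concrete, here is a minimal sketch of timing the same workload after few versus many iterations. The class name WarmupSketch, the methods prefixSum and timeBatch, and the 128-element array are hypothetical stand-ins for illustration only; this is not the code from the benchmark repo, and the timings it prints depend on the JVM and the machine.

```java
import java.util.Arrays;

public class WarmupSketch {
    // Hypothetical stand-in workload: in-place prefix sum over a small array.
    static long prefixSum(long[] arr) {
        for (int i = 1; i < arr.length; i++) {
            arr[i] += arr[i - 1];
        }
        return arr[arr.length - 1];
    }

    // Runs the workload `iterations` times and returns the elapsed nanoseconds.
    static long timeBatch(int iterations) {
        long[] data = new long[128]; // assumed block size, for illustration only
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Arrays.fill(data, 1L);
            sink += prefixSum(data); // consume the result so the work is not dead code
        }
        long elapsed = System.nanoTime() - start;
        // The prefix sum of 128 ones ends in 128, so the checksum is known.
        if (sink != 128L * iterations) {
            throw new AssertionError("unexpected checksum: " + sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // A short run, comparable to looping only 10 times.
        long coldNanos = timeBatch(10);
        // A long run that gives the JIT a chance to recompile the hot loop.
        timeBatch(100_000);
        // Time again after the long run.
        long warmNanos = timeBatch(10);
        System.out.println("ns per call, first 10 calls: " + coldNanos / 10.0);
        System.out.println("ns per call, after 100k calls: " + warmNanos / 10.0);
    }
}
```

If the numbers for the two short batches differ a lot, that would hint that 10 iterations is too few for the compiler to reach its final optimization level. It would not by itself show whether the loop gets vectorized.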
was (Author: gworah):
Hey @gmiller, I noticed that in the micro benchmark code in your lucene-pfor-benchmark [repo|https://github.com/gsmiller/lucene-pfor-benchmark/blob/main/src/main/java/gsmiller/DecodeBenchmark.java#L15], the main loop runs 10 times I think?
Some [sources|http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/#Output%20Interpretation] suggest that usually the JIT compiler compiles and optimizes statements as and when it sees that a particular operation is repeated multiple times. So it first optimizes them a little and then some more iff it sees them again. So maybe we just need to repeat the experiment with say 100k iterations?
> Can PForUtil be further auto-vectorized?
> ----------------------------------------
>
> Key: LUCENE-9918
> URL: https://issues.apache.org/jira/browse/LUCENE-9918
> Project: Lucene - Core
> Issue Type: Task
> Components: core/codecs
> Affects Versions: main (9.0)
> Reporter: Greg Miller
> Priority: Minor
>
> While working on LUCENE-9850, we discovered the loop in PForUtil::prefixSumOf is not getting auto-vectorized by the HotSpot compiler. We tried a few different tweaks to see if we could change this, but came up empty. There are some additional suggestions in the related [PR|https://github.com/apache/lucene/pull/69#discussion_r608412309] that could still be experimented with, and it may be worth doing so to see if further improvements could be squeezed out.
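For readers unfamiliar with why a prefix-sum loop can resist auto-vectorization, the sketch below contrasts two loop shapes. It is an illustration under assumptions, not the actual PForUtil code: the class PrefixSumSketch and both method bodies are hypothetical, and the issue itself does not state which property of the real loop blocks vectorization. A loop-carried dependency is just one common reason.

```java
import java.util.Arrays;

public class PrefixSumSketch {
    // Each iteration reads the value the previous iteration just wrote
    // (a loop-carried dependency), so iterations cannot simply run in parallel lanes.
    static void prefixSum(long[] arr, long base) {
        arr[0] += base;
        for (int i = 1; i < arr.length; i++) {
            arr[i] += arr[i - 1];
        }
    }

    // Every iteration is independent of the others; this is the shape that
    // auto-vectorizers typically handle well.
    static void addConstant(long[] arr, long delta) {
        for (int i = 0; i < arr.length; i++) {
            arr[i] += delta;
        }
    }

    public static void main(String[] args) {
        long[] deltas = {1, 2, 3, 4};
        prefixSum(deltas, 10);
        System.out.println(Arrays.toString(deltas));

        long[] values = {1, 2, 3};
        addConstant(values, 5);
        System.out.println(Arrays.toString(values));
    }
}
```

Restructuring a dependent loop so that more of the work looks like the second method is the general kind of tweak such experiments try, though whether it helps here is exactly what the issue leaves open.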