Posted to commits@impala.apache.org by jo...@apache.org on 2023/07/12 19:30:58 UTC

[impala] 02/02: IMPALA-12122: use isb instead of yield on arm64

This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit a875eb9704a60433d0c591cc8a57af60524dc1d7
Author: Sebastian Pop <sp...@amazon.com>
AuthorDate: Tue May 9 07:53:51 2023 -0500

    IMPALA-12122: use isb instead of yield on arm64
    
    A "yield" instruction in aarch64 is essentially a nop, and does not
    cause enough delay to help backoff. "isb" is a barrier that, especially
    inside a loop, creates a small delay without consuming ALU resources.
    Experiments show that adding the isb instruction improves stability and
    reduces result jitter.
    Adding more delay than a single isb reduces performance.
    
    Change-Id: If14eaa8a4b445034d81bf68037e702e6d16b1181
    Reviewed-on: http://gerrit.cloudera.org:8080/19865
    Reviewed-by: Joe McDonnell <jo...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/gutil/yield_processor.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/be/src/gutil/yield_processor.h b/be/src/gutil/yield_processor.h
index f8e24174d..5f067a8e4 100644
--- a/be/src/gutil/yield_processor.h
+++ b/be/src/gutil/yield_processor.h
@@ -36,7 +36,12 @@ inline void PauseCPU() {
   // to run, not speculate memory access, etc.
   __asm__ __volatile__("pause" : : : "memory");
 #elif defined(__aarch64__)
-  __asm__ __volatile__("yield" : : : "memory");
+  // A "yield" instruction in aarch64 is essentially a nop, and does not cause
+  // enough delay to help backoff. "isb" is a barrier that, especially inside a
+  // loop, creates a small delay without consuming ALU resources. Experiments
+  // show that adding the isb instruction improves stability and reduces result
+  // jitter. Adding more delay than a single isb reduces performance.
+  __asm__ __volatile__("isb" : : : "memory");
 #else
   // PauseCPU is not defined for other architectures.
 #endif
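
For context, the sketch below shows how a pause hint like PauseCPU() is typically used inside a spin-wait loop, which is the "backoff" case the commit message refers to. It is a minimal illustration, not code from the Impala tree. The SpinLock class and CountWithSpinLock helper are hypothetical names. The x86 preprocessor condition is also an assumption, since the diff does not show the original guard.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Stand-in for the patched PauseCPU(). The x86 guard here is an assumption;
// the diff does not show the condition used in yield_processor.h.
inline void PauseCPU() {
#if defined(__x86_64__) || defined(__i386__)
  __asm__ __volatile__("pause" : : : "memory");
#elif defined(__aarch64__)
  // One isb per iteration, as in the patch.
  __asm__ __volatile__("isb" : : : "memory");
#else
  // No pause hint on other architectures.
#endif
}

// Hypothetical test-and-test-and-set spin lock, for illustration only.
class SpinLock {
 public:
  void Lock() {
    while (true) {
      // Try to take the lock; exchange returns the previous value.
      if (!locked_.exchange(true, std::memory_order_acquire)) return;
      // Lock is held: spin on a plain load and pause between polls so the
      // waiting core backs off instead of hammering the cache line.
      while (locked_.load(std::memory_order_relaxed)) PauseCPU();
    }
  }
  void Unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};

// Hypothetical helper: several threads increment a shared counter under the
// spin lock and the final count is returned.
long CountWithSpinLock(int num_threads, int iters_per_thread) {
  SpinLock lock;
  long counter = 0;
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([&lock, &counter, iters_per_thread] {
      for (int i = 0; i < iters_per_thread; ++i) {
        lock.Lock();
        ++counter;
        lock.Unlock();
      }
    });
  }
  for (auto& th : threads) th.join();
  return counter;
}
```

The pause sits in the inner read-only loop, so its cost is paid only while the lock is contended. How much delay is appropriate is hardware dependent; the commit message reports that more than a single isb reduced performance in the author's experiments.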