You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by db...@apache.org on 2023/08/04 12:18:15 UTC
[impala] branch master updated: IMPALA-12269: Codegen cache false negative because of function names hash

This is an automated email from the ASF dual-hosted git repository.

dbecker pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new 645abfc35 IMPALA-12269: Codegen cache false negative because of function names hash
645abfc35 is described below

commit 645abfc353795f0758e8641735c211794ac3847f
Author: Daniel Becker <da...@cloudera.com>
AuthorDate: Fri Jul 7 15:34:43 2023 +0200

    IMPALA-12269: Codegen cache false negative because of function names hash
    
    Codegen cache entries (execution engines holding an LLVM code module)
    are stored by keys derived from the unoptimised llvm modules: the key is
    either the whole unoptimised module (normal mode) or its hash (optimal
    mode). Because hash collisions are possible (in optimal mode), as an
    extra precaution we also compare the hashes of the function names in the
    current and the cached module. However, when assembling the function
    name list we do not filter out duplicate function names, which may
    result in cases where the unoptimised llvm modules are identical but the
    function name hashes do not match.
    
    Example:
    First query:
      select int_col, tinyint_col
      from alltypessmall
      order by int_col desc
      limit 20;
    
    Second query:
      select tinyint_col
      from alltypessmall
      order by int_col desc
      limit 20;
    
    In the first query, there are two 'SlotRef' objects referencing
    'tinyint_col' which want to codegen a 'GetSlotRef()' function. The
    second invokation of 'SlotRef::GetCodegendComputeFnImpl()' checks the
    already codegen'd functions, finds the function created by its first
    invokation and returns that. The two 'SlotRef' objects will use the same
    'llvm::Function' and there will be only one copy of it in the module,
    but both 'SlotRef's will call 'LlvmCodeGen::AddFunctionToJit()' with
    this function in order for their respective function pointers to be set
    after JIT-compilation.
    
    'LlvmCodeGen::GetAllFunctionNames()' will return the names of all
    functions with which 'LlvmCodeGen::AddFunctionToJit()' has been called,
    including duplicates.
    
    The second query generates the same unoptimised module as the first
    query (for the corresponding fragment), but does not have a duplicated
    'GetSlotRef()' function in its function name list, so the cached module
    is rejected.
    
    Note that this also results in the cached module being evicted when the
    new module from the second query is inserted into the cache because the
    new module will have the same key as the cached one (the modules are
    identical).
    
    This change fixes this problem by using a de-duplicated and sorted
    function name list.
    
    Testing:
      - Added a test in test_codegen_cache.py that asserts that there is a
        cache hit and no eviction in the above example.
    
    Change-Id: Ibf1d2b424c969fbba181ab90bf9c7bf22355f139
    Reviewed-on: http://gerrit.cloudera.org:8080/20168
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/codegen/llvm-codegen-cache-test.cc  |   2 +-
 be/src/codegen/llvm-codegen.cc             | 127 ++++++++++++++++-------------
 be/src/codegen/llvm-codegen.h              |  28 ++++++-
 tests/custom_cluster/test_codegen_cache.py |  29 +++++++
 4 files changed, 125 insertions(+), 61 deletions(-)

diff --git a/be/src/codegen/llvm-codegen-cache-test.cc b/be/src/codegen/llvm-codegen-cache-test.cc
index 2d3fbf7db..a07c51dda 100644
--- a/be/src/codegen/llvm-codegen-cache-test.cc
+++ b/be/src/codegen/llvm-codegen-cache-test.cc
@@ -365,7 +365,7 @@ void LlvmCodeGenCacheTest::TestSkipCache() {
   // Insert a new function to the codegen, and regenerate the function names hash
   // code, expect a failure because the hash code inconsistency with the code in
   // the cache.
-  codegen->fns_to_jit_compile_.emplace_back(empty_func, &fn_ptr);
+  codegen->AddFunctionToJitInternal(empty_func, &fn_ptr);
   codegen->GenerateFunctionNamesHashCode();
   // Expect a look up failure.
   EXPECT_FALSE(codegen->LookupCache(cache_key));
diff --git a/be/src/codegen/llvm-codegen.cc b/be/src/codegen/llvm-codegen.cc
index 89fa64933..b27033aba 100644
--- a/be/src/codegen/llvm-codegen.cc
+++ b/be/src/codegen/llvm-codegen.cc
@@ -1177,36 +1177,8 @@ bool LlvmCodeGen::LookupCache(CodeGenCacheKey& cache_key) {
       return false;
     }
 
-    // execution_engine_wrapper_cached_ is used to keep the life of the jitted functions
-    // in case the engine is evicted in the global cache.
-    DCHECK(execution_engine_wrapper_cached_ != nullptr);
-    llvm::ExecutionEngine* cached_execution_engine =
-        execution_engine_wrapper_cached_->execution_engine();
-    vector<void*> jitted_funcs;
-    // Get pointers to all codegen'd functions.
-    for (int i = 0; i < fns_to_jit_compile_.size(); ++i) {
-      llvm::Function* function = fns_to_jit_compile_[i].first;
-      DCHECK(function != nullptr);
-      // Using the function getFunctionAddress() with a non-existent function name
-      // would hit an assertion during the test, could be a bug in llvm 5, need to
-      // review after upgrade llvm. But because we already checked the names hashcode
-      // for key collision cases, we expect all the functions should be in the
-      // cached execution engine.
-      void* jitted_function = reinterpret_cast<void*>(
-          cached_execution_engine->getFunctionAddress(function->getName()));
-      if (jitted_function == nullptr) {
-        LOG(WARNING) << "Failed to get a jitted function from cache: "
-                     << function->getName().data()
-                     << " key hash_code=" << cache_key.hash_code();
-        cache->IncHitOrMissCount(/*hit*/ false);
-        return false;
-      }
-      jitted_funcs.emplace_back(jitted_function);
-    }
-    DCHECK_EQ(jitted_funcs.size(), fns_to_jit_compile_.size());
-    for (int i = 0; i < jitted_funcs.size(); i++) {
-      fns_to_jit_compile_[i].second->store(jitted_funcs[i]);
-    }
+    const bool fn_pointers_set_from_cache = SetFunctionPointers(cache, &cache_key);
+    if (!fn_pointers_set_from_cache) return false;
 
     // Because we cache the entire execution engine, the cached number of functions should
     // be the same as the total function number.
@@ -1221,11 +1193,11 @@ bool LlvmCodeGen::LookupCache(CodeGenCacheKey& cache_key) {
 string LlvmCodeGen::GetAllFunctionNames() {
   stringstream result;
   // The way to concat would be like "function1,function2".
-  const char separator = ',';
-  for (int i = 0; i < fns_to_jit_compile_.size(); ++i) {
-    llvm::Function* function = fns_to_jit_compile_[i].first;
-    DCHECK(function != nullptr);
-    result << function->getName().data() << separator;
+  // The function names are sorted in 'fns_to_jit_compile_'.
+  constexpr char separator = ',';
+  for (auto& entry : fns_to_jit_compile_) {
+    const llvm::StringRef& fn_name = entry.first;
+    result << fn_name.data() << separator;
   }
   return result.str();
 }
@@ -1254,19 +1226,14 @@ void LlvmCodeGen::PruneModule() {
   // global dead code elimination pass. This causes all functions not registered to be
   // JIT'd to be marked as internal, and any internal functions that are not used are
   // deleted by DCE pass. This greatly decreases compile time by removing unused code.
-  unordered_set<string> exported_fn_names;
-  for (auto& entry : fns_to_jit_compile_) {
-    exported_fn_names.insert(entry.first->getName().str());
-  }
-
   llvm::ModuleAnalysisManager module_analysis_manager;
   llvm::PassBuilder pass_builder(execution_engine()->getTargetMachine());
   pass_builder.registerModuleAnalyses(module_analysis_manager);
 
   llvm::ModulePassManager module_pass_manager;
   module_pass_manager.addPass(
-      llvm::InternalizePass([&exported_fn_names](const llvm::GlobalValue& gv) {
-        return exported_fn_names.find(gv.getName().str()) != exported_fn_names.end();
+      llvm::InternalizePass([this](const llvm::GlobalValue& gv) {
+        return fns_to_jit_compile_.count(gv.getName()) > 0;
       }));
   module_pass_manager.addPass(llvm::GlobalDCEPass());
   module_pass_manager.run(*module_, module_analysis_manager);
@@ -1451,10 +1418,12 @@ Status LlvmCodeGen::OptimizeModule() {
   // Create and run module pass manager
   pass_manager.run(*module_, MAM);
   if (FLAGS_print_llvm_ir_instruction_count) {
-    for (int i = 0; i < fns_to_jit_compile_.size(); ++i) {
+    for (auto& entry : fns_to_jit_compile_) {
       InstructionCounter counter;
-      counter.visit(*fns_to_jit_compile_[i].first);
-      VLOG(1) << fns_to_jit_compile_[i].first->getName().str();
+      llvm::Function* llvm_function = entry.second.first;
+      const llvm::StringRef& llvm_function_name = entry.first;
+      counter.visit(*llvm_function);
+      VLOG(1) << llvm_function_name.data();
       VLOG(1) << counter.PrintCounters();
     }
   }
@@ -1463,15 +1432,51 @@ Status LlvmCodeGen::OptimizeModule() {
   return Status::OK();
 }
 
-void LlvmCodeGen::SetFunctionPointers() {
+bool LlvmCodeGen::SetFunctionPointers(CodeGenCache* cache,
+    const CodeGenCacheKey* cache_key) {
   // Get pointers to all codegen'd functions.
-  for (const std::pair<llvm::Function*, CodegenFnPtrBase*>& fn_pair
-      : fns_to_jit_compile_) {
-    llvm::Function* function = fn_pair.first;
-    void* jitted_function = execution_engine()->getPointerToFunction(function);
-    DCHECK(jitted_function != nullptr) << "Failed to jit " << function->getName().data();
-    fn_pair.second->store(jitted_function);
+  for (auto& entry : fns_to_jit_compile_) {
+    const llvm::StringRef& function_name = entry.first;
+
+    LlvmFunctionWithFnPtrTargets& fn_with_targets = entry.second;
+    llvm::Function* function = fn_with_targets.first;
+    std::vector<CodegenFnPtrBase*>& jitted_fn_ptrs = fn_with_targets.second;
+
+    void* jitted_function = nullptr;
+    if (cache != nullptr) {
+      DCHECK(cache_key != nullptr);
+      // execution_engine_wrapper_cached_ is used to keep the life of the jitted functions
+      // in case the engine is evicted in the global cache.
+      DCHECK(execution_engine_wrapper_cached_ != nullptr);
+      llvm::ExecutionEngine* cached_execution_engine =
+          execution_engine_wrapper_cached_->execution_engine();
+
+      // Using the function getFunctionAddress() with a non-existent function name would
+      // hit an assertion during the test, could be a bug in llvm 5, need to review after
+      // upgrade llvm. But because we already checked the names hashcode for key collision
+      // cases, we expect all the functions should be in the cached execution engine.
+      jitted_function = reinterpret_cast<void*>(
+          cached_execution_engine->getFunctionAddress(function_name));
+      if (jitted_function == nullptr) {
+        LOG(WARNING) << "Failed to get a jitted function from cache: "
+                     << function_name.data()
+                     << " key hash_code=" << cache_key->hash_code();
+        cache->IncHitOrMissCount(/*hit*/ false);
+        return false;
+      }
+    } else {
+      DCHECK(cache_key == nullptr);
+      jitted_function = execution_engine()->getPointerToFunction(function);
+      DCHECK(jitted_function != nullptr) << "Failed to jit " << function_name.data();
+    }
+
+    DCHECK(jitted_function != nullptr);
+    for (CodegenFnPtrBase* jitted_fn_ptr : jitted_fn_ptrs) {
+      jitted_fn_ptr->store(jitted_function);
+    }
   }
+
+  return true;
 }
 
 void LlvmCodeGen::DestroyModule() {
@@ -1490,6 +1495,7 @@ void LlvmCodeGen::DestroyModule() {
 void LlvmCodeGen::AddFunctionToJit(llvm::Function* fn, CodegenFnPtrBase* fn_ptr) {
   DCHECK(finalized_functions_.find(fn) != finalized_functions_.end())
       << "Attempted to add a non-finalized function to Jit: " << fn->getName().str();
+  DCHECK(!is_compiled_);
   llvm::Type* decimal_val_type = GetNamedType(CodegenAnyVal::LLVM_DECIMALVAL_NAME);
   if (fn->getReturnType() == decimal_val_type) {
     // Per the x86 calling convention ABI, DecimalVals should be returned via an extra
@@ -1526,8 +1532,16 @@ void LlvmCodeGen::AddFunctionToJit(llvm::Function* fn, CodegenFnPtrBase* fn_ptr)
 }
 
 void LlvmCodeGen::AddFunctionToJitInternal(llvm::Function* fn, CodegenFnPtrBase* fn_ptr) {
-  DCHECK(!is_compiled_);
-  fns_to_jit_compile_.push_back(make_pair(fn, fn_ptr));
+  DCHECK(fn != nullptr);
+  DCHECK(fn_ptr != nullptr);
+  const llvm::StringRef& fn_name = fn->getName();
+
+  auto it = fns_to_jit_compile_.find(fn_name);
+  if (it == fns_to_jit_compile_.end()) {
+    fns_to_jit_compile_[fn_name] = make_pair(fn, vector<CodegenFnPtrBase*>{fn_ptr});
+  } else {
+    it->second.second.push_back(fn_ptr);
+  }
 }
 
 void LlvmCodeGen::CodegenDebugTrace(
@@ -2013,8 +2027,9 @@ string LlvmCodeGen::DebugCacheEntryString(CodeGenCacheKey& key, bool is_lookup,
     out << "\nFragment Plan: " << apache::thrift::ThriftDebugString(state_->fragment())
         << "\n";
     out << "CodeGen Functions: \n";
-    for (int i = 0; i < fns_to_jit_compile_.size(); ++i) {
-      out << "  " << fns_to_jit_compile_[i].first->getName().data() << "\n";
+    for (auto& entry : fns_to_jit_compile_) {
+      const llvm::StringRef& fn_name = entry.first;
+      out << "  " << fn_name.data() << "\n";
     }
   }
   return out.str();
diff --git a/be/src/codegen/llvm-codegen.h b/be/src/codegen/llvm-codegen.h
index b0ca8d828..5d05e4857 100644
--- a/be/src/codegen/llvm-codegen.h
+++ b/be/src/codegen/llvm-codegen.h
@@ -92,6 +92,7 @@ class ImpalaMCJITMemoryManager;
 class SubExprElimination;
 class Thread;
 class TupleDescriptor;
+class CodeGenCache;
 class CodeGenCacheKey;
 
 /// Define builder subclass in case we want to change the template arguments later
@@ -734,8 +735,12 @@ class LlvmCodeGen {
   /// Optimizes the module. This includes pruning the module of any unused functions.
   Status OptimizeModule();
 
-  /// Points the function pointers in 'fns_to_jit_compile_' to the compiled functions.
-  void SetFunctionPointers();
+  /// Points the function pointers in 'fns_to_jit_compile_' to the compiled functions. If
+  /// 'cache' and 'cache_key' are non-NULL, retrieves the functions from the cached
+  /// execution engine, otherwise from the current execution engine.
+  /// Note: either both or none of 'cache' and 'cache_key' should be NULL.
+  bool SetFunctionPointers(CodeGenCache* cache = nullptr,
+      const CodeGenCacheKey* cache_key = nullptr);
 
   /// Clears generated hash fns.  This is only used for testing.
   void ClearHashFns();
@@ -947,8 +952,23 @@ class LlvmCodeGen {
   /// Used to avoid linking the same module twice, which causes symbol collision errors.
   std::set<std::string> linked_modules_;
 
-  /// The vector of functions to automatically JIT compile after FinalizeModule().
-  std::vector<std::pair<llvm::Function*, CodegenFnPtrBase*>> fns_to_jit_compile_;
+  /// Stores the functions to automatically JIT compile after FinalizeModule(). The
+  /// 'CodegenFnPtrBase*' function pointers will be set to the functions compiled from the
+  /// corresponding 'llvm::Function' objects.
+  ///
+  /// The functions are stored in a sorted map where the keys are the function names.
+  /// This is because we need the function names in GetAllFunctionNames() (in sorted
+  /// order) and in PruneModule() (in any order).
+  ///
+  /// There is a one-to-one correspondence between function names and 'llvm::Function'
+  /// objects but an 'llvm::Function' object may correspond to multiple
+  /// 'CodegenFnPtrBase*'s, for example if multiple 'SlotRef' expressions refer to the
+  /// same slot and the 'llvm::Function' is reused. In these cases the function pointers
+  /// corresponding to a single 'llvm::Function' are owned by different objects but they
+  /// will be set to the same value.
+  using LlvmFunctionWithFnPtrTargets =
+      std::pair<llvm::Function*, std::vector<CodegenFnPtrBase*>>;
+  std::map<llvm::StringRef, LlvmFunctionWithFnPtrTargets> fns_to_jit_compile_;
 
   /// The hash code generated from all the function names in fns_to_jit_compile_.
   /// Used by the codegen cache only.
diff --git a/tests/custom_cluster/test_codegen_cache.py b/tests/custom_cluster/test_codegen_cache.py
index 19cea803a..c0cafa5f8 100644
--- a/tests/custom_cluster/test_codegen_cache.py
+++ b/tests/custom_cluster/test_codegen_cache.py
@@ -193,6 +193,35 @@ class TestCodegenCache(CustomClusterTestSuite):
     assert self.get_metric('impala.codegen-cache.entries-evicted') >= cache_entries_in_use
     assert self.get_metric('impala.codegen-cache.hits') == 0
 
+  @pytest.mark.execute_serially
+  @CustomClusterTestSuite.with_args(cluster_size=1,
+          impalad_args="--codegen_cache_capacity=1GB")
+  # Regression test for IMPALA-12269. The first query uses one of the codegen'd functions
+  # in two objects, so it is added to be jitted twice. For the second query it is added
+  # only once. The hash of the function names should be the same in both cases.
+  def test_codegen_cache_with_duplicate_fn_names(self, vector):
+    exec_options = copy(vector.get_value('exec_option'))
+    exec_options['exec_single_node_rows_threshold'] = 0
+
+    q1 = """select int_col, tinyint_col from functional_parquet.alltypessmall
+        order by int_col desc limit 20"""
+    q2 = """select tinyint_col from functional_parquet.alltypessmall
+        order by int_col desc limit 20"""
+
+    self._check_metric_expect_init()
+    self.execute_query_expect_success(self.client, q1, exec_options)
+    assert self.get_metric('impala.codegen-cache.entries-evicted') == 0
+
+    self.execute_query_expect_success(self.client, q2, exec_options)
+    # If the function name hashes of the first and the second query didn't match, there
+    # would be no cache hit and the cache entry from the first query would be evicted
+    # because the llvm modules of the two queries, hence the cache keys, are identical.
+    assert self.get_metric('impala.codegen-cache.entries-evicted') == 0
+    assert self.get_metric('impala.codegen-cache.hits') == 1
+    # Expect two misses for the two fragments of the first query and one for one of the
+    # fragments of the second query.
+    assert self.get_metric('impala.codegen-cache.misses') == 3
+
   def _check_metric_expect_init(self):
     # Verifies that the cache metrics are all zero.
     assert self.get_metric('impala.codegen-cache.entries-evicted') == 0