Posted to commits@impala.apache.org by mi...@apache.org on 2023/12/01 18:47:35 UTC

(impala) branch master updated (9011b81af -> 0d2177650)

This is an automated email from the ASF dual-hosted git repository.

michaelsmith pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


    from 9011b81af IMPALA-12486: Add catalog metrics for metadata loading
     new 465cc7acf IMPALA-11542: Import LLVM SectionMemoryManager for fixes
     new 67e4ff67c IMPALA-11542: Implement pre-allocation in LLVM memory manager
     new 0d2177650 IMPALA-12563: Fix UBSAN on ARM

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CMakeLists.txt                                  |   9 +
 LICENSE.txt                                     |  45 ++++
 be/src/codegen/CMakeLists.txt                   |   3 +
 be/src/codegen/mcjit-mem-mgr.h                  |   8 +-
 be/src/thirdparty/llvm/LICENSE.TXT              |  68 +++++
 be/src/thirdparty/llvm/SectionMemoryManager.cpp | 321 ++++++++++++++++++++++++
 be/src/thirdparty/llvm/SectionMemoryManager.h   | 149 +++++++++++
 bin/rat_exclude_files.txt                       |   1 +
 bin/run-all-tests.sh                            |   9 +
 9 files changed, 609 insertions(+), 4 deletions(-)
 create mode 100644 be/src/thirdparty/llvm/LICENSE.TXT
 create mode 100644 be/src/thirdparty/llvm/SectionMemoryManager.cpp
 create mode 100644 be/src/thirdparty/llvm/SectionMemoryManager.h


(impala) 03/03: IMPALA-12563: Fix UBSAN on ARM

Posted by mi...@apache.org.

michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 0d21776502538ca2ea861825f7168daa60a1e0d4
Author: Michael Smith <mi...@cloudera.com>
AuthorDate: Tue Nov 21 01:00:47 2023 +0000

    IMPALA-12563: Fix UBSAN on ARM
    
    Links libgcc after all other libraries when building with UBSAN on ARM.
    On ARM, several required symbols aren't present in libclang_rt (enabled
    by -rtlib=compiler-rt for UBSAN builds) or in libgcc_s.so (needed with
    the alternate rtlib); linking libgcc.a after all other libraries ensures
    the symbols are present. There may be other repercussions, so this is
    only done for UBSAN builds.
    
    Skips FE tests with UBSAN on ARM because they fail with "cannot allocate
    memory in static TLS block": ARM's heavier use of thread-local storage
    exceeds some implementation-defined limit.
    Setting '-XX:ThreadStackSize=16m' didn't help.
    
    Change-Id: I799bedd1cc73c852b0edb928dc71166e534918ba
    Reviewed-on: http://gerrit.cloudera.org:8080/20721
    Reviewed-by: Michael Smith <mi...@cloudera.com>
    Tested-by: Michael Smith <mi...@cloudera.com>
---
 CMakeLists.txt       | 9 +++++++++
 bin/run-all-tests.sh | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 755d31bc8..b61982d7a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -196,6 +196,15 @@ function(IMPALA_ADD_THIRDPARTY_LIB NAME HEADER STATIC_LIB SHARED_LIB)
     ADD_THIRDPARTY_LIB(${NAME} SHARED_LIB ${SHARED_LIB})
   else()
     ADD_THIRDPARTY_LIB(${NAME} STATIC_LIB ${STATIC_LIB})
+    if (CMAKE_SYSTEM_PROCESSOR STREQUAL "aarch64")
+      if ("${CMAKE_BUILD_TYPE}" STREQUAL "UBSAN" OR
+          "${CMAKE_BUILD_TYPE}" STREQUAL "UBSAN_FULL")
+        # UBSAN builds on ARM require that gcc is included last to cover several symbols
+        # omitted in libgcc_s, which is required because we use -rtlib=compiler-rt to
+        # work around https://bugs.llvm.org/show_bug.cgi?id=16404.
+        target_link_libraries(${NAME} INTERFACE gcc)
+      endif()
+    endif()
   endif()
 endfunction()
 
diff --git a/bin/run-all-tests.sh b/bin/run-all-tests.sh
index b6472e5e6..1a06ca9b2 100755
--- a/bin/run-all-tests.sh
+++ b/bin/run-all-tests.sh
@@ -116,6 +116,15 @@ if [[ "${ERASURE_CODING}" = true ]]; then
   FE_TEST=false
 fi
 
+if test -v CMAKE_BUILD_TYPE && [[ "${CMAKE_BUILD_TYPE}" =~ 'UBSAN' ]] \
+    && [[ "$(uname -p)" = "aarch64" ]]; then
+  # FE tests fail on ARM with
+  #   libfesupport.so: cannot allocate memory in static TLS block
+  # https://bugzilla.redhat.com/show_bug.cgi?id=1722181 mentions this is more likely
+  # on aarch64 due to how it uses thread-local storage (TLS). There's no clear fix.
+  FE_TEST=false
+fi
+
 # Indicates whether code coverage reports should be generated.
 : ${CODE_COVERAGE:=false}
 


(impala) 02/03: IMPALA-11542: Implement pre-allocation in LLVM memory manager

Posted by mi...@apache.org.

michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 67e4ff67cff6d6a77bfca43ef0c06b91f7a6661e
Author: Michael Smith <mi...@cloudera.com>
AuthorDate: Thu Nov 9 10:09:18 2023 -0800

    IMPALA-11542: Implement pre-allocation in LLVM memory manager
    
    Implements up-front allocation for the LLVM memory manager so that
    sections are not scattered across the address space on ARM, where the
    distance between them can exceed the 4GB range of ADRP instructions and
    cause a crash (or an assertion failure in debug mode).
    
    This is an LLVM issue that we're fixing by providing a custom memory
    manager. See the JIRA and
    https://discourse.llvm.org/t/llvm-rtdyld-aarch64-abi-relocation-restrictions/74616
    for further discussion.
    
    Testing:
    - passes debug test run on ARM
    - pre-commit tests
    
    Change-Id: I9f224edcdbdcb05fce663c18b4a5f03c8e985675
    Reviewed-on: http://gerrit.cloudera.org:8080/20692
    Tested-by: Michael Smith <mi...@cloudera.com>
    Reviewed-by: Joe McDonnell <jo...@cloudera.com>
---
 be/src/thirdparty/llvm/SectionMemoryManager.cpp | 105 ++++++++++++++++++++++++
 be/src/thirdparty/llvm/SectionMemoryManager.h   |  17 ++++
 2 files changed, 122 insertions(+)

diff --git a/be/src/thirdparty/llvm/SectionMemoryManager.cpp b/be/src/thirdparty/llvm/SectionMemoryManager.cpp
index a391a83b3..5964275ad 100644
--- a/be/src/thirdparty/llvm/SectionMemoryManager.cpp
+++ b/be/src/thirdparty/llvm/SectionMemoryManager.cpp
@@ -11,13 +11,118 @@
 // execution engine and RuntimeDyld
 //
 //===----------------------------------------------------------------------===//
+// Impala: Copied from the LLVM project to customize private portions of the
+// implementation.
 
 #include "SectionMemoryManager.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/Process.h"
 
+#include "common/logging.h"
+
 namespace impala {
 
+// ---- Impala: llvm/llvm-project#71968 ----
+bool SectionMemoryManager::hasSpace(const MemoryGroup &MemGroup,
+                                    uintptr_t Size) const {
+  for (const FreeMemBlock &FreeMB : MemGroup.FreeMem) {
+    if (FreeMB.Free.size() >= Size)
+      return true;
+  }
+  return false;
+}
+
+static uintptr_t alignTo(uintptr_t Size, uint32_t Alignment) {
+  return (Size + Alignment - 1) & ~(uintptr_t)(Alignment - 1);
+}
+
+static uint32_t checkAlignment(uint32_t Alignment, unsigned PageSize) {
+  DCHECK_GT(Alignment, 0);
+  DCHECK(!(Alignment & (Alignment - 1))) << "Alignment must be a power of two.";
+  DCHECK_LT(Alignment, PageSize);
+  // Code alignment needs to be at least the stub alignment - however, we
+  // don't have an easy way to get that here so as a workaround, we assume
+  // it's 8, which is the largest value I observed across all platforms.
+  constexpr uint32_t StubAlign = 8;
+  return std::max(Alignment, StubAlign);
+}
+
+void SectionMemoryManager::reserveAllocationSpace(
+    uintptr_t CodeSize, uint32_t CodeAlign, uintptr_t RODataSize, uint32_t RODataAlign,
+    uintptr_t RWDataSize, uint32_t RWDataAlign) {
+  if (CodeSize == 0 && RODataSize == 0 && RWDataSize == 0) return;
+
+  static const unsigned PageSize = sys::Process::getPageSize();
+
+  CodeAlign = checkAlignment(CodeAlign, PageSize);
+  RODataAlign = checkAlignment(RODataAlign, PageSize);
+  RWDataAlign = checkAlignment(RWDataAlign, PageSize);
+
+  // Get space required for each section. Use the same calculation as
+  // allocateSection because we need to be able to satisfy it.
+  uintptr_t RequiredCodeSize = alignTo(CodeSize, CodeAlign) + CodeAlign;
+  uintptr_t RequiredRODataSize = alignTo(RODataSize, RODataAlign) + RODataAlign;
+  uintptr_t RequiredRWDataSize = alignTo(RWDataSize, RWDataAlign) + RWDataAlign;
+
+  if (hasSpace(CodeMem, RequiredCodeSize) &&
+      hasSpace(RODataMem, RequiredRODataSize) &&
+      hasSpace(RWDataMem, RequiredRWDataSize)) {
+    // Sufficient space in contiguous block already available.
+    return;
+  }
+
+  // MemoryManager does not have functions for releasing memory after it's
+  // allocated. Normally it tries to use any excess blocks that were allocated
+  // due to page alignment, but if we have insufficient free memory for the
+  // request this can lead to allocating disparate memory that can violate the
+  // ARM ABI. Clear free memory so only the new allocations are used, but do
+  // not release allocated memory as it may still be in-use.
+  CodeMem.FreeMem.clear();
+  RODataMem.FreeMem.clear();
+  RWDataMem.FreeMem.clear();
+
+  // Round up to the nearest page size. Blocks must be page-aligned.
+  RequiredCodeSize = alignTo(RequiredCodeSize, PageSize);
+  RequiredRODataSize = alignTo(RequiredRODataSize, PageSize);
+  RequiredRWDataSize = alignTo(RequiredRWDataSize, PageSize);
+  uintptr_t RequiredSize = RequiredCodeSize + RequiredRODataSize + RequiredRWDataSize;
+
+  std::error_code ec;
+  sys::MemoryBlock MB = sys::Memory::allocateMappedMemory(RequiredSize, nullptr,
+      sys::Memory::MF_READ | sys::Memory::MF_WRITE, ec);
+  if (ec) {
+    return;
+  }
+  // Request is page-aligned, so we should always get back exactly the request.
+  DCHECK_EQ(MB.size(), RequiredSize);
+  // CodeMem will arbitrarily own this MemoryBlock to handle cleanup.
+  CodeMem.AllocatedMem.push_back(MB);
+  uintptr_t Addr = (uintptr_t)MB.base();
+  FreeMemBlock FreeMB;
+  FreeMB.PendingPrefixIndex = (unsigned)-1;
+
+  if (CodeSize > 0) {
+    DCHECK_EQ(Addr, alignTo(Addr, CodeAlign));
+    FreeMB.Free = sys::MemoryBlock((void*)Addr, RequiredCodeSize);
+    CodeMem.FreeMem.push_back(FreeMB);
+    Addr += RequiredCodeSize;
+  }
+
+  if (RODataSize > 0) {
+    DCHECK_EQ(Addr, alignTo(Addr, RODataAlign));
+    FreeMB.Free = sys::MemoryBlock((void*)Addr, RequiredRODataSize);
+    RODataMem.FreeMem.push_back(FreeMB);
+    Addr += RequiredRODataSize;
+  }
+
+  if (RWDataSize > 0) {
+    DCHECK_EQ(Addr, alignTo(Addr, RWDataAlign));
+    FreeMB.Free = sys::MemoryBlock((void*)Addr, RequiredRWDataSize);
+    RWDataMem.FreeMem.push_back(FreeMB);
+  }
+}
+// ---- End Impala changes ----
+
 uint8_t *SectionMemoryManager::allocateDataSection(uintptr_t Size,
                                                    unsigned Alignment,
                                                    unsigned SectionID,
diff --git a/be/src/thirdparty/llvm/SectionMemoryManager.h b/be/src/thirdparty/llvm/SectionMemoryManager.h
index d8ce1fc66..23ba5de28 100644
--- a/be/src/thirdparty/llvm/SectionMemoryManager.h
+++ b/be/src/thirdparty/llvm/SectionMemoryManager.h
@@ -51,6 +51,20 @@ public:
   void operator=(const SectionMemoryManager&) = delete;
   ~SectionMemoryManager() override;
 
+  /// Impala: enable reserveAllocationSpace callback.
+  bool needsToReserveAllocationSpace() override { return true; }
+
+  /// Impala: Provides an option to reserveAllocationSpace and pre-allocate all
+  /// memory in a single block. This is required for ARM where ADRP instructions
+  /// have a limit of 4GB offsets. Large memory systems may allocate sections
+  /// further apart than this unless we pre-allocate.
+  ///
+  /// Should only be called once. Later calls might re-use free blocks rather
+  /// than allocating a new contiguous block.
+  void reserveAllocationSpace(uintptr_t CodeSize, uint32_t CodeAlign,
+      uintptr_t RODataSize, uint32_t RODataAlign,
+      uintptr_t RWDataSize, uint32_t RWDataAlign) override;
+
   /// \brief Allocates a memory block of (at least) the given size suitable for
   /// executable code.
   ///
@@ -122,6 +136,9 @@ private:
   std::error_code applyMemoryGroupPermissions(MemoryGroup &MemGroup,
                                               unsigned Permissions);
 
+  // Impala: added to identify possible MemoryBlock re-use
+  bool hasSpace(const MemoryGroup &MemGroup, uintptr_t Size) const;
+
   MemoryGroup CodeMem;
   MemoryGroup RWDataMem;
   MemoryGroup RODataMem;

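For readers following the sizing arithmetic in reserveAllocationSpace, here is a minimal standalone sketch of the calculation. It is not part of the commit: the Section struct and the names requiredSize, reservationSize and withinAdrpRange are hypothetical helpers introduced only for illustration, and the 4096-byte page size used in the example is an assumption (the patch queries the real page size at runtime). The range check reflects the A64 ADRP encoding, a signed 21-bit count of 4KB pages, which is where the roughly +/-4GB limit mentioned in the commit message comes from.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one section's requested size and alignment.
struct Section {
  uint64_t Size;
  uint64_t Align;  // must be a power of two
};

// Rounds Size up to a multiple of Alignment (same formula as the patch).
static uint64_t alignTo(uint64_t Size, uint64_t Alignment) {
  return (Size + Alignment - 1) & ~(Alignment - 1);
}

// Space allocateSection asks for: the aligned size plus one extra
// Alignment of slack so the start address can be aligned inside the block.
static uint64_t requiredSize(const Section &S) {
  return alignTo(S.Size, S.Align) + S.Align;
}

// Size of one contiguous reservation covering code, read-only data and
// read-write data, each rounded up to a whole number of pages.
static uint64_t reservationSize(const Section &Code, const Section &ROData,
                                const Section &RWData, uint64_t PageSize) {
  return alignTo(requiredSize(Code), PageSize) +
         alignTo(requiredSize(ROData), PageSize) +
         alignTo(requiredSize(RWData), PageSize);
}

// True if an ADRP at address From can reach the 4KB page containing To.
// ADRP encodes a signed 21-bit page delta, i.e. [-2^20, 2^20) pages.
static bool withinAdrpRange(uint64_t From, uint64_t To) {
  int64_t PageDelta =
      static_cast<int64_t>(To >> 12) - static_cast<int64_t>(From >> 12);
  return PageDelta >= -(INT64_C(1) << 20) && PageDelta < (INT64_C(1) << 20);
}
```

With a 5000-byte code section (alignment 16), 100 bytes of read-only data (alignment 8) and no read-write data, reservationSize returns four 4096-byte pages, so every section sits well within ADRP range of the others. Two independently mapped regions that land 5GB apart would fail the withinAdrpRange check, which is the situation the single reservation is meant to avoid.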

(impala) 01/03: IMPALA-11542: Import LLVM SectionMemoryManager for fixes

Posted by mi...@apache.org.

michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 465cc7acf7c524ba8957518f451563d5523b8053
Author: Michael Smith <mi...@cloudera.com>
AuthorDate: Fri Nov 10 07:52:54 2023 -0800

    IMPALA-11542: Import LLVM SectionMemoryManager for fixes
    
    Imports SectionMemoryManager so we can add some fixes that require
    modifying private code. This could be done as a patch on LLVM, but the
    memory manager seems like something we should own and may need to
    customize in other ways.
    
    Changes namespace and adds using statements to make it compile. Uses
    LLVM 5.0.1 with commit 787614d08bda40daf3e168bd46a8c2a86319ec63 added as
    a small security fix.
    
    Change-Id: I8917005094903ed0ece25e40eb445abb159b569b
    Reviewed-on: http://gerrit.cloudera.org:8080/20696
    Reviewed-by: Michael Smith <mi...@cloudera.com>
    Tested-by: Michael Smith <mi...@cloudera.com>
---
 LICENSE.txt                                     |  45 +++++
 be/src/codegen/CMakeLists.txt                   |   3 +
 be/src/codegen/mcjit-mem-mgr.h                  |   8 +-
 be/src/thirdparty/llvm/LICENSE.TXT              |  68 ++++++++
 be/src/thirdparty/llvm/SectionMemoryManager.cpp | 216 ++++++++++++++++++++++++
 be/src/thirdparty/llvm/SectionMemoryManager.h   | 132 +++++++++++++++
 bin/rat_exclude_files.txt                       |   1 +
 7 files changed, 469 insertions(+), 4 deletions(-)

diff --git a/LICENSE.txt b/LICENSE.txt
index 977c1abce..665e9f375 100644
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -642,6 +642,51 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
 --------------------------------------------------------------------------------
 
+be/src/thirdparty/llvm: LLVM Release License
+
+University of Illinois/NCSA
+Open Source License
+
+Copyright (c) 2003-2017 University of Illinois at Urbana-Champaign.
+All rights reserved.
+
+Developed by:
+
+    LLVM Team
+
+    University of Illinois at Urbana-Champaign
+
+    http://llvm.org
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal with
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
+of the Software, and to permit persons to whom the Software is furnished to do
+so, subject to the following conditions:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimers.
+
+    * Redistributions in binary form must reproduce the above copyright notice,
+      this list of conditions and the following disclaimers in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the names of the LLVM Team, University of Illinois at
+      Urbana-Champaign, nor the names of its contributors may be used to
+      endorse or promote products derived from this Software without specific
+      prior written permission.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
+SOFTWARE.
+
+--------------------------------------------------------------------------------
+
 be/src/thirdparty/squeasel: MIT license
 
 Some portions Copyright (c) 2004-2013 Sergey Lyubka
diff --git a/be/src/codegen/CMakeLists.txt b/be/src/codegen/CMakeLists.txt
index ef89223f2..45f8ae0f1 100644
--- a/be/src/codegen/CMakeLists.txt
+++ b/be/src/codegen/CMakeLists.txt
@@ -16,6 +16,8 @@
 # under the License.
 
 
+set(THIRDPARTY_LLVM_SRC_DIR "${CMAKE_SOURCE_DIR}/be/src/thirdparty/llvm")
+
 # where to put generated libraries
 set(LIBRARY_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}/codegen")
 
@@ -37,6 +39,7 @@ add_library(CodeGen
   llvm-codegen.cc
   llvm-codegen-cache.cc
   instruction-counter.cc
+  ${THIRDPARTY_LLVM_SRC_DIR}/SectionMemoryManager.cpp
   ${IR_O1_C_FILE}
   ${IR_O2_C_FILE}
   ${IR_Os_C_FILE}
diff --git a/be/src/codegen/mcjit-mem-mgr.h b/be/src/codegen/mcjit-mem-mgr.h
index 163249e95..6002b6087 100644
--- a/be/src/codegen/mcjit-mem-mgr.h
+++ b/be/src/codegen/mcjit-mem-mgr.h
@@ -19,7 +19,7 @@
 #ifndef IMPALA_CODEGEN_MCJIT_MEM_MGR_H
 #define IMPALA_CODEGEN_MCJIT_MEM_MGR_H
 
-#include <llvm/ExecutionEngine/SectionMemoryManager.h>
+#include "thirdparty/llvm/SectionMemoryManager.h"
 
 extern void *__dso_handle __attribute__ ((__visibility__ ("hidden")));
 
@@ -34,7 +34,7 @@ namespace impala {
 /// which come from global variables with destructors.
 ///
 /// We also use it to track how much memory is allocated for compiled code.
-class ImpalaMCJITMemoryManager : public llvm::SectionMemoryManager {
+class ImpalaMCJITMemoryManager : public SectionMemoryManager {
  public:
   ImpalaMCJITMemoryManager() : bytes_allocated_(0), bytes_tracked_(0){}
 
@@ -46,14 +46,14 @@ class ImpalaMCJITMemoryManager : public llvm::SectionMemoryManager {
   virtual uint8_t* allocateCodeSection(uintptr_t size, unsigned alignment,
       unsigned section_id, llvm::StringRef section_name) override {
     bytes_allocated_ += size;
-    return llvm::SectionMemoryManager::allocateCodeSection(
+    return SectionMemoryManager::allocateCodeSection(
         size, alignment, section_id, section_name);
   }
 
   virtual uint8_t* allocateDataSection(uintptr_t size, unsigned alignment,
       unsigned section_id, llvm::StringRef section_name, bool is_read_only) override {
     bytes_allocated_ += size;
-    return llvm::SectionMemoryManager::allocateDataSection(
+    return SectionMemoryManager::allocateDataSection(
         size, alignment, section_id, section_name, is_read_only);
   }
 
diff --git a/be/src/thirdparty/llvm/LICENSE.TXT b/be/src/thirdparty/llvm/LICENSE.TXT
new file mode 100644
index 000000000..ff63f2b6a
--- /dev/null
+++ b/be/src/thirdparty/llvm/LICENSE.TXT
@@ -0,0 +1,68 @@
+==============================================================================
+LLVM Release License
+==============================================================================
+University of Illinois/NCSA
+Open Source License
+
+Copyright (c) 2003-2017 University of Illinois at Urbana-Champaign.
+All rights reserved.
+
+Developed by:
+
+    LLVM Team
+
+    University of Illinois at Urbana-Champaign
+
+    http://llvm.org
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal with
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
+of the Software, and to permit persons to whom the Software is furnished to do
+so, subject to the following conditions:
+
+    * Redistributions of source code must retain the above copyright notice,
+      this list of conditions and the following disclaimers.
+
+    * Redistributions in binary form must reproduce the above copyright notice,
+      this list of conditions and the following disclaimers in the
+      documentation and/or other materials provided with the distribution.
+
+    * Neither the names of the LLVM Team, University of Illinois at
+      Urbana-Champaign, nor the names of its contributors may be used to
+      endorse or promote products derived from this Software without specific
+      prior written permission.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
+CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE
+SOFTWARE.
+
+==============================================================================
+Copyrights and Licenses for Third Party Software Distributed with LLVM:
+==============================================================================
+The LLVM software contains code written by third parties.  Such software will
+have its own individual LICENSE.TXT file in the directory in which it appears.
+This file will describe the copyrights, license, and restrictions which apply
+to that code.
+
+The disclaimer of warranty in the University of Illinois Open Source License
+applies to all code in the LLVM Distribution, and nothing in any of the
+other licenses gives permission to use the names of the LLVM Team or the
+University of Illinois to endorse or promote products derived from this
+Software.
+
+The following pieces of software have additional or alternate copyrights,
+licenses, and/or restrictions:
+
+Program             Directory
+-------             ---------
+Google Test         llvm/utils/unittest/googletest
+OpenBSD regex       llvm/lib/Support/{reg*, COPYRIGHT.regex}
+pyyaml tests        llvm/test/YAMLParser/{*.data, LICENSE.TXT}
+ARM contributions   llvm/lib/Target/ARM/LICENSE.TXT
+md5 contributions   llvm/lib/Support/MD5.cpp llvm/include/llvm/Support/MD5.h
diff --git a/be/src/thirdparty/llvm/SectionMemoryManager.cpp b/be/src/thirdparty/llvm/SectionMemoryManager.cpp
new file mode 100644
index 000000000..a391a83b3
--- /dev/null
+++ b/be/src/thirdparty/llvm/SectionMemoryManager.cpp
@@ -0,0 +1,216 @@
+//===- SectionMemoryManager.cpp - Memory manager for MCJIT/RtDyld *- C++ -*-==//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements the section-based memory manager used by the MCJIT
+// execution engine and RuntimeDyld
+//
+//===----------------------------------------------------------------------===//
+
+#include "SectionMemoryManager.h"
+#include "llvm/Support/MathExtras.h"
+#include "llvm/Support/Process.h"
+
+namespace impala {
+
+uint8_t *SectionMemoryManager::allocateDataSection(uintptr_t Size,
+                                                   unsigned Alignment,
+                                                   unsigned SectionID,
+                                                   StringRef SectionName,
+                                                   bool IsReadOnly) {
+  if (IsReadOnly)
+    return allocateSection(RODataMem, Size, Alignment);
+  return allocateSection(RWDataMem, Size, Alignment);
+}
+
+uint8_t *SectionMemoryManager::allocateCodeSection(uintptr_t Size,
+                                                   unsigned Alignment,
+                                                   unsigned SectionID,
+                                                   StringRef SectionName) {
+  return allocateSection(CodeMem, Size, Alignment);
+}
+
+uint8_t *SectionMemoryManager::allocateSection(MemoryGroup &MemGroup,
+                                               uintptr_t Size,
+                                               unsigned Alignment) {
+  if (!Alignment)
+    Alignment = 16;
+
+  assert(!(Alignment & (Alignment - 1)) && "Alignment must be a power of two.");
+
+  uintptr_t RequiredSize = Alignment * ((Size + Alignment - 1)/Alignment + 1);
+  uintptr_t Addr = 0;
+
+  // Look in the list of free memory regions and use a block there if one
+  // is available.
+  for (FreeMemBlock &FreeMB : MemGroup.FreeMem) {
+    if (FreeMB.Free.size() >= RequiredSize) {
+      Addr = (uintptr_t)FreeMB.Free.base();
+      uintptr_t EndOfBlock = Addr + FreeMB.Free.size();
+      // Align the address.
+      Addr = (Addr + Alignment - 1) & ~(uintptr_t)(Alignment - 1);
+
+      if (FreeMB.PendingPrefixIndex == (unsigned)-1) {
+        // The part of the block we're giving out to the user is now pending
+        MemGroup.PendingMem.push_back(sys::MemoryBlock((void *)Addr, Size));
+
+        // Remember this pending block, such that future allocations can just
+        // modify it rather than creating a new one
+        FreeMB.PendingPrefixIndex = MemGroup.PendingMem.size() - 1;
+      } else {
+        sys::MemoryBlock &PendingMB = MemGroup.PendingMem[FreeMB.PendingPrefixIndex];
+        PendingMB = sys::MemoryBlock(PendingMB.base(), Addr + Size - (uintptr_t)PendingMB.base());
+      }
+
+      // Remember how much free space is now left in this block
+      FreeMB.Free = sys::MemoryBlock((void *)(Addr + Size), EndOfBlock - Addr - Size);
+      return (uint8_t*)Addr;
+    }
+  }
+
+  // No pre-allocated free block was large enough. Allocate a new memory region.
+  // Note that all sections get allocated as read-write.  The permissions will
+  // be updated later based on memory group.
+  //
+  // FIXME: It would be useful to define a default allocation size (or add
+  // it as a constructor parameter) to minimize the number of allocations.
+  //
+  // FIXME: Initialize the Near member for each memory group to avoid
+  // interleaving.
+  std::error_code ec;
+  sys::MemoryBlock MB = sys::Memory::allocateMappedMemory(RequiredSize,
+                                                          &MemGroup.Near,
+                                                          sys::Memory::MF_READ |
+                                                            sys::Memory::MF_WRITE,
+                                                          ec);
+  if (ec) {
+    // FIXME: Add error propagation to the interface.
+    return nullptr;
+  }
+
+  // Save this address as the basis for our next request
+  MemGroup.Near = MB;
+
+  // Remember that we allocated this memory
+  MemGroup.AllocatedMem.push_back(MB);
+  Addr = (uintptr_t)MB.base();
+  uintptr_t EndOfBlock = Addr + MB.size();
+
+  // Align the address.
+  Addr = (Addr + Alignment - 1) & ~(uintptr_t)(Alignment - 1);
+
+  // The part of the block we're giving out to the user is now pending
+  MemGroup.PendingMem.push_back(sys::MemoryBlock((void *)Addr, Size));
+
+  // The allocateMappedMemory may allocate much more memory than we need. In
+  // this case, we store the unused memory as a free memory block.
+  unsigned FreeSize = EndOfBlock-Addr-Size;
+  if (FreeSize > 16) {
+    FreeMemBlock FreeMB;
+    FreeMB.Free = sys::MemoryBlock((void*)(Addr + Size), FreeSize);
+    FreeMB.PendingPrefixIndex = (unsigned)-1;
+    MemGroup.FreeMem.push_back(FreeMB);
+  }
+
+  // Return aligned address
+  return (uint8_t*)Addr;
+}
+
+bool SectionMemoryManager::finalizeMemory(std::string *ErrMsg)
+{
+  // FIXME: Should in-progress permissions be reverted if an error occurs?
+  std::error_code ec;
+
+  // Make code memory executable.
+  ec = applyMemoryGroupPermissions(CodeMem,
+                                   sys::Memory::MF_READ | sys::Memory::MF_EXEC);
+  if (ec) {
+    if (ErrMsg) {
+      *ErrMsg = ec.message();
+    }
+    return true;
+  }
+
+  // Make read-only data memory read-only.
+  ec = applyMemoryGroupPermissions(RODataMem, sys::Memory::MF_READ);
+  if (ec) {
+    if (ErrMsg) {
+      *ErrMsg = ec.message();
+    }
+    return true;
+  }
+
+  // Read-write data memory already has the correct permissions
+
+  // Some platforms with separate data cache and instruction cache require
+  // explicit cache flush, otherwise JIT code manipulations (like resolved
+  // relocations) will get to the data cache but not to the instruction cache.
+  invalidateInstructionCache();
+
+  return false;
+}
+
+static sys::MemoryBlock trimBlockToPageSize(sys::MemoryBlock M) {
+  static const size_t PageSize = sys::Process::getPageSize();
+
+  size_t StartOverlap =
+      (PageSize - ((uintptr_t)M.base() % PageSize)) % PageSize;
+
+  size_t TrimmedSize = M.size();
+  TrimmedSize -= StartOverlap;
+  TrimmedSize -= TrimmedSize % PageSize;
+
+  sys::MemoryBlock Trimmed((void *)((uintptr_t)M.base() + StartOverlap), TrimmedSize);
+
+  assert(((uintptr_t)Trimmed.base() % PageSize) == 0);
+  assert((Trimmed.size() % PageSize) == 0);
+  assert(M.base() <= Trimmed.base() && Trimmed.size() <= M.size());
+
+  return Trimmed;
+}
+
+
+std::error_code
+SectionMemoryManager::applyMemoryGroupPermissions(MemoryGroup &MemGroup,
+                                                  unsigned Permissions) {
+  for (sys::MemoryBlock &MB : MemGroup.PendingMem)
+    if (std::error_code EC = sys::Memory::protectMappedMemory(MB, Permissions))
+      return EC;
+
+  MemGroup.PendingMem.clear();
+
+  // Now go through free blocks and trim any of them that don't span the entire
+  // page because one of the pending blocks may have overlapped it.
+  for (FreeMemBlock &FreeMB : MemGroup.FreeMem) {
+    FreeMB.Free = trimBlockToPageSize(FreeMB.Free);
+    // We cleared the PendingMem list, so all these pointers are now invalid
+    FreeMB.PendingPrefixIndex = (unsigned)-1;
+  }
+
+  // Remove all blocks which are now empty
+  MemGroup.FreeMem.erase(
+      remove_if(MemGroup.FreeMem,
+                [](FreeMemBlock &FreeMB) { return FreeMB.Free.size() == 0; }),
+      MemGroup.FreeMem.end());
+
+  return std::error_code();
+}
+
+void SectionMemoryManager::invalidateInstructionCache() {
+  for (sys::MemoryBlock &Block : CodeMem.PendingMem)
+    sys::Memory::InvalidateInstructionCache(Block.base(), Block.size());
+}
+
+SectionMemoryManager::~SectionMemoryManager() {
+  for (MemoryGroup *Group : {&CodeMem, &RWDataMem, &RODataMem}) {
+    for (sys::MemoryBlock &Block : Group->AllocatedMem)
+      sys::Memory::releaseMappedMemory(Block);
+  }
+}
+
+} // namespace impala
diff --git a/be/src/thirdparty/llvm/SectionMemoryManager.h b/be/src/thirdparty/llvm/SectionMemoryManager.h
new file mode 100644
index 000000000..d8ce1fc66
--- /dev/null
+++ b/be/src/thirdparty/llvm/SectionMemoryManager.h
@@ -0,0 +1,132 @@
+//===- SectionMemoryManager.h - Memory manager for MCJIT/RtDyld -*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains the declaration of a section-based memory manager used by
+// the MCJIT execution engine and RuntimeDyld.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_EXECUTIONENGINE_SECTIONMEMORYMANAGER_H
+#define LLVM_EXECUTIONENGINE_SECTIONMEMORYMANAGER_H
+
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
+#include "llvm/Support/Memory.h"
+#include <cstdint>
+#include <string>
+#include <system_error>
+
+// Impala: namespace changed to avoid symbol collisions with LLVM libraries.
+namespace sys = llvm::sys;
+using llvm::RTDyldMemoryManager;
+using llvm::SmallVector;
+using llvm::StringRef;
+
+namespace impala {
+
+/// This is a simple memory manager which implements the methods called by
+/// the RuntimeDyld class to allocate memory for section-based loading of
+/// objects, usually those generated by the MCJIT execution engine.
+///
+/// This memory manager allocates all section memory as read-write.  The
+/// RuntimeDyld will copy JITed section memory into these allocated blocks
+/// and perform any necessary linking and relocations.
+///
+/// Any client using this memory manager MUST ensure that section-specific
+/// page permissions have been applied before attempting to execute functions
+/// in the JITed object.  Permissions can be applied either by calling
+/// MCJIT::finalizeObject or by calling SectionMemoryManager::finalizeMemory
+/// directly.  Clients of MCJIT should call MCJIT::finalizeObject.
+class SectionMemoryManager : public RTDyldMemoryManager {
+public:
+  SectionMemoryManager() = default;
+  SectionMemoryManager(const SectionMemoryManager&) = delete;
+  void operator=(const SectionMemoryManager&) = delete;
+  ~SectionMemoryManager() override;
+
+  /// \brief Allocates a memory block of (at least) the given size suitable for
+  /// executable code.
+  ///
+  /// The value of \p Alignment must be a power of two.  If \p Alignment is zero
+  /// a default alignment of 16 will be used.
+  uint8_t *allocateCodeSection(uintptr_t Size, unsigned Alignment,
+                               unsigned SectionID,
+                               StringRef SectionName) override;
+
+  /// \brief Allocates a memory block of (at least) the given size suitable for
+  /// data.
+  ///
+  /// The value of \p Alignment must be a power of two.  If \p Alignment is zero
+  /// a default alignment of 16 will be used.
+  uint8_t *allocateDataSection(uintptr_t Size, unsigned Alignment,
+                               unsigned SectionID, StringRef SectionName,
+                               bool isReadOnly) override;
+
+  /// \brief Update section-specific memory permissions and other attributes.
+  ///
+  /// This method is called when object loading is complete and section page
+  /// permissions can be applied.  It is up to the memory manager implementation
+  /// to decide whether or not to act on this method.  The memory manager will
+  /// typically allocate all sections as read-write and then apply specific
+  /// permissions when this method is called.  Code sections cannot be executed
+  /// until this function has been called.  In addition, any cache coherency
+  /// operations needed to reliably use the memory are also performed.
+  ///
+  /// \returns true if an error occurred, false otherwise.
+  bool finalizeMemory(std::string *ErrMsg = nullptr) override;
+
+  /// \brief Invalidate instruction cache for code sections.
+  ///
+  /// Some platforms with separate data cache and instruction cache require
+  /// explicit cache flush, otherwise JIT code manipulations (like resolved
+  /// relocations) will get to the data cache but not to the instruction cache.
+  ///
+  /// This method is called from finalizeMemory.
+  virtual void invalidateInstructionCache();
+
+private:
+  struct FreeMemBlock {
+    // The actual block of free memory
+    sys::MemoryBlock Free;
+    // If there is a pending allocation from the same reservation right before
+    // this block, store its index in PendingMem, to be able to update the
+    // pending region if part of this block is allocated, rather than having to
+    // create a new one
+    unsigned PendingPrefixIndex;
+  };
+
+  struct MemoryGroup {
+    // PendingMem contains all blocks of memory (subblocks of AllocatedMem)
+    // which have not yet had their permissions applied, but have been given
+    // out to the user. FreeMem contains all blocks of memory, which have
+    // neither had their permissions applied, nor been given out to the user.
+    SmallVector<sys::MemoryBlock, 16> PendingMem;
+    SmallVector<FreeMemBlock, 16> FreeMem;
+
+    // All memory blocks that have been requested from the system
+    SmallVector<sys::MemoryBlock, 16> AllocatedMem;
+
+    sys::MemoryBlock Near;
+  };
+
+  uint8_t *allocateSection(MemoryGroup &MemGroup, uintptr_t Size,
+                           unsigned Alignment);
+
+  std::error_code applyMemoryGroupPermissions(MemoryGroup &MemGroup,
+                                              unsigned Permissions);
+
+  MemoryGroup CodeMem;
+  MemoryGroup RWDataMem;
+  MemoryGroup RODataMem;
+};
+
+} // end namespace impala
+
+#endif // LLVM_EXECUTIONENGINE_SECTIONMEMORYMANAGER_H
diff --git a/bin/rat_exclude_files.txt b/bin/rat_exclude_files.txt
index 8ba7c66e6..f6bb10f9f 100644
--- a/bin/rat_exclude_files.txt
+++ b/bin/rat_exclude_files.txt
@@ -33,6 +33,7 @@ bin/banned_py3k_warnings.txt
 # See $IMPALA_HOME/LICENSE.txt
 be/src/gutil/*
 be/src/thirdparty/datasketches/*
+be/src/thirdparty/llvm/*
 be/src/thirdparty/murmurhash/*
 be/src/thirdparty/mpfit/*
 be/src/thirdparty/fast_double_parser/*