You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by lv...@apache.org on 2019/07/21 23:36:22 UTC
[impala] 04/05: IMPALA-5031: out-of-range enum values are undefined behavior

This is an automated email from the ASF dual-hosted git repository.

lv pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit adaf344d48920c8a66a317c3720ca090f470fe5a
Author: Jim Apple <jb...@apache.org>
AuthorDate: Thu Jul 4 17:14:17 2019 -0700

    IMPALA-5031: out-of-range enum values are undefined behavior
    
    This patch fixes an out-of-range enum value in the end-to-end tests.
    
    The [expr] section of the C++14 standard indicates that out-of-range
    enum values are undefined behavior: "If during the evaluation of an
    expression, the result is not mathematically defined or not in the
    range of representable values for its type, the behavior is
    undefined."
    
    The [decl.enum] section explains what values are "in the range of
    representable values for its type": "[F]or an enumeration where emin
    is the smallest enumerator and emax is the largest, the values of the
    enumeration are the values in the range bmin to bmax, defined as
    follows: Let K be 1 for a two's complement representation and 0 for a
    one's complement or sign-magnitude representation. bmax is the
    smallest value greater than or equal to max(|emin| - K, |emax|) and
    equal to 2M-1, where M is a non-negative integer. bmin is zero if emin
    is non-negative and -(bmax+K) otherwise."
    
    The Parquet PageType enum has emin = 0 and emax = 3, so bmin = 0 and
    bmax = 3. The out-of-range value in the tests is 4294967249, and is
    therefore undefined behavior. The interesting part of the backtrace
    is:
    
    parquet/parquet-column-readers.cc:1269:24: runtime error: load of
      value 4294967249, which is not a valid value for type
      'PageType::type'
        #0 BaseScalarColumnReader::InitDictionary()
           parquet/parquet-column-readers.cc:1269:24
        #1 BaseScalarColumnReader::InitDictionaries(
           vector<BaseScalarColumnReader*>)
           parquet/parquet-column-readers.cc:1381:53
        #2 HdfsParquetScanner::NextRowGroup()
           parquet/hdfs-parquet-scanner.cc:678:14
        #3 HdfsParquetScanner::GetNextInternal(RowBatch*)
           parquet/hdfs-parquet-scanner.cc:437:45
        #4 HdfsParquetScanner::ProcessSplit()
           parquet/hdfs-parquet-scanner.cc:353:21
        #5 HdfsScanNode::ProcessSplit(vector<FilterContext> const&,
           MemPool*, io::ScanRange*, long*) exec/hdfs-scan-node.cc:514:21
        #6 HdfsScanNode::ScannerThread(bool, long)
           hdfs-scan-node.cc:415:7
        #7 HdfsScanNode::ThreadTokenAvailableCb(ThreadResourcePool*)
           ::$_0::operator()() const hdfs-scan-node.cc:337:13
    
    Change-Id: I2d126a8f3e5910d23088a3f916c4cf31aac28d95
    Reviewed-on: http://gerrit.cloudera.org:8080/13805
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/exec/parquet/parquet-column-readers.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/be/src/exec/parquet/parquet-column-readers.cc b/be/src/exec/parquet/parquet-column-readers.cc
index f115002..422c89f 100644
--- a/be/src/exec/parquet/parquet-column-readers.cc
+++ b/be/src/exec/parquet/parquet-column-readers.cc
@@ -44,6 +44,7 @@
 #include "util/dict-encoding.h"
 #include "util/pretty-printer.h"
 #include "util/rle-encoding.h"
+#include "util/ubsan.h"
 
 #include "common/names.h"
 
@@ -1266,7 +1267,9 @@ Status BaseScalarColumnReader::InitDictionary() {
   if (eos) return Status::OK();
   // The dictionary must be the first data page, so if the first page
   // is not a dictionary, then there is no dictionary.
-  if (next_page_header.type != parquet::PageType::DICTIONARY_PAGE) return Status::OK();
+  if (Ubsan::EnumToInt(&next_page_header.type) != parquet::PageType::DICTIONARY_PAGE) {
+    return Status::OK();
+  }
 
   current_page_header_ = next_page_header;
   Status status;