Posted to commits@impala.apache.org by ta...@apache.org on 2019/05/15 22:37:42 UTC
[impala] branch master updated: IMPALA-4356, IMPALA-7331: codegen all ScalarExprs

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new d4648e8  IMPALA-4356,IMPALA-7331: codegen all ScalarExprs
d4648e8 is described below

commit d4648e87b4bc0c05203f6af0f4448dc2017bda80
Author: Tim Armstrong <ta...@cloudera.com>
AuthorDate: Fri Mar 15 09:58:39 2019 -0700

    IMPALA-4356,IMPALA-7331: codegen all ScalarExprs
    
    Based on initial draft patch by Pooja Nilangekar.
    
    Codegen'd expressions can be executed in two ways - either by
    being called directly from a fully codegen'd function, or from
    interpreted code via a function pointer (previously
    ScalarFnCall::scalar_fn_wrapper_).
    
    This change moves the function pointer from ScalarFnCall to its
    base class ScalarExpr, so the full expr tree can be codegen'd, not
    just the ScalarFnCall subtrees. The key refactoring and improvements
    are:
    * ScalarExpr::Get*Val() switches between interpreted and the codegen'd
      function pointer code paths in an inline function, avoiding a
      virtual function call to ScalarFnCall::Get*Val().
    * Boilerplate logic is moved to ScalarExpr::GetCodegendComputeFn(),
      which calls a virtual function GetCodegenComputeFnImpl().
    * ScalarFnCall's logic for deciding whether to interpret or codegen is
      better abstracted and exposed to ScalarExpr as IsInterpretable()
      and ShouldCodegen() methods.
    * The ScalarExpr::codegend_compute_fn_ function pointer is only
      populated for expressions that are "codegen entry points". These
      include the roots of expr trees and non-root expressions
      where the parent expression calls Get*Val() from the
      pseudo-codegen'd GetCodegendComputeFnWrapper().
    * ScalarFnCall is always initialised for interpreted execution.
      Otherwise the function pointer is needed for non-root expressions,
      e.g. to support ScalarExprEvaluator::GetConstantVal().
    * Latent bugs/gaps for codegen of CollectionVal are fixed. CollectionVal
      is modified to use the StringVal memory layout to allow code sharing
      with StringVal. These fixes allowed simplification of
      IsNotEmptyPredicate codegen (from IMPALA-7657).
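
The dispatch described in the first bullet can be sketched roughly as
follows. This is a minimal illustration, not the code from this patch:
Expr, IntVal, ComputeFn, compiled_fn_, SetCompiledFn and
FakeCompiledFortyTwo are hypothetical, simplified stand-ins, and the
"compiled" function is an ordinary C++ function rather than LLVM output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; not the real Impala declarations.
struct IntVal {
  bool is_null;
  int32_t val;
};

class Expr;
// Signature of a compiled entry point: takes the expr and an opaque row.
using ComputeFn = IntVal (*)(const Expr*, const void* row);

class Expr {
 public:
  virtual ~Expr() = default;

  // Non-virtual and inline: the caller tests one pointer and only falls back
  // to the virtual interpreted path when no compiled function is installed.
  IntVal GetIntVal(const void* row) const {
    if (compiled_fn_ != nullptr) return compiled_fn_(this, row);
    return GetIntValInterpreted(row);
  }

  // In this sketch only "entry point" exprs would have a function installed.
  void SetCompiledFn(ComputeFn fn) {
    assert(fn != nullptr);
    compiled_fn_ = fn;
  }

 protected:
  virtual IntVal GetIntValInterpreted(const void* row) const = 0;

 private:
  ComputeFn compiled_fn_ = nullptr;
};

class IntLiteral : public Expr {
 public:
  explicit IntLiteral(int32_t v) : v_(v) {}

 protected:
  IntVal GetIntValInterpreted(const void*) const override {
    return IntVal{false, v_};
  }

 private:
  int32_t v_;
};

// Stand-in for JIT-compiled code; a real system would generate this function.
inline IntVal FakeCompiledFortyTwo(const Expr*, const void*) {
  return IntVal{false, 42};
}
```

With no function installed, GetIntVal() on an IntLiteral takes the
interpreted path; after SetCompiledFn() the same call goes through the
function pointer instead.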
    
    I chose to tackle two problems in one change - adding support for
    generating codegen'd function pointers for all ScalarExprs, and adding
    the "entry point" concept - to avoid a blow-up in the number of
    codegen'd entry points that could lead to longer codegen times and/or
    worse code because of inlining changes.
    
    IMPALA-7331 (CHAR codegen support functions) is also fixed because
    it was simpler to enable CHAR codegen within ScalarExpr than to carry
    forward the existing CHAR workarounds from ScalarFnCall. The
    CHAR-specific codegen support required in the scalar expr subsystem is
    very limited.  StringVal intermediates are used everywhere. Only
    SlotRef actually operates on the different tuple layout, and the
    required codegen support for SlotRef already exists for UDA
    intermediates anyway.
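
As a rough illustration of why the CHAR-specific support is small, the
sketch below mirrors in plain C++ what the CodegenAnyVal changes in this
patch emit as IR: loading a CHAR(n) slot points the intermediate at the
inline bytes with the length taken from the type, and storing copies
type.len bytes back. StrVal, LoadCharSlot and StoreCharSlot are
hypothetical names used only for this example, and the sanity check in
StoreCharSlot is an assumption of the sketch, not something the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-in for a StringVal-style intermediate.
struct StrVal {
  bool is_null;
  int32_t len;
  uint8_t* ptr;
};

// Load: a CHAR(n) slot is n inline bytes, so the intermediate simply points
// at the slot and takes its length from the type. No bytes are copied.
inline StrVal LoadCharSlot(uint8_t* slot, int32_t type_len) {
  return StrVal{false, type_len, slot};
}

// Store: copy exactly type_len bytes from the intermediate into the slot.
inline void StoreCharSlot(const StrVal& v, uint8_t* slot, int32_t type_len) {
  // Sketch-only sanity check (an assumption, not taken from the patch).
  assert(!v.is_null && v.len >= type_len);
  std::memcpy(slot, v.ptr, static_cast<std::size_t>(type_len));
}
```

Everything between the load and the store can then treat the value like
any other string intermediate.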
    
    Testing:
    * Ran exhaustive tests.
    
    Perf:
    * Ran a basic insert benchmark, which went from 10.1s to 7.6s:
      create table foo stored as parquet as
      select case when l_orderkey % 2 = 0 then 'aaa' else 'bbb' end
      from tpch30_parquet.lineitem;
    * Ran a basic CHAR expr test:
      set num_nodes=1;
      set mt_dop=1;
      select count(*) from lineitem
      where cast(l_linestatus as CHAR(2)) = 'O ' and
            cast(l_returnflag as CHAR(2)) = 'N '
      The time spent in the scan went from 520ms to 220ms.
    * Added perf regression test to tpcds-insert, similar to the manual
      benchmark.
    * Ran single-node TPC-H with large and small scale factors, to estimate
      impact on execution perf and query startup time, respectively.
    
    +----------+-----------------------+---------+------------+------------+----------------+
    | Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
    +----------+-----------------------+---------+------------+------------+----------------+
    | TPCH(30) | parquet / none / none | 6.84    | -0.18%     | 4.49       | -0.31%         |
    +----------+-----------------------+---------+------------+------------+----------------+
    
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
    | Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
    | TPCH(30) | TPCH-Q20 | parquet / none / none | 2.58   | 2.47        |   +4.18%   |   1.29%   |   0.88%        | 5     |   +4.12%       | 2.31    | 5.81   |
    | TPCH(30) | TPCH-Q17 | parquet / none / none | 4.81   | 4.61        |   +4.33%   |   2.18%   |   2.15%        | 5     |   +3.91%       | 1.73    | 3.09   |
    | TPCH(30) | TPCH-Q21 | parquet / none / none | 26.45  | 26.16       |   +1.09%   |   0.37%   |   0.50%        | 5     |   +1.36%       | 2.02    | 3.94   |
    | TPCH(30) | TPCH-Q9  | parquet / none / none | 15.92  | 15.75       |   +1.09%   |   2.87%   |   1.65%        | 5     |   +0.88%       | 0.29    | 0.73   |
    | TPCH(30) | TPCH-Q12 | parquet / none / none | 2.38   | 2.35        |   +1.12%   |   1.64%   |   1.11%        | 5     |   +0.80%       | 1.15    | 1.26   |
    | TPCH(30) | TPCH-Q14 | parquet / none / none | 2.94   | 2.91        |   +1.13%   |   7.68%   |   5.37%        | 5     |   -0.34%       | -0.29   | 0.27   |
    | TPCH(30) | TPCH-Q18 | parquet / none / none | 18.10  | 18.02       |   +0.42%   |   2.70%   |   0.56%        | 5     |   +0.28%       | 0.29    | 0.34   |
    | TPCH(30) | TPCH-Q8  | parquet / none / none | 4.72   | 4.72        |   -0.04%   |   1.20%   |   1.65%        | 5     |   +0.05%       | 0.00    | -0.04  |
    | TPCH(30) | TPCH-Q19 | parquet / none / none | 3.92   | 3.93        |   -0.26%   |   1.08%   |   2.36%        | 5     |   +0.20%       | 0.58    | -0.23  |
    | TPCH(30) | TPCH-Q6  | parquet / none / none | 1.27   | 1.27        |   -0.28%   |   0.22%   |   0.88%        | 5     |   +0.09%       | 0.29    | -0.68  |
    | TPCH(30) | TPCH-Q16 | parquet / none / none | 2.64   | 2.65        |   -0.45%   |   1.65%   |   0.65%        | 5     |   -0.24%       | -0.58   | -0.57  |
    | TPCH(30) | TPCH-Q22 | parquet / none / none | 3.10   | 3.13        |   -0.76%   |   1.47%   |   1.12%        | 5     |   -0.21%       | -0.29   | -0.93  |
    | TPCH(30) | TPCH-Q2  | parquet / none / none | 1.20   | 1.21        |   -0.80%   |   2.26%   |   2.47%        | 5     |   -0.82%       | -1.15   | -0.53  |
    | TPCH(30) | TPCH-Q4  | parquet / none / none | 1.97   | 1.99        |   -1.37%   |   1.84%   |   3.21%        | 5     |   -0.47%       | -0.58   | -0.83  |
    | TPCH(30) | TPCH-Q13 | parquet / none / none | 11.53  | 11.63       |   -0.91%   |   0.46%   |   0.49%        | 5     |   -0.95%       | -2.02   | -3.08  |
    | TPCH(30) | TPCH-Q10 | parquet / none / none | 5.13   | 5.21        |   -1.51%   |   2.24%   |   4.05%        | 5     |   -0.94%       | -0.58   | -0.73  |
    | TPCH(30) | TPCH-Q5  | parquet / none / none | 3.61   | 3.66        |   -1.40%   |   0.66%   |   0.79%        | 5     |   -1.33%       | -1.73   | -3.05  |
    | TPCH(30) | TPCH-Q7  | parquet / none / none | 19.42  | 19.71       |   -1.52%   |   1.34%   |   1.39%        | 5     |   -1.22%       | -1.44   | -1.76  |
    | TPCH(30) | TPCH-Q3  | parquet / none / none | 5.08   | 5.15        |   -1.49%   |   1.34%   |   0.73%        | 5     |   -1.35%       | -1.44   | -2.20  |
    | TPCH(30) | TPCH-Q15 | parquet / none / none | 3.42   | 3.49        |   -1.92%   |   0.93%   |   1.47%        | 5     |   -1.53%       | -1.15   | -2.49  |
    | TPCH(30) | TPCH-Q11 | parquet / none / none | 1.15   | 1.19        |   -3.17%   |   2.27%   |   1.95%        | 5     |   -4.21%       | -1.15   | -2.41  |
    | TPCH(30) | TPCH-Q1  | parquet / none / none | 9.26   | 9.63        |   -3.85%   |   0.62%   |   0.59%        | 5     |   -3.78%       | -2.31   | -10.25 |
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
    
    Cluster Name: UNKNOWN
    Lab Run Info: UNKNOWN
    Impala Version:          impalad version 3.2.0-SNAPSHOT RELEASE ()
    Baseline Impala Version: impalad version 3.2.0-SNAPSHOT RELEASE (2019-03-19)
    
    +----------+-----------------------+---------+------------+------------+----------------+
    | Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
    +----------+-----------------------+---------+------------+------------+----------------+
    | TPCH(2)  | parquet / none / none | 0.90    | -0.08%     | 0.80       | -0.05%         |
    +----------+-----------------------+---------+------------+------------+----------------+
    
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+-------+
    | Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval  |
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+-------+
    | TPCH(2)  | TPCH-Q18 | parquet / none / none | 1.22   | 1.19        |   +1.93%   |   3.81%   |   4.46%        | 20    |   +3.34%       | 1.62    | 1.46  |
    | TPCH(2)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73        |   +1.97%   |   3.36%   |   2.94%        | 20    |   +0.97%       | 1.88    | 1.95  |
    | TPCH(2)  | TPCH-Q11 | parquet / none / none | 0.49   | 0.48        |   +1.91%   |   6.19%   |   4.64%        | 20    |   +0.25%       | 0.95    | 1.09  |
    | TPCH(2)  | TPCH-Q4  | parquet / none / none | 0.43   | 0.43        |   +1.99%   |   6.26%   |   5.86%        | 20    |   +0.15%       | 0.92    | 1.03  |
    | TPCH(2)  | TPCH-Q15 | parquet / none / none | 0.50   | 0.49        |   +1.82%   |   7.32%   |   6.35%        | 20    |   +0.26%       | 1.01    | 0.83  |
    | TPCH(2)  | TPCH-Q1  | parquet / none / none | 0.98   | 0.97        |   +0.79%   |   4.64%   |   2.73%        | 20    |   +0.36%       | 0.77    | 0.65  |
    | TPCH(2)  | TPCH-Q19 | parquet / none / none | 0.83   | 0.83        |   +0.65%   |   3.33%   |   2.80%        | 20    |   +0.44%       | 2.18    | 0.67  |
    | TPCH(2)  | TPCH-Q14 | parquet / none / none | 0.62   | 0.62        |   +0.97%   |   2.86%   |   1.00%        | 20    |   +0.04%       | 0.13    | 1.42  |
    | TPCH(2)  | TPCH-Q3  | parquet / none / none | 0.88   | 0.87        |   +0.57%   |   2.17%   |   1.74%        | 20    |   +0.29%       | 1.15    | 0.92  |
    | TPCH(2)  | TPCH-Q12 | parquet / none / none | 0.53   | 0.53        |   +0.27%   |   4.58%   |   5.78%        | 20    |   +0.46%       | 1.47    | 0.16  |
    | TPCH(2)  | TPCH-Q17 | parquet / none / none | 0.72   | 0.72        |   +0.15%   |   3.64%   |   5.55%        | 20    |   +0.21%       | 0.86    | 0.10  |
    | TPCH(2)  | TPCH-Q21 | parquet / none / none | 2.05   | 2.05        |   +0.21%   |   1.99%   |   2.37%        | 20    |   +0.01%       | 0.25    | 0.30  |
    | TPCH(2)  | TPCH-Q5  | parquet / none / none | 1.28   | 1.27        |   +0.24%   |   1.61%   |   1.80%        | 20    |   -0.02%       | -0.57   | 0.44  |
    | TPCH(2)  | TPCH-Q13 | parquet / none / none | 1.27   | 1.27        |   -0.34%   |   1.69%   |   1.83%        | 20    |   -0.20%       | -1.65   | -0.61 |
    | TPCH(2)  | TPCH-Q7  | parquet / none / none | 1.72   | 1.73        |   -0.55%   |   2.40%   |   1.69%        | 20    |   -0.03%       | -0.42   | -0.83 |
    | TPCH(2)  | TPCH-Q8  | parquet / none / none | 1.27   | 1.28        |   -0.68%   |   3.10%   |   3.89%        | 20    |   -0.06%       | -0.54   | -0.62 |
    | TPCH(2)  | TPCH-Q6  | parquet / none / none | 0.36   | 0.36        |   -0.84%   |   0.79%   |   3.51%        | 20    |   -0.07%       | -0.36   | -1.04 |
    | TPCH(2)  | TPCH-Q2  | parquet / none / none | 0.65   | 0.65        |   -1.17%   |   4.76%   |   5.99%        | 20    |   -0.05%       | -0.25   | -0.69 |
    | TPCH(2)  | TPCH-Q9  | parquet / none / none | 1.59   | 1.62        |   -2.01%   |   1.45%   |   5.12%        | 20    |   -0.16%       | -1.24   | -1.69 |
    | TPCH(2)  | TPCH-Q20 | parquet / none / none | 0.68   | 0.69        |   -1.73%   |   4.35%   |   4.43%        | 20    |   -0.49%       | -1.74   | -1.25 |
    | TPCH(2)  | TPCH-Q22 | parquet / none / none | 0.38   | 0.40        |   -2.89%   |   7.42%   |   6.39%        | 20    |   -0.21%       | -0.66   | -1.34 |
    | TPCH(2)  | TPCH-Q16 | parquet / none / none | 0.59   | 0.62        |   -4.01%   |   6.33%   |   5.83%        | 20    |   -4.72%       | -1.39   | -2.13 |
    +----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+-------+
    
    Change-Id: I839d7a3a2f5e1309c33a1f66013ef11628c5dc11
    Reviewed-on: http://gerrit.cloudera.org:8080/12797
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/codegen/codegen-anyval.cc                   |  54 ++--
 be/src/codegen/codegen-anyval.h                    |   8 +-
 be/src/codegen/gen_ir_descriptions.py              |  26 +-
 be/src/codegen/impala-ir.cc                        |   2 -
 be/src/codegen/llvm-codegen.cc                     |  17 +-
 be/src/codegen/llvm-codegen.h                      |   1 +
 be/src/exec/aggregator.cc                          |   2 +-
 be/src/exec/exec-node.cc                           |   2 +-
 be/src/exec/filter-context.cc                      |   4 +-
 be/src/exec/grouping-aggregator.cc                 |   4 +-
 be/src/exec/hash-table-test.cc                     |   4 +-
 be/src/exec/hash-table.cc                          |   4 +-
 be/src/exec/hdfs-scanner.cc                        |   2 +-
 be/src/exec/union-node.cc                          |   4 +-
 be/src/exprs/CMakeLists.txt                        |   2 -
 be/src/exprs/agg-fn.cc                             |   2 +-
 be/src/exprs/case-expr.cc                          |  13 +-
 be/src/exprs/case-expr.h                           |  16 +-
 be/src/exprs/compound-predicates.cc                |  21 +-
 be/src/exprs/compound-predicates.h                 |  10 +-
 be/src/exprs/conditional-functions-ir.cc           |   8 +-
 be/src/exprs/conditional-functions.cc              |   5 +-
 be/src/exprs/conditional-functions.h               |  53 +---
 be/src/exprs/hive-udf-call.cc                      |  29 +-
 be/src/exprs/hive-udf-call.h                       |  23 +-
 be/src/exprs/is-not-empty-predicate.cc             | 125 +++-----
 be/src/exprs/is-not-empty-predicate.h              |   9 +-
 be/src/exprs/kudu-partition-expr.cc                |  13 +-
 be/src/exprs/kudu-partition-expr.h                 |  10 +-
 be/src/exprs/literal.cc                            |  35 +--
 be/src/exprs/literal.h                             |  16 +-
 be/src/exprs/null-literal.cc                       |  40 ++-
 be/src/exprs/null-literal.h                        |  18 +-
 be/src/exprs/scalar-expr-evaluator.cc              |   4 +-
 be/src/exprs/scalar-expr-ir.cc                     |  54 ++--
 be/src/exprs/scalar-expr.cc                        | 160 +++++-----
 be/src/exprs/scalar-expr.h                         | 208 ++++++++++---
 be/src/exprs/scalar-expr.inline.h                  |  67 +++++
 be/src/exprs/scalar-fn-call.cc                     | 322 +++++++--------------
 be/src/exprs/scalar-fn-call.h                      |  50 +---
 be/src/exprs/slot-ref-ir.cc                        |  34 ---
 be/src/exprs/slot-ref.cc                           |  46 +--
 be/src/exprs/slot-ref.h                            |  24 +-
 be/src/exprs/tuple-is-null-predicate.cc            |  12 +-
 be/src/exprs/tuple-is-null-predicate.h             |  10 +-
 be/src/exprs/valid-tuple-id.cc                     |  19 +-
 be/src/exprs/valid-tuple-id.h                      |   8 +-
 be/src/runtime/CMakeLists.txt                      |   1 +
 .../collection-value.cc}                           |  11 +-
 be/src/runtime/collection-value.h                  |   3 +
 be/src/runtime/data-stream-test.cc                 |   2 +-
 be/src/runtime/descriptors.cc                      |   2 -
 be/src/runtime/fragment-instance-state.cc          |   2 +-
 be/src/runtime/krpc-data-stream-sender.cc          |   3 +-
 be/src/runtime/runtime-state.cc                    |   6 +-
 be/src/runtime/runtime-state.h                     |  37 ++-
 be/src/runtime/tuple.cc                            |   2 +-
 be/src/service/fe-support.cc                       |   4 +-
 be/src/udf/udf-internal.h                          |  14 +-
 be/src/util/tuple-row-compare.cc                   |   2 +-
 .../QueryTest/datastream-sender-codegen.test       |   6 +-
 .../queries/QueryTest/disable-codegen.test         |   3 +-
 .../functional-query/queries/QueryTest/udf.test    |   3 +
 .../tpcds-insert/queries/expr-insert.test          |  21 ++
 tests/query_test/test_codegen.py                   |  39 ++-
 tests/query_test/test_tpcds_queries.py             |   3 +
 66 files changed, 843 insertions(+), 921 deletions(-)

diff --git a/be/src/codegen/codegen-anyval.cc b/be/src/codegen/codegen-anyval.cc
index feb8281..4bf4ccf 100644
--- a/be/src/codegen/codegen-anyval.cc
+++ b/be/src/codegen/codegen-anyval.cc
@@ -31,13 +31,14 @@ const char* CodegenAnyVal::LLVM_INTVAL_NAME       = "struct.impala_udf::IntVal";
 const char* CodegenAnyVal::LLVM_BIGINTVAL_NAME    = "struct.impala_udf::BigIntVal";
 const char* CodegenAnyVal::LLVM_FLOATVAL_NAME     = "struct.impala_udf::FloatVal";
 const char* CodegenAnyVal::LLVM_DOUBLEVAL_NAME    = "struct.impala_udf::DoubleVal";
-const char* CodegenAnyVal::LLVM_STRINGVAL_NAME    = "struct.impala_udf::StringVal";
+const char* CodegenAnyVal::LLVM_STRINGVAL_NAME = "struct.impala_udf::StringVal";
 const char* CodegenAnyVal::LLVM_TIMESTAMPVAL_NAME = "struct.impala_udf::TimestampVal";
 const char* CodegenAnyVal::LLVM_DECIMALVAL_NAME   = "struct.impala_udf::DecimalVal";
 const char* CodegenAnyVal::LLVM_DATEVAL_NAME      = "struct.impala_udf::DateVal";
+const char* CodegenAnyVal::LLVM_COLLECTIONVAL_NAME = "struct.impala_udf::CollectionVal";
 
 llvm::Type* CodegenAnyVal::GetLoweredType(LlvmCodeGen* cg, const ColumnType& type) {
-  switch(type.type) {
+  switch (type.type) {
     case TYPE_BOOLEAN: // i16
       return cg->i16_type();
     case TYPE_TINYINT: // i16
@@ -54,11 +55,11 @@ llvm::Type* CodegenAnyVal::GetLoweredType(LlvmCodeGen* cg, const ColumnType& typ
       return llvm::StructType::get(cg->i8_type(), cg->double_type());
     case TYPE_STRING: // { i64, i8* }
     case TYPE_VARCHAR: // { i64, i8* }
+    case TYPE_CHAR: // Uses StringVal, so same as STRING/VARCHAR.
     case TYPE_FIXED_UDA_INTERMEDIATE: // { i64, i8* }
+    case TYPE_ARRAY: // CollectionVal has same memory layout as StringVal.
+    case TYPE_MAP: // CollectionVal has same memory layout as StringVal.
       return llvm::StructType::get(cg->i64_type(), cg->ptr_type());
-    case TYPE_CHAR:
-      DCHECK(false) << "NYI:" << type.DebugString();
-      return NULL;
     case TYPE_TIMESTAMP: // { i64, i64 }
       return llvm::StructType::get(cg->i64_type(), cg->i64_type());
     case TYPE_DECIMAL: // %"struct.impala_udf::DecimalVal" (isn't lowered)
@@ -103,12 +104,10 @@ llvm::Type* CodegenAnyVal::GetUnloweredType(LlvmCodeGen* cg, const ColumnType& t
       break;
     case TYPE_STRING:
     case TYPE_VARCHAR:
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE:
       result = cg->GetNamedType(LLVM_STRINGVAL_NAME);
       break;
-    case TYPE_CHAR:
-      DCHECK(false) << "NYI:" << type.DebugString();
-      return NULL;
     case TYPE_TIMESTAMP:
       result = cg->GetNamedType(LLVM_TIMESTAMPVAL_NAME);
       break;
@@ -118,6 +117,10 @@ llvm::Type* CodegenAnyVal::GetUnloweredType(LlvmCodeGen* cg, const ColumnType& t
     case TYPE_DATE:
       result = cg->GetNamedType(LLVM_DATEVAL_NAME);
       break;
+    case TYPE_ARRAY:
+    case TYPE_MAP:
+      result = cg->GetNamedType(LLVM_COLLECTIONVAL_NAME);
+      break;
     default:
       DCHECK(false) << "Unsupported type: " << type;
       return NULL;
@@ -207,16 +210,16 @@ llvm::Value* CodegenAnyVal::GetIsNull(const char* name) const {
     }
     case TYPE_STRING:
     case TYPE_VARCHAR:
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE:
-    case TYPE_TIMESTAMP: {
+    case TYPE_TIMESTAMP:
+    case TYPE_ARRAY:
+    case TYPE_MAP: {
       // Lowered type is of form { i64, *}. Get the first byte of the i64 value.
       llvm::Value* v = builder_->CreateExtractValue(value_, 0);
       DCHECK(v->getType() == codegen_->i64_type());
       return builder_->CreateTrunc(v, codegen_->bool_type(), name);
     }
-    case TYPE_CHAR:
-      DCHECK(false) << "NYI:" << type_.DebugString();
-      return NULL;
     case TYPE_BOOLEAN:
     case TYPE_TINYINT:
     case TYPE_SMALLINT:
@@ -253,8 +256,11 @@ void CodegenAnyVal::SetIsNull(llvm::Value* is_null) {
     }
     case TYPE_STRING:
     case TYPE_VARCHAR:
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE:
-    case TYPE_TIMESTAMP: {
+    case TYPE_TIMESTAMP:
+    case TYPE_ARRAY:
+    case TYPE_MAP: {
       // Lowered type is of the form { i64, * }. Set the first byte of the i64 value to
       // 'is_null'
       llvm::Value* v = builder_->CreateExtractValue(value_, 0);
@@ -265,9 +271,6 @@ void CodegenAnyVal::SetIsNull(llvm::Value* is_null) {
       value_ = builder_->CreateInsertValue(value_, v, 0, name_);
       break;
     }
-    case TYPE_CHAR:
-      DCHECK(false) << "NYI:" << type_.DebugString();
-      break;
     case TYPE_BOOLEAN:
     case TYPE_TINYINT:
     case TYPE_SMALLINT:
@@ -344,6 +347,7 @@ void CodegenAnyVal::SetVal(llvm::Value* val) {
       << "Use SetPtr and SetLen for FixedUdaIntermediate";
   DCHECK(type_.type != TYPE_TIMESTAMP)
       << "Use SetDate and SetTimeOfDay for TimestampVals";
+  DCHECK(!type_.IsCollectionType()) << "Use SetPtr and SetLen for CollectionVal";
   switch(type_.type) {
     case TYPE_BOOLEAN:
     case TYPE_TINYINT:
@@ -424,26 +428,28 @@ void CodegenAnyVal::SetVal(double val) {
 
 llvm::Value* CodegenAnyVal::GetPtr() {
   // Set the second pointer value to 'ptr'.
-  DCHECK(type_.IsStringType());
+  DCHECK(type_.IsStringType() || type_.IsCollectionType());
   return builder_->CreateExtractValue(value_, 1, name_);
 }
 
 llvm::Value* CodegenAnyVal::GetLen() {
   // Get the high bytes of the first value.
-  DCHECK(type_.IsStringType());
+  DCHECK(type_.IsStringType() || type_.IsCollectionType());
   llvm::Value* v = builder_->CreateExtractValue(value_, 0);
   return GetHighBits(32, v);
 }
 
 void CodegenAnyVal::SetPtr(llvm::Value* ptr) {
   // Set the second pointer value to 'ptr'.
-  DCHECK(type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE);
+  DCHECK(type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE
+      || type_.IsCollectionType());
   value_ = builder_->CreateInsertValue(value_, ptr, 1, name_);
 }
 
 void CodegenAnyVal::SetLen(llvm::Value* len) {
   // Set the high bytes of the first value to 'len'.
-  DCHECK(type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE);
+  DCHECK(type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE
+      || type_.IsCollectionType());
   llvm::Value* v = builder_->CreateExtractValue(value_, 0);
   v = SetHighBits(32, len, v);
   value_ = builder_->CreateInsertValue(value_, v, 0, name_);
@@ -533,15 +539,13 @@ void CodegenAnyVal::LoadFromNativePtr(llvm::Value* raw_val_ptr) {
       SetLen(builder_->CreateExtractValue(string_value, 1, "len"));
       break;
     }
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE: {
       // Convert fixed-size slot to StringVal.
       SetPtr(builder_->CreateBitCast(raw_val_ptr, codegen_->ptr_type()));
       SetLen(codegen_->GetI32Constant(type_.len));
       break;
     }
-    case TYPE_CHAR:
-      DCHECK(false) << "NYI:" << type_.DebugString();
-      break;
     case TYPE_TIMESTAMP: {
       // Convert TimestampValue to TimestampVal
       // TimestampValue has type
@@ -601,6 +605,9 @@ void CodegenAnyVal::StoreToNativePtr(llvm::Value* raw_val_ptr, llvm::Value* pool
       builder_->CreateStore(string_value, raw_val_ptr);
       break;
     }
+    case TYPE_CHAR:
+      codegen_->CodegenMemcpy(builder_, raw_val_ptr, GetPtr(), type_.len);
+      break;
     case TYPE_FIXED_UDA_INTERMEDIATE:
       DCHECK(false) << "FIXED_UDA_INTERMEDIATE does not need to be copied: the "
                     << "StringVal must be set up to point to the output slot";
@@ -771,6 +778,7 @@ llvm::Value* CodegenAnyVal::EqToNativePtr(llvm::Value* native_ptr,
     }
     case TYPE_STRING:
     case TYPE_VARCHAR:
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE: {
       llvm::Function* eq_fn =
           codegen_->GetFunction(IRFunction::CODEGEN_ANYVAL_STRING_VALUE_EQ, false);
diff --git a/be/src/codegen/codegen-anyval.h b/be/src/codegen/codegen-anyval.h
index 3d06102..bc271fe 100644
--- a/be/src/codegen/codegen-anyval.h
+++ b/be/src/codegen/codegen-anyval.h
@@ -41,7 +41,7 @@ namespace impala {
 /// generated instructions perform the integer manipulation equivalent to setting the
 /// fields of the original struct type.
 //
-/// Lowered types:
+/// Lowered types (in x86-64 ABI):
 /// TYPE_BOOLEAN/BooleanVal: i16
 /// TYPE_TINYINT/TinyIntVal: i16
 /// TYPE_SMALLINT/SmallIntVal: i32
@@ -50,6 +50,7 @@ namespace impala {
 /// TYPE_FLOAT/FloatVal: i64
 /// TYPE_DOUBLE/DoubleVal: { i8, double }
 /// TYPE_STRING,TYPE_VARCHAR,TYPE_CHAR,TYPE_FIXED_UDA_INTERMEDIATE/StringVal: { i64, i8* }
+/// TYPE_ARRAY/TYPE_MAP/CollectionVal: { i64, i8* }
 /// TYPE_TIMESTAMP/TimestampVal: { i64, i64 }
 /// TYPE_DECIMAL/DecimalVal (isn't lowered):
 /// %"struct.impala_udf::DecimalVal" { {i8}, [15 x i8], {i128} }
@@ -70,6 +71,7 @@ class CodegenAnyVal {
   static const char* LLVM_TIMESTAMPVAL_NAME;
   static const char* LLVM_DECIMALVAL_NAME;
   static const char* LLVM_DATEVAL_NAME;
+  static const char* LLVM_COLLECTIONVAL_NAME;
 
   /// Creates a call to 'fn', which should return a (lowered) *Val, and returns the result.
   /// This abstracts over the x64 calling convention, in particular for functions returning
@@ -168,11 +170,11 @@ class CodegenAnyVal {
   void SetVal(float val);
   void SetVal(double val);
 
-  /// Getters for StringVals.
+  /// Getters for StringVals and CollectionVals.
   llvm::Value* GetPtr();
   llvm::Value *GetLen();
 
-  /// Setters for StringVals.
+  /// Setters for StringVals and CollectionVals.
   void SetPtr(llvm::Value* ptr);
   void SetLen(llvm::Value* len);
 
diff --git a/be/src/codegen/gen_ir_descriptions.py b/be/src/codegen/gen_ir_descriptions.py
index 7bd7a90..1b369d9 100755
--- a/be/src/codegen/gen_ir_descriptions.py
+++ b/be/src/codegen/gen_ir_descriptions.py
@@ -76,31 +76,27 @@ ir_functions = [
   ["CODEGEN_ANYVAL_TIMESTAMP_VALUE_EQ",
    "_Z16TimestampValueEqRKN10impala_udf12TimestampValERKN6impala14TimestampValueE"],
   ["SCALAR_EXPR_GET_BOOLEAN_VAL",
-   "_ZN6impala10ScalarExpr13GetBooleanValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr24GetBooleanValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_TINYINT_VAL",
-   "_ZN6impala10ScalarExpr13GetTinyIntValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr24GetTinyIntValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_SMALLINT_VAL",
-   "_ZN6impala10ScalarExpr14GetSmallIntValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr25GetSmallIntValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_INT_VAL",
-   "_ZN6impala10ScalarExpr9GetIntValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr20GetIntValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_BIGINT_VAL",
-   "_ZN6impala10ScalarExpr12GetBigIntValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr23GetBigIntValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_FLOAT_VAL",
-   "_ZN6impala10ScalarExpr11GetFloatValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr22GetFloatValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_DOUBLE_VAL",
-   "_ZN6impala10ScalarExpr12GetDoubleValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr23GetDoubleValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_STRING_VAL",
-   "_ZN6impala10ScalarExpr12GetStringValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr23GetStringValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_TIMESTAMP_VAL",
-   "_ZN6impala10ScalarExpr15GetTimestampValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr26GetTimestampValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_DECIMAL_VAL",
-   "_ZN6impala10ScalarExpr13GetDecimalValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
-  ["SCALAR_EXPR_SLOT_REF_GET_COLLECTION_VAL",
-   "_ZNK6impala7SlotRef16GetCollectionValEPNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
-  ["SCALAR_EXPR_NULL_LITERAL_GET_COLLECTION_VAL",
-   "_ZNK6impala11NullLiteral16GetCollectionValEPNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr24GetDecimalValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["SCALAR_EXPR_GET_DATE_VAL",
-   "_ZN6impala10ScalarExpr10GetDateValEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
+   "_ZN6impala10ScalarExpr21GetDateValInterpretedEPS0_PNS_19ScalarExprEvaluatorEPKNS_8TupleRowE"],
   ["HASH_CRC", "IrCrcHash"],
   ["HASH_MURMUR", "IrMurmurHash"],
   ["PHJ_PROCESS_BUILD_BATCH",
diff --git a/be/src/codegen/impala-ir.cc b/be/src/codegen/impala-ir.cc
index b1e72a1..b07d9c2 100644
--- a/be/src/codegen/impala-ir.cc
+++ b/be/src/codegen/impala-ir.cc
@@ -49,10 +49,8 @@
 #include "exprs/is-null-predicate-ir.cc"
 #include "exprs/like-predicate-ir.cc"
 #include "exprs/math-functions-ir.cc"
-#include "exprs/null-literal-ir.cc"
 #include "exprs/operators-ir.cc"
 #include "exprs/scalar-expr-ir.cc"
-#include "exprs/slot-ref-ir.cc"
 #include "exprs/string-functions-ir.cc"
 #include "exprs/timestamp-functions-ir.cc"
 #include "exprs/udf-builtins-ir.cc"
diff --git a/be/src/codegen/llvm-codegen.cc b/be/src/codegen/llvm-codegen.cc
index 7b0417f..c518f0b 100644
--- a/be/src/codegen/llvm-codegen.cc
+++ b/be/src/codegen/llvm-codegen.cc
@@ -65,6 +65,7 @@
 #include "common/logging.h"
 #include "exprs/anyval-util.h"
 #include "impala-ir/impala-ir-names.h"
+#include "runtime/collection-value.h"
 #include "runtime/descriptors.h"
 #include "runtime/hdfs-fs-cache.h"
 #include "runtime/lib-cache.h"
@@ -181,7 +182,9 @@ Status LlvmCodeGen::InitializeLlvm(bool load_backend) {
     DCHECK_EQ(FN_MAPPINGS[i].fn, i);
     const string& fn_name = FN_MAPPINGS[i].fn_name;
     if (init_codegen->module_->getFunction(fn_name) == nullptr) {
-      return Status(Substitute("Failed to find function $0", fn_name));
+      const string& err_msg = Substitute("Failed to find function $0", fn_name);
+      LOG(ERROR) << err_msg;
+      return Status(err_msg);
     }
   }
 
@@ -385,6 +388,9 @@ Status LlvmCodeGen::CreateImpalaCodegen(RuntimeState* state,
   // Get type for TimestampValue
   codegen->timestamp_value_type_ = codegen->GetStructType<TimestampValue>();
 
+  // Get type for CollectionValue
+  codegen->collection_value_type_ = codegen->GetStructType<CollectionValue>();
+
   // Verify size is correct
   const llvm::DataLayout& data_layout = codegen->execution_engine()->getDataLayout();
   const llvm::StructLayout* layout = data_layout.getStructLayout(
@@ -532,20 +538,19 @@ llvm::Type* LlvmCodeGen::GetSlotType(const ColumnType& type) {
     case TYPE_STRING:
     case TYPE_VARCHAR:
       return string_value_type_;
+    case TYPE_CHAR:
     case TYPE_FIXED_UDA_INTERMEDIATE:
       // Represent this as an array of bytes.
       return llvm::ArrayType::get(i8_type(), type.len);
-    case TYPE_CHAR:
-      // IMPALA-3207: Codegen for CHAR is not yet implemented, this should not
-      // be called for TYPE_CHAR.
-      DCHECK(false) << "NYI";
-      return NULL;
     case TYPE_TIMESTAMP:
       return timestamp_value_type_;
     case TYPE_DECIMAL:
       return llvm::Type::getIntNTy(context(), type.GetByteSize() * 8);
     case TYPE_DATE:
       return i32_type();
+    case TYPE_ARRAY:
+    case TYPE_MAP:
+      return collection_value_type_;
     default:
       DCHECK(false) << "Invalid type: " << type;
       return NULL;
diff --git a/be/src/codegen/llvm-codegen.h b/be/src/codegen/llvm-codegen.h
index 97300a0..6e4ab95 100644
--- a/be/src/codegen/llvm-codegen.h
+++ b/be/src/codegen/llvm-codegen.h
@@ -869,6 +869,7 @@ class LlvmCodeGen {
   llvm::Type* void_type_;                   // void
   llvm::Type* string_value_type_;           // StringValue
   llvm::Type* timestamp_value_type_;        // TimestampValue
+  llvm::Type* collection_value_type_;       // CollectionValue
 
   /// llvm constants to help with code gen verbosity
   llvm::Constant* true_value_;
diff --git a/be/src/exec/aggregator.cc b/be/src/exec/aggregator.cc
index 672d927..5dfed29 100644
--- a/be/src/exec/aggregator.cc
+++ b/be/src/exec/aggregator.cc
@@ -322,7 +322,7 @@ Status Aggregator::CodegenUpdateSlot(LlvmCodeGen* codegen, int agg_fn_idx,
   for (int i = 0; i < num_inputs; ++i) {
     ScalarExpr* input_expr = agg_fn->GetChild(i);
     llvm::Function* input_expr_fn;
-    RETURN_IF_ERROR(input_expr->GetCodegendComputeFn(codegen, &input_expr_fn));
+    RETURN_IF_ERROR(input_expr->GetCodegendComputeFn(codegen, false, &input_expr_fn));
     DCHECK(input_expr_fn != nullptr);
 
     // Call input expr function with the matching evaluator to get src slot value.
diff --git a/be/src/exec/exec-node.cc b/be/src/exec/exec-node.cc
index 1dda4f2..833bd86 100644
--- a/be/src/exec/exec-node.cc
+++ b/be/src/exec/exec-node.cc
@@ -505,7 +505,7 @@ Status ExecNode::CodegenEvalConjuncts(LlvmCodeGen* codegen,
     const vector<ScalarExpr*>& conjuncts, llvm::Function** fn, const char* name) {
   llvm::Function* conjunct_fns[conjuncts.size()];
   for (int i = 0; i < conjuncts.size(); ++i) {
-    RETURN_IF_ERROR(conjuncts[i]->GetCodegendComputeFn(codegen, &conjunct_fns[i]));
+    RETURN_IF_ERROR(conjuncts[i]->GetCodegendComputeFn(codegen, false, &conjunct_fns[i]));
     if (i >= LlvmCodeGen::CODEGEN_INLINE_EXPRS_THRESHOLD) {
       // Avoid bloating EvalConjuncts by inlining everything into it.
       codegen->SetNoInline(conjunct_fns[i]);
diff --git a/be/src/exec/filter-context.cc b/be/src/exec/filter-context.cc
index 4ec39fc..e264192 100644
--- a/be/src/exec/filter-context.cc
+++ b/be/src/exec/filter-context.cc
@@ -165,7 +165,7 @@ Status FilterContext::CodegenEval(
       llvm::BasicBlock::Create(context, "eval_filter", eval_filter_fn);
 
   llvm::Function* compute_fn;
-  RETURN_IF_ERROR(filter_expr->GetCodegendComputeFn(codegen, &compute_fn));
+  RETURN_IF_ERROR(filter_expr->GetCodegendComputeFn(codegen, false, &compute_fn));
   DCHECK(compute_fn != nullptr);
 
   // The function for checking against the bloom filter for match.
@@ -334,7 +334,7 @@ Status FilterContext::CodegenInsert(LlvmCodeGen* codegen, ScalarExpr* filter_exp
       llvm::BasicBlock::Create(context, "insert_filter", insert_filter_fn);
 
   llvm::Function* compute_fn;
-  RETURN_IF_ERROR(filter_expr->GetCodegendComputeFn(codegen, &compute_fn));
+  RETURN_IF_ERROR(filter_expr->GetCodegendComputeFn(codegen, false, &compute_fn));
   DCHECK(compute_fn != nullptr);
 
   // Load 'expr_eval' from 'this_arg' FilterContext object.
diff --git a/be/src/exec/grouping-aggregator.cc b/be/src/exec/grouping-aggregator.cc
index 2490bff..6f465f2 100644
--- a/be/src/exec/grouping-aggregator.cc
+++ b/be/src/exec/grouping-aggregator.cc
@@ -114,7 +114,9 @@ Status GroupingAggregator::Init(const TAggregator& taggregator, RuntimeState* st
         pool_->Add(desc->type().type != TYPE_NULL ? new SlotRef(desc) :
                                                     new SlotRef(desc, TYPE_BOOLEAN));
     build_exprs_.push_back(build_expr);
-    RETURN_IF_ERROR(build_expr->Init(intermediate_row_desc_, state));
+    // Not an entry point because all hash table callers support codegen.
+    RETURN_IF_ERROR(
+        build_expr->Init(intermediate_row_desc_, /* is_entry_point */ false, state));
     if (build_expr->type().IsVarLenStringType()) string_grouping_exprs_.push_back(i);
   }
 
diff --git a/be/src/exec/hash-table-test.cc b/be/src/exec/hash-table-test.cc
index 9151d6e..7293df7 100644
--- a/be/src/exec/hash-table-test.cc
+++ b/be/src/exec/hash-table-test.cc
@@ -81,14 +81,14 @@ class HashTableTest : public testing::Test {
     // simplest.  The purpose of these tests is to exercise the hash map
     // internals so a simple build/probe expr is fine.
     ScalarExpr* build_expr = pool_.Add(new SlotRef(TYPE_INT, 1, true /* nullable */));
-    ASSERT_OK(build_expr->Init(desc, nullptr));
+    ASSERT_OK(build_expr->Init(desc, true, nullptr));
     build_exprs_.push_back(build_expr);
     ASSERT_OK(ScalarExprEvaluator::Create(build_exprs_, nullptr, &pool_, &mem_pool_,
         &mem_pool_, &build_expr_evals_));
     ASSERT_OK(ScalarExprEvaluator::Open(build_expr_evals_, nullptr));
 
     ScalarExpr* probe_expr = pool_.Add(new SlotRef(TYPE_INT, 1, true /* nullable */));
-    ASSERT_OK(probe_expr->Init(desc, nullptr));
+    ASSERT_OK(probe_expr->Init(desc, true, nullptr));
     probe_exprs_.push_back(probe_expr);
     ASSERT_OK(ScalarExprEvaluator::Create(probe_exprs_, nullptr, &pool_, &mem_pool_,
         &mem_pool_, &probe_expr_evals_));
diff --git a/be/src/exec/hash-table.cc b/be/src/exec/hash-table.cc
index 98d03b0..556434b 100644
--- a/be/src/exec/hash-table.cc
+++ b/be/src/exec/hash-table.cc
@@ -760,7 +760,7 @@ Status HashTableCtx::CodegenEvalRow(
 
     // Call expr
     llvm::Function* expr_fn;
-    Status status = exprs[i]->GetCodegendComputeFn(codegen, &expr_fn);
+    Status status = exprs[i]->GetCodegendComputeFn(codegen, false, &expr_fn);
     if (!status.ok()) {
       *fn = NULL;
       return Status(Substitute(
@@ -1112,7 +1112,7 @@ Status HashTableCtx::CodegenEquals(
 
     // call GetValue on build_exprs[i]
     llvm::Function* expr_fn;
-    Status status = build_exprs_[i]->GetCodegendComputeFn(codegen, &expr_fn);
+    Status status = build_exprs_[i]->GetCodegendComputeFn(codegen, false, &expr_fn);
     if (!status.ok()) {
       *fn = NULL;
       return Status(
diff --git a/be/src/exec/hdfs-scanner.cc b/be/src/exec/hdfs-scanner.cc
index ae04cec..da7f5a6 100644
--- a/be/src/exec/hdfs-scanner.cc
+++ b/be/src/exec/hdfs-scanner.cc
@@ -486,7 +486,7 @@ Status HdfsScanner::CodegenWriteCompleteTuple(const HdfsScanNodeBase* node,
       parse_block = llvm::BasicBlock::Create(context, "parse", fn, eval_fail_block);
       llvm::Function* conjunct_fn;
       Status status =
-          conjuncts[conjunct_idx]->GetCodegendComputeFn(codegen, &conjunct_fn);
+          conjuncts[conjunct_idx]->GetCodegendComputeFn(codegen, false, &conjunct_fn);
       if (!status.ok()) {
         stringstream ss;
         ss << "Failed to codegen conjunct: " << status.GetDetail();
diff --git a/be/src/exec/union-node.cc b/be/src/exec/union-node.cc
index cd331f2..9d6c8a1 100644
--- a/be/src/exec/union-node.cc
+++ b/be/src/exec/union-node.cc
@@ -111,8 +111,8 @@ void UnionNode::Codegen(RuntimeState* state) {
     codegen_status = Tuple::CodegenMaterializeExprs(codegen, false, *tuple_desc_,
         child_exprs_lists_[i], true, &tuple_materialize_exprs_fn);
     if (!codegen_status.ok()) {
-      // Codegen may fail in some corner cases (e.g. we don't handle TYPE_CHAR). If this
-      // happens, abort codegen for this and the remaining children.
+      // Codegen may fail in some corner cases. If this happens, abort codegen for this
+      // and the remaining children.
       codegen_message << "Codegen failed for child: " << children_[i]->id();
       break;
     }
diff --git a/be/src/exprs/CMakeLists.txt b/be/src/exprs/CMakeLists.txt
index 6158d59..734892d 100644
--- a/be/src/exprs/CMakeLists.txt
+++ b/be/src/exprs/CMakeLists.txt
@@ -48,13 +48,11 @@ add_library(Exprs
   literal.cc
   math-functions-ir.cc
   null-literal.cc
-  null-literal-ir.cc
   operators-ir.cc
   scalar-expr.cc
   scalar-expr-evaluator.cc
   scalar-expr-ir.cc
   slot-ref.cc
-  slot-ref-ir.cc
   string-functions.cc
   string-functions-ir.cc
   timestamp-functions.cc
diff --git a/be/src/exprs/agg-fn.cc b/be/src/exprs/agg-fn.cc
index a9d461f..2c619f9 100644
--- a/be/src/exprs/agg-fn.cc
+++ b/be/src/exprs/agg-fn.cc
@@ -63,7 +63,7 @@ AggFn::AggFn(const TExprNode& tnode, const SlotDescriptor& intermediate_slot_des
 Status AggFn::Init(const RowDescriptor& row_desc, RuntimeState* state) {
   // Initialize all children (i.e. input exprs to this aggregate expr).
   for (ScalarExpr* input_expr : children()) {
-    RETURN_IF_ERROR(input_expr->Init(row_desc, state));
+    RETURN_IF_ERROR(input_expr->Init(row_desc, /*is_entry_point*/ false, state));
   }
 
   // Initialize the aggregate expressions' internals.
diff --git a/be/src/exprs/case-expr.cc b/be/src/exprs/case-expr.cc
index 27a48d8..b244943 100644
--- a/be/src/exprs/case-expr.cc
+++ b/be/src/exprs/case-expr.cc
@@ -22,6 +22,7 @@
 #include "exprs/anyval-util.h"
 #include "exprs/conditional-functions.h"
 #include "exprs/scalar-expr-evaluator.h"
+#include "exprs/scalar-expr.inline.h"
 #include "runtime/runtime-state.h"
 
 #include "gen-cpp/Exprs_types.h"
@@ -175,16 +176,11 @@ string CaseExpr::DebugString() const {
 //                                   %"class.impala::TupleRow"* %row)
 //   ret i16 %else_val
 // }
-Status CaseExpr::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != nullptr) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
+Status CaseExpr::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   const int num_children = GetNumChildren();
   llvm::Function* child_fns[num_children];
   for (int i = 0; i < num_children; ++i) {
-    RETURN_IF_ERROR(GetChild(i)->GetCodegendComputeFn(codegen, &child_fns[i]));
+    RETURN_IF_ERROR(GetChild(i)->GetCodegendComputeFn(codegen, false, &child_fns[i]));
   }
 
   llvm::LLVMContext& context = codegen->context();
@@ -279,7 +275,6 @@ Status CaseExpr::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
   }
   *fn = codegen->FinalizeFunction(function);
   if (UNLIKELY(*fn == nullptr)) return Status(TErrorCode::IR_VERIFY_FAILED, "CaseExpr");
-  ir_compute_fn_ = *fn;
   return Status::OK();
 }
 
@@ -368,7 +363,7 @@ bool CaseExpr::AnyValEq(
 }
 
 #define CASE_COMPUTE_FN(THEN_TYPE) \
-  THEN_TYPE CaseExpr::Get##THEN_TYPE( \
+  THEN_TYPE CaseExpr::Get##THEN_TYPE##Interpreted(            \
       ScalarExprEvaluator* eval, const TupleRow* row) const { \
     DCHECK(eval->opened()); \
     FunctionContext* fn_ctx = eval->fn_context(fn_ctx_idx_); \
diff --git a/be/src/exprs/case-expr.h b/be/src/exprs/case-expr.h
index b1037bd..cfc02d5 100644
--- a/be/src/exprs/case-expr.h
+++ b/be/src/exprs/case-expr.h
@@ -43,7 +43,7 @@ class TExprNode;
 
 class CaseExpr: public ScalarExpr {
  public:
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
@@ -61,19 +61,7 @@ class CaseExpr: public ScalarExpr {
       RuntimeState* state, ScalarExprEvaluator* eval)
       const override;
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 
   bool has_case_expr() const { return has_case_expr_; }
   bool has_else_expr() const { return has_else_expr_; }
diff --git a/be/src/exprs/compound-predicates.cc b/be/src/exprs/compound-predicates.cc
index 55bc5c6..7b5f516 100644
--- a/be/src/exprs/compound-predicates.cc
+++ b/be/src/exprs/compound-predicates.cc
@@ -17,9 +17,10 @@
 
 #include <sstream>
 
-#include "exprs/compound-predicates.h"
 #include "codegen/codegen-anyval.h"
 #include "codegen/llvm-codegen.h"
+#include "exprs/compound-predicates.h"
+#include "exprs/scalar-expr.inline.h"
 #include "runtime/runtime-state.h"
 
 #include "common/names.h"
@@ -27,8 +28,8 @@
 using namespace impala;
 
 // (<> && false) is false, (true && NULL) is NULL
-BooleanVal AndPredicate::GetBooleanVal(ScalarExprEvaluator* eval,
-    const TupleRow* row) const {
+BooleanVal AndPredicate::GetBooleanValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(children_.size(), 2);
   BooleanVal val1 = children_[0]->GetBooleanVal(eval, row);
   if (!val1.is_null && !val1.val) return BooleanVal(false); // short-circuit
@@ -47,8 +48,8 @@ string AndPredicate::DebugString() const {
 }
 
 // (<> || true) is true, (false || NULL) is NULL
-BooleanVal OrPredicate::GetBooleanVal(ScalarExprEvaluator* eval,
-    const TupleRow* row) const {
+BooleanVal OrPredicate::GetBooleanValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(children_.size(), 2);
   BooleanVal val1 = children_[0]->GetBooleanVal(eval, row);
   if (!val1.is_null && val1.val) return BooleanVal(true); // short-circuit
@@ -121,16 +122,11 @@ string OrPredicate::DebugString() const {
 // }
 Status CompoundPredicate::CodegenComputeFn(
     bool and_fn, LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != NULL) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
   DCHECK_EQ(GetNumChildren(), 2);
   llvm::Function* lhs_function;
-  RETURN_IF_ERROR(children()[0]->GetCodegendComputeFn(codegen, &lhs_function));
+  RETURN_IF_ERROR(children()[0]->GetCodegendComputeFn(codegen, false, &lhs_function));
   llvm::Function* rhs_function;
-  RETURN_IF_ERROR(children()[1]->GetCodegendComputeFn(codegen, &rhs_function));
+  RETURN_IF_ERROR(children()[1]->GetCodegendComputeFn(codegen, false, &rhs_function));
 
   llvm::LLVMContext& context = codegen->context();
   LlvmBuilder builder(context);
@@ -242,6 +238,5 @@ Status CompoundPredicate::CodegenComputeFn(
 
   *fn = codegen->FinalizeFunction(function);
   DCHECK(*fn != NULL);
-  ir_compute_fn_ = *fn;
   return Status::OK();
 }
diff --git a/be/src/exprs/compound-predicates.h b/be/src/exprs/compound-predicates.h
index 1b64208..6784ad7 100644
--- a/be/src/exprs/compound-predicates.h
+++ b/be/src/exprs/compound-predicates.h
@@ -41,9 +41,10 @@ class CompoundPredicate: public Predicate {
 /// Expr for evaluating and (&&) operators
 class AndPredicate: public CompoundPredicate {
  public:
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual BooleanVal GetBooleanValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
 
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
     return CompoundPredicate::CodegenComputeFn(true, codegen, fn);
   }
 
@@ -60,9 +61,10 @@ class AndPredicate: public CompoundPredicate {
 /// Expr for evaluating or (||) operators
 class OrPredicate: public CompoundPredicate {
  public:
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual BooleanVal GetBooleanValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
 
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
     return CompoundPredicate::CodegenComputeFn(false, codegen, fn);
   }
 
diff --git a/be/src/exprs/conditional-functions-ir.cc b/be/src/exprs/conditional-functions-ir.cc
index 5198bcd..4ce7b14 100644
--- a/be/src/exprs/conditional-functions-ir.cc
+++ b/be/src/exprs/conditional-functions-ir.cc
@@ -19,13 +19,14 @@
 
 #include "exprs/anyval-util.h"
 #include "exprs/scalar-expr-evaluator.h"
+#include "exprs/scalar-expr.inline.h"
 #include "udf/udf.h"
 
 using namespace impala;
 using namespace impala_udf;
 
 #define IS_NULL_COMPUTE_FUNCTION(type) \
-  type IsNullExpr::Get##type( \
+  type IsNullExpr::Get##type##Interpreted( \
       ScalarExprEvaluator* eval, const TupleRow* row) const { \
     DCHECK_EQ(children_.size(), 2); \
     type val = GetChild(0)->Get##type(eval, row);  \
@@ -92,7 +93,8 @@ ZERO_IF_NULL_COMPUTE_FUNCTION(DoubleVal);
 ZERO_IF_NULL_COMPUTE_FUNCTION(DecimalVal);
 
 #define IF_COMPUTE_FUNCTION(type) \
-  type IfExpr::Get##type(ScalarExprEvaluator* eval, const TupleRow* row) const { \
+  type IfExpr::Get##type##Interpreted( \
+      ScalarExprEvaluator* eval, const TupleRow* row) const { \
     DCHECK_EQ(children_.size(), 3); \
     BooleanVal cond = GetChild(0)->GetBooleanVal(eval, row); \
     if (cond.is_null || !cond.val) { \
@@ -114,7 +116,7 @@ IF_COMPUTE_FUNCTION(DecimalVal);
 IF_COMPUTE_FUNCTION(DateVal);
 
 #define COALESCE_COMPUTE_FUNCTION(type) \
-  type CoalesceExpr::Get##type( \
+  type CoalesceExpr::Get##type##Interpreted( \
       ScalarExprEvaluator* eval, const TupleRow* row) const { \
     DCHECK_GE(children_.size(), 1); \
     for (int i = 0; i < children_.size(); ++i) { \
diff --git a/be/src/exprs/conditional-functions.cc b/be/src/exprs/conditional-functions.cc
index 19e3231..369987e 100644
--- a/be/src/exprs/conditional-functions.cc
+++ b/be/src/exprs/conditional-functions.cc
@@ -24,8 +24,9 @@
 using namespace impala;
 using namespace impala_udf;
 
-#define CONDITIONAL_CODEGEN_FN(expr_class) \
-  Status expr_class::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) { \
+#define CONDITIONAL_CODEGEN_FN(expr_class)           \
+  Status expr_class::GetCodegendComputeFnImpl(       \
+      LlvmCodeGen* codegen, llvm::Function** fn) {   \
     return GetCodegendComputeFnWrapper(codegen, fn); \
   }
 
diff --git a/be/src/exprs/conditional-functions.h b/be/src/exprs/conditional-functions.h
index 620e9c1..c81665f 100644
--- a/be/src/exprs/conditional-functions.h
+++ b/be/src/exprs/conditional-functions.h
@@ -75,7 +75,7 @@ class ConditionalFunctions {
 
 class IsNullExpr : public ScalarExpr {
  public:
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override {
     return ScalarExpr::DebugString("IsNullExpr");
@@ -84,26 +84,13 @@ class IsNullExpr : public ScalarExpr {
  protected:
   friend class ScalarExpr;
   friend class ScalarExprEvaluator;
-
-  IsNullExpr(const TExprNode& node) : ScalarExpr(node) { }
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  IsNullExpr(const TExprNode& node) : ScalarExpr(node) {}
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 };
 
 class IfExpr : public ScalarExpr {
  public:
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override {
     return ScalarExpr::DebugString("IfExpr");
@@ -113,25 +100,13 @@ class IfExpr : public ScalarExpr {
   friend class ScalarExpr;
   friend class ScalarExprEvaluator;
 
-  IfExpr(const TExprNode& node) : ScalarExpr(node) { }
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  IfExpr(const TExprNode& node) : ScalarExpr(node) {}
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 };
 
 class CoalesceExpr : public ScalarExpr {
  public:
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override {
     return ScalarExpr::DebugString("CoalesceExpr");
@@ -142,19 +117,7 @@ class CoalesceExpr : public ScalarExpr {
   friend class ScalarExprEvaluator;
 
   CoalesceExpr(const TExprNode& node) : ScalarExpr(node) { }
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 };
 
 }
diff --git a/be/src/exprs/hive-udf-call.cc b/be/src/exprs/hive-udf-call.cc
index 2b1794d..dd7c914 100644
--- a/be/src/exprs/hive-udf-call.cc
+++ b/be/src/exprs/hive-udf-call.cc
@@ -173,9 +173,10 @@ Status HiveUdfCall::InitEnv() {
   return Status::OK();
 }
 
-Status HiveUdfCall::Init(const RowDescriptor& row_desc, RuntimeState* state) {
+Status HiveUdfCall::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
   // Initialize children first.
-  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, state));
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
 
   // Initialize input_byte_offsets_ and input_buffer_size_
   for (int i = 0; i < GetNumChildren(); ++i) {
@@ -278,7 +279,7 @@ void HiveUdfCall::CloseEvaluator(FunctionContext::FunctionStateScope scope,
   ScalarExpr::CloseEvaluator(scope, state, eval);
 }
 
-Status HiveUdfCall::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
+Status HiveUdfCall::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   return GetCodegendComputeFnWrapper(codegen, fn);
 }
 
@@ -290,49 +291,49 @@ string HiveUdfCall::DebugString() const {
   return out.str();
 }
 
-BooleanVal HiveUdfCall::GetBooleanVal(
+BooleanVal HiveUdfCall::GetBooleanValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BOOLEAN);
   return *reinterpret_cast<BooleanVal*>(Evaluate(eval, row));
 }
 
-TinyIntVal HiveUdfCall::GetTinyIntVal(
+TinyIntVal HiveUdfCall::GetTinyIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TINYINT);
   return *reinterpret_cast<TinyIntVal*>(Evaluate(eval, row));
 }
 
-SmallIntVal HiveUdfCall::GetSmallIntVal(
+SmallIntVal HiveUdfCall::GetSmallIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_SMALLINT);
   return * reinterpret_cast<SmallIntVal*>(Evaluate(eval, row));
 }
 
-IntVal HiveUdfCall::GetIntVal(
+IntVal HiveUdfCall::GetIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_INT);
   return *reinterpret_cast<IntVal*>(Evaluate(eval, row));
 }
 
-BigIntVal HiveUdfCall::GetBigIntVal(
+BigIntVal HiveUdfCall::GetBigIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BIGINT);
   return *reinterpret_cast<BigIntVal*>(Evaluate(eval, row));
 }
 
-FloatVal HiveUdfCall::GetFloatVal(
+FloatVal HiveUdfCall::GetFloatValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_FLOAT);
   return *reinterpret_cast<FloatVal*>(Evaluate(eval, row));
 }
 
-DoubleVal HiveUdfCall::GetDoubleVal(
+DoubleVal HiveUdfCall::GetDoubleValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DOUBLE);
   return *reinterpret_cast<DoubleVal*>(Evaluate(eval, row));
 }
 
-StringVal HiveUdfCall::GetStringVal(
+StringVal HiveUdfCall::GetStringValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_STRING);
   StringVal result = *reinterpret_cast<StringVal*>(Evaluate(eval, row));
@@ -344,19 +345,19 @@ StringVal HiveUdfCall::GetStringVal(
   return StringVal::CopyFrom(fn_ctx, result.ptr, result.len);
 }
 
-TimestampVal HiveUdfCall::GetTimestampVal(
+TimestampVal HiveUdfCall::GetTimestampValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TIMESTAMP);
   return *reinterpret_cast<TimestampVal*>(Evaluate(eval, row));
 }
 
-DecimalVal HiveUdfCall::GetDecimalVal(
+DecimalVal HiveUdfCall::GetDecimalValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DECIMAL);
   return *reinterpret_cast<DecimalVal*>(Evaluate(eval, row));
 }
 
-DateVal HiveUdfCall::GetDateVal(
+DateVal HiveUdfCall::GetDateValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DATE);
   return *reinterpret_cast<DateVal*>(Evaluate(eval, row));
diff --git a/be/src/exprs/hive-udf-call.h b/be/src/exprs/hive-udf-call.h
index 379f5b5..7cf9ae6 100644
--- a/be/src/exprs/hive-udf-call.h
+++ b/be/src/exprs/hive-udf-call.h
@@ -77,7 +77,7 @@ class HiveUdfCall : public ScalarExpr {
   /// startup time.
   static Status InitEnv() WARN_UNUSED_RESULT;
 
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
@@ -90,27 +90,14 @@ class HiveUdfCall : public ScalarExpr {
 
   HiveUdfCall(const TExprNode& node);
 
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state)
-      override WARN_UNUSED_RESULT;
+  virtual Status Init(const RowDescriptor& row_desc, bool is_entry_point,
+      RuntimeState* state) override WARN_UNUSED_RESULT;
   virtual Status OpenEvaluator(FunctionContext::FunctionStateScope scope,
-      RuntimeState* state, ScalarExprEvaluator* eval) const override
-      WARN_UNUSED_RESULT;
+      RuntimeState* state, ScalarExprEvaluator* eval) const override WARN_UNUSED_RESULT;
   virtual void CloseEvaluator(FunctionContext::FunctionStateScope scope,
       RuntimeState* state, ScalarExprEvaluator* eval) const override;
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 
  private:
   /// Evalutes the UDF over row. Returns the result as an AnyVal. This function
diff --git a/be/src/exprs/is-not-empty-predicate.cc b/be/src/exprs/is-not-empty-predicate.cc
index 9d5e277..a51ad36 100644
--- a/be/src/exprs/is-not-empty-predicate.cc
+++ b/be/src/exprs/is-not-empty-predicate.cc
@@ -26,6 +26,7 @@
 
 #include "common/names.h"
 #include "exprs/null-literal.h"
+#include "exprs/scalar-expr.inline.h"
 #include "exprs/slot-ref.h"
 
 namespace impala {
@@ -34,64 +35,50 @@ const char* IsNotEmptyPredicate::LLVM_CLASS_NAME = "class.impala::IsNotEmptyPred
 
 IsNotEmptyPredicate::IsNotEmptyPredicate(const TExprNode& node) : Predicate(node) {}
 
-BooleanVal IsNotEmptyPredicate::GetBooleanVal(
+BooleanVal IsNotEmptyPredicate::GetBooleanValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   CollectionVal coll = children_[0]->GetCollectionVal(eval, row);
   if (coll.is_null) return BooleanVal::null();
   return BooleanVal(coll.num_tuples != 0);
 }
 
-Status IsNotEmptyPredicate::Init(const RowDescriptor& row_desc, RuntimeState* state) {
-  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, state));
+Status IsNotEmptyPredicate::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
   DCHECK_EQ(children_.size(), 1);
   return Status::OK();
 }
 
-// Sample IR output (when child is a SlotRef):
-//
-//   define i16 @IsNotEmptyPredicate(%"class.impala::ScalarExprEvaluator"* %eval,
-//     %"class.impala::TupleRow"* %row) #38 {
-//   entry:
+// Sample IR output (when child is a SlotRef):
+// define i16 @IsNotEmptyPredicate(%"class.impala::ScalarExprEvaluator"* %eval,
+//                                 %"class.impala::TupleRow"* %row) #37 {
+// entry:
 //   %0 = alloca i16
 //   %1 = alloca i16
-//   %collection_val = alloca %"struct.impala_udf::CollectionVal"
-//   call void @_ZNK6impala7SlotRef16GetCollectionValEPNS_
-//     19ScalarExprEvaluatorEPKNS_8TupleRowE(%"struct.impala_udf::CollectionVal"*
-//     %collection_val, %"class.impala::SlotRef"* inttoptr (i64 140731643050560
-//     to %"class.impala::SlotRef"*),
-//     %"class.impala::ScalarExprEvaluator"* %eval, %"class.impala::TupleRow"* %row)
-//   %anyval_ptr = getelementptr inbounds %"struct.impala_udf::CollectionVal",
-//     %"struct.impala_udf::CollectionVal"* %collection_val, i32 0, i32 0
-//   %is_null_ptr = getelementptr inbounds %"struct.impala_udf::AnyVal",
-//     %"struct.impala_udf::AnyVal"* %anyval_ptr, i32 0, i32 0
-//   %is_null = load i8, i8* %is_null_ptr
-//   %is_null_bool = icmp ne i8 %is_null, 0
-//   %num_tuples_ptr = getelementptr inbounds %"struct.impala_udf::CollectionVal",
-//     %"struct.impala_udf::CollectionVal"* %collection_val, i32 0, i32 3
-//   %num_tuples = load i32, i32* %num_tuples_ptr
-//   br i1 %is_null_bool, label %ret_null, label %check_count
+//   %coll_val = call { i64, i8* } @GetSlotRef(
+//      %"class.impala::ScalarExprEvaluator"* %eval, %"class.impala::TupleRow"* %row)
+//   %2 = extractvalue { i64, i8* } %coll_val, 0
+//   %coll_is_null = trunc i64 %2 to i1
+//   br i1 %coll_is_null, label %ret_null, label %check_count
 //
 // check_count:                                      ; preds = %entry
-//   %4 = icmp ne i32 %num_tuples, 0
-//   %ret2 = load i16, i16* %0
-//   %5 = zext i1 %4 to i16
-//   %6 = shl i16 %5, 8
-//   %7 = and i16 %ret2, 255
-//   %ret21 = or i16 %7, %6
-//   ret i16 %ret21
+//   %3 = extractvalue { i64, i8* } %coll_val, 0
+//   %4 = ashr i64 %3, 32
+//   %5 = trunc i64 %4 to i32
+//   %6 = icmp ne i32 %5, 0
+//   %has_values_result = load i16, i16* %0
+//   %7 = zext i1 %6 to i16
+//   %8 = shl i16 %7, 8
+//   %9 = and i16 %has_values_result, 255
+//   %has_values_result1 = or i16 %9, %8
+//   ret i16 %has_values_result1
 //
 // ret_null:                                         ; preds = %entry
-//   %ret = load i16, i16* %1
+//   %null_result = load i16, i16* %1
 //   ret i16 1
 // }
-//
-Status IsNotEmptyPredicate::GetCodegendComputeFn(
+Status IsNotEmptyPredicate::GetCodegendComputeFnImpl(
     LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != nullptr) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
   // Create a method with the expected signature.
   llvm::LLVMContext& context = codegen->context();
   llvm::Value* args[2];
@@ -101,64 +88,25 @@ Status IsNotEmptyPredicate::GetCodegendComputeFn(
   LlvmBuilder builder(entry_block);
 
   ScalarExpr* child = children_[0]; // The child node, on which to call GetCollectionVal.
-  llvm::Value* child_expr; //  a Value for 'child' with the correct subtype of ScalarExpr.
-  llvm::Function* get_collection_val_fn; // a type specific GetCollectionVal method
-
-  //  To save a virtual method call, find the type of child and lookup the non-virtual
-  //  GetCollection method to call.
-  if (dynamic_cast<SlotRef*>(child)) {
-    // Lookup SlotRef::GetCollectionVal().
-    get_collection_val_fn =
-        codegen->GetFunction(IRFunction::SCALAR_EXPR_SLOT_REF_GET_COLLECTION_VAL, false);
-    child_expr = codegen->CastPtrToLlvmPtr(
-        codegen->GetNamedPtrType(SlotRef::LLVM_CLASS_NAME), child);
-  } else if (dynamic_cast<NullLiteral*>(child)) {
-    // Lookup NullLiteral::GetCollectionVal().
-    get_collection_val_fn = codegen->GetFunction(
-        IRFunction::SCALAR_EXPR_NULL_LITERAL_GET_COLLECTION_VAL, false);
-    child_expr = codegen->CastPtrToLlvmPtr(
-        codegen->GetNamedPtrType(NullLiteral::LLVM_CLASS_NAME), child);
-  } else {
-    // This may mean someone implemented GetCollectionVal in a new subclass of ScalarExpr.
-    DCHECK(false) << "Unknown GetCollectionVal implementation: " << typeid(*child).name();
-    return Status("Codegen'd IsNotEmptyPredicate function found unknown GetCollectionVal,"
-                  " see log.");
-  }
+  llvm::Function* get_collection_val_fn;
+  RETURN_IF_ERROR(child->GetCodegendComputeFn(codegen, false, &get_collection_val_fn));
   DCHECK(get_collection_val_fn != nullptr);
-  DCHECK(child_expr != nullptr);
 
   // Find type for the CollectionVal struct.
   llvm::Type* collection_type = codegen->GetNamedType("struct.impala_udf::CollectionVal");
   DCHECK(collection_type->isStructTy());
 
-  // Allocate space for the CollectionVal on the stack
-  llvm::Value* collection_val_ptr =
-      codegen->CreateEntryBlockAlloca(builder, collection_type, "collection_val");
-
-  // The get_collection_val_fn returns a CollectionVal by value. In llvm we have to pass a
-  // pointer as the first argument. The second argument is the object on which the call is
-  // made.
-  llvm::Value* get_coll_call_args[] = {collection_val_ptr, child_expr, args[0], args[1]};
-
   // Construct the call to the evaluation method, and return the result.
-  llvm::Value* collection = builder.CreateCall(get_collection_val_fn, get_coll_call_args);
-  DCHECK(collection != nullptr);
+  CodegenAnyVal coll_val = CodegenAnyVal::CreateCallWrapped(codegen, &builder,
+      child->type(), get_collection_val_fn, args, "coll_val");
 
   // Find the 'is_null' field of the CollectionVal.
-  llvm::Value* anyval_ptr =
-      builder.CreateStructGEP(nullptr, collection_val_ptr, 0, "anyval_ptr");
-  llvm::Value* is_null_ptr =
-      builder.CreateStructGEP(nullptr, anyval_ptr, 0, "is_null_ptr");
-  llvm::Value* is_null = builder.CreateLoad(is_null_ptr, "is_null");
-
-  // Check if 'is_null' is true.
-  llvm::Value* is_null_bool =
-      builder.CreateICmpNE(is_null, codegen->GetI8Constant(0), "is_null_bool");
+  llvm::Value* is_null = coll_val.GetIsNull("coll_is_null");
 
   llvm::BasicBlock* check_count =
       llvm::BasicBlock::Create(context, "check_count", new_fn);
   llvm::BasicBlock* ret_null = llvm::BasicBlock::Create(context, "ret_null", new_fn);
-  builder.CreateCondBr(is_null_bool, ret_null, check_count);
+  builder.CreateCondBr(is_null, ret_null, check_count);
 
   // Add code to the block that is executed if is_null was true.
   builder.SetInsertPoint(ret_null);
@@ -168,10 +116,8 @@ Status IsNotEmptyPredicate::GetCodegendComputeFn(
 
   // Back to the branch where 'is_null' is false.
   builder.SetInsertPoint(check_count);
-  // Load the value of 'num_tuples'
-  llvm::Value* num_tuples_ptr =
-      builder.CreateStructGEP(nullptr, collection_val_ptr, 3, "num_tuples_ptr");
-  llvm::Value* num_tuples = builder.CreateLoad(num_tuples_ptr, "num_tuples");
+  // Load the value of 'num_tuples'.
+  llvm::Value* num_tuples = coll_val.GetLen();
 
   llvm::Value* has_values = builder.CreateICmpNE(num_tuples, codegen->GetI32Constant(0));
   CodegenAnyVal has_values_result(
@@ -183,9 +129,6 @@ Status IsNotEmptyPredicate::GetCodegendComputeFn(
   if (UNLIKELY(*fn == nullptr)) {
     return Status(TErrorCode::IR_VERIFY_FAILED, "IsNotEmptyPredicate");
   }
-
-  ir_compute_fn_ = *fn;
-
   return Status::OK();
 }
 
diff --git a/be/src/exprs/is-not-empty-predicate.h b/be/src/exprs/is-not-empty-predicate.h
index bf20350..313252c 100644
--- a/be/src/exprs/is-not-empty-predicate.h
+++ b/be/src/exprs/is-not-empty-predicate.h
@@ -31,14 +31,17 @@ class TExprNode;
 class IsNotEmptyPredicate : public Predicate {
  public:
   static const char* LLVM_CLASS_NAME;
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) override;
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  virtual Status GetCodegendComputeFnImpl(
+      LlvmCodeGen* codegen, llvm::Function** fn) override;
+  virtual BooleanVal GetBooleanValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const override;
   virtual std::string DebugString() const override;
 
  protected:
   friend class ScalarExpr;
 
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state) override;
+  virtual Status Init(
+      const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) override;
   explicit IsNotEmptyPredicate(const TExprNode& node);
 };
 
diff --git a/be/src/exprs/kudu-partition-expr.cc b/be/src/exprs/kudu-partition-expr.cc
index ce38e24..3fa69ef 100644
--- a/be/src/exprs/kudu-partition-expr.cc
+++ b/be/src/exprs/kudu-partition-expr.cc
@@ -33,8 +33,9 @@ namespace impala {
 KuduPartitionExpr::KuduPartitionExpr(const TExprNode& node)
   : ScalarExpr(node), tkudu_partition_expr_(node.kudu_partition_expr) {}
 
-Status KuduPartitionExpr::Init(const RowDescriptor& row_desc, RuntimeState* state) {
-  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, state));
+Status KuduPartitionExpr::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
   DCHECK_EQ(tkudu_partition_expr_.referenced_columns.size(), children_.size());
 
   // Create the KuduPartitioner we'll use to get the partition index for each row.
@@ -59,8 +60,8 @@ Status KuduPartitionExpr::Init(const RowDescriptor& row_desc, RuntimeState* stat
   return Status::OK();
 }
 
-IntVal KuduPartitionExpr::GetIntVal(ScalarExprEvaluator* eval,
-    const TupleRow* row) const {
+IntVal KuduPartitionExpr::GetIntValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
   for (int i = 0; i < children_.size(); ++i) {
     void* val = eval->GetValue(*GetChild(i), row);
     if (val == NULL) {
@@ -88,9 +89,9 @@ IntVal KuduPartitionExpr::GetIntVal(ScalarExprEvaluator* eval,
   return IntVal(kudu_partition);
 }
 
-Status KuduPartitionExpr::GetCodegendComputeFn(
+Status KuduPartitionExpr::GetCodegendComputeFnImpl(
     LlvmCodeGen* codegen, llvm::Function** fn) {
-  return Status("Error: KuduPartitionExpr::GetCodegendComputeFn not implemented.");
+  return GetCodegendComputeFnWrapper(codegen, fn);
 }
 
 } // namespace impala
diff --git a/be/src/exprs/kudu-partition-expr.h b/be/src/exprs/kudu-partition-expr.h
index 90be8d0..35722cc 100644
--- a/be/src/exprs/kudu-partition-expr.h
+++ b/be/src/exprs/kudu-partition-expr.h
@@ -40,13 +40,13 @@ class KuduPartitionExpr : public ScalarExpr {
 
   KuduPartitionExpr(const TExprNode& node);
 
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state)
-      override WARN_UNUSED_RESULT;
+  virtual Status Init(const RowDescriptor& row_desc, bool is_entry_point,
+      RuntimeState* state) override WARN_UNUSED_RESULT;
 
-  virtual IntVal GetIntVal(ScalarExprEvaluator* eval, const TupleRow* row)
-      const override;
+  virtual IntVal GetIntValInterpreted(
+      ScalarExprEvaluator* eval, const TupleRow* row) const override;
 
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
 
  private:
diff --git a/be/src/exprs/literal.cc b/be/src/exprs/literal.cc
index e20c58b..f1bd986 100644
--- a/be/src/exprs/literal.cc
+++ b/be/src/exprs/literal.cc
@@ -228,49 +228,49 @@ Literal::Literal(ColumnType type, const DateValue& v)
   value_.date_val = v;
 }
 
-BooleanVal Literal::GetBooleanVal(
+BooleanVal Literal::GetBooleanValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BOOLEAN) << type_;
   return BooleanVal(value_.bool_val);
 }
 
-TinyIntVal Literal::GetTinyIntVal(
+TinyIntVal Literal::GetTinyIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TINYINT) << type_;
   return TinyIntVal(value_.tinyint_val);
 }
 
-SmallIntVal Literal::GetSmallIntVal(
+SmallIntVal Literal::GetSmallIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_SMALLINT) << type_;
   return SmallIntVal(value_.smallint_val);
 }
 
-IntVal Literal::GetIntVal(
+IntVal Literal::GetIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_INT) << type_;
   return IntVal(value_.int_val);
 }
 
-BigIntVal Literal::GetBigIntVal(
+BigIntVal Literal::GetBigIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BIGINT) << type_;
   return BigIntVal(value_.bigint_val);
 }
 
-FloatVal Literal::GetFloatVal(
+FloatVal Literal::GetFloatValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_FLOAT) << type_;
   return FloatVal(value_.float_val);
 }
 
-DoubleVal Literal::GetDoubleVal(
+DoubleVal Literal::GetDoubleValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DOUBLE) << type_;
   return DoubleVal(value_.double_val);
 }
 
-StringVal Literal::GetStringVal(
+StringVal Literal::GetStringValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK(type_.IsStringType()) << type_;
   StringVal result;
@@ -278,7 +278,7 @@ StringVal Literal::GetStringVal(
   return result;
 }
 
-DecimalVal Literal::GetDecimalVal(
+DecimalVal Literal::GetDecimalValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DECIMAL) << type_;
   switch (type().GetByteSize()) {
@@ -295,7 +295,7 @@ DecimalVal Literal::GetDecimalVal(
   return DecimalVal();
 }
 
-TimestampVal Literal::GetTimestampVal(
+TimestampVal Literal::GetTimestampValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TIMESTAMP) << type_;
   TimestampVal result;
@@ -303,7 +303,7 @@ TimestampVal Literal::GetTimestampVal(
   return result;
 }
 
-DateVal Literal::GetDateVal(
+DateVal Literal::GetDateValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DATE) << type_;
   return value_.date_val.ToDateVal();
@@ -371,16 +371,7 @@ string Literal::DebugString() const {
 // entry:
 //   ret { i8, i64 } { i8 0, i64 10 }
 // }
-Status Literal::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != nullptr) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
-  if (type_.type == TYPE_CHAR) {
-    return Status::Expected("Codegen not supported for CHAR");
-  }
-
+Status Literal::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   DCHECK_EQ(GetNumChildren(), 0);
   llvm::Value* args[2];
   *fn = CreateIrFunctionPrototype("Literal", codegen, &args);
@@ -413,6 +404,7 @@ Status Literal::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
       break;
     case TYPE_STRING:
     case TYPE_VARCHAR:
+    case TYPE_CHAR:
       v.SetLen(builder.getInt32(value_.string_val.len));
       v.SetPtr(codegen->GetStringConstant(
           &builder, value_.string_val.ptr, value_.string_val.len));
@@ -451,7 +443,6 @@ Status Literal::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
   builder.CreateRet(v.GetLoweredValue());
   *fn = codegen->FinalizeFunction(*fn);
   if (UNLIKELY(*fn == nullptr)) return Status(TErrorCode::IR_VERIFY_FAILED, "Literal");
-  ir_compute_fn_ = *fn;
   return Status::OK();
 }
 
diff --git a/be/src/exprs/literal.h b/be/src/exprs/literal.h
index 0180144..e1522ae 100644
--- a/be/src/exprs/literal.h
+++ b/be/src/exprs/literal.h
@@ -47,7 +47,7 @@ class TExprNode;
 class Literal: public ScalarExpr {
  public:
   virtual bool IsLiteral() const override { return true; }
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
@@ -71,19 +71,7 @@ class Literal: public ScalarExpr {
   Literal(ColumnType type, const TimestampValue& v);
   Literal(ColumnType type, const DateValue& v);
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 
  private:
   ExprValue value_;
diff --git a/be/src/exprs/null-literal.cc b/be/src/exprs/null-literal.cc
index 175f61e..c6edc04 100644
--- a/be/src/exprs/null-literal.cc
+++ b/be/src/exprs/null-literal.cc
@@ -31,72 +31,78 @@ namespace impala {
 
 const char* NullLiteral::LLVM_CLASS_NAME = "class.impala::NullLiteral";
 
-BooleanVal NullLiteral::GetBooleanVal(
+BooleanVal NullLiteral::GetBooleanValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BOOLEAN) << type_;
   return BooleanVal::null();
 }
 
-TinyIntVal NullLiteral::GetTinyIntVal(
+TinyIntVal NullLiteral::GetTinyIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TINYINT) << type_;
   return TinyIntVal::null();
 }
 
-SmallIntVal NullLiteral::GetSmallIntVal(
+SmallIntVal NullLiteral::GetSmallIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_SMALLINT) << type_;
   return SmallIntVal::null();
 }
 
-IntVal NullLiteral::GetIntVal(
+IntVal NullLiteral::GetIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_INT) << type_;
   return IntVal::null();
 }
 
-BigIntVal NullLiteral::GetBigIntVal(
+BigIntVal NullLiteral::GetBigIntValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_BIGINT) << type_;
   return BigIntVal::null();
 }
 
-FloatVal NullLiteral::GetFloatVal(
+FloatVal NullLiteral::GetFloatValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_FLOAT) << type_;
   return FloatVal::null();
 }
 
-DoubleVal NullLiteral::GetDoubleVal(
+DoubleVal NullLiteral::GetDoubleValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DOUBLE) << type_;
   return DoubleVal::null();
 }
 
-StringVal NullLiteral::GetStringVal(
+StringVal NullLiteral::GetStringValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK(type_.IsStringType()) << type_;
   return StringVal::null();
 }
 
-TimestampVal NullLiteral::GetTimestampVal(
+TimestampVal NullLiteral::GetTimestampValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TIMESTAMP) << type_;
   return TimestampVal::null();
 }
 
-DecimalVal NullLiteral::GetDecimalVal(
+DecimalVal NullLiteral::GetDecimalValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DECIMAL) << type_;
   return DecimalVal::null();
 }
 
-DateVal NullLiteral::GetDateVal(
+DateVal NullLiteral::GetDateValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DATE) << type_;
   return DateVal::null();
 }
 
+CollectionVal NullLiteral::GetCollectionValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
+  DCHECK(type_.IsCollectionType());
+  return CollectionVal::null();
+}
+
 // Generated IR for a bigint NULL literal:
 //
 // define { i8, i64 } @NullLiteral(
@@ -104,16 +110,7 @@ DateVal NullLiteral::GetDateVal(
 // entry:
 //   ret { i8, i64 } { i8 1, i64 0 }
 // }
-Status NullLiteral::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != nullptr) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
-  if (type_.type == TYPE_CHAR) {
-    return Status::Expected("Codegen not supported for CHAR");
-  }
-
+Status NullLiteral::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   DCHECK_EQ(GetNumChildren(), 0);
   llvm::Value* args[2];
   *fn = CreateIrFunctionPrototype("NullLiteral", codegen, &args);
@@ -127,7 +124,6 @@ Status NullLiteral::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function**
   if (UNLIKELY(*fn == nullptr)) {
     return Status(TErrorCode::IR_VERIFY_FAILED, "NullLiteral");
   }
-  ir_compute_fn_ = *fn;
   return Status::OK();
 }
 
diff --git a/be/src/exprs/null-literal.h b/be/src/exprs/null-literal.h
index 53b7e80..a1e3956 100644
--- a/be/src/exprs/null-literal.h
+++ b/be/src/exprs/null-literal.h
@@ -41,7 +41,7 @@ class TExprNode;
 class NullLiteral: public ScalarExpr {
  public:
   virtual bool IsLiteral() const override { return true; }
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
@@ -56,21 +56,9 @@ class NullLiteral: public ScalarExpr {
 
   NullLiteral(const TExprNode& node) : ScalarExpr(node) { }
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
+  virtual CollectionVal GetCollectionValInterpreted(
       ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual CollectionVal GetCollectionVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
 };
 
 }
diff --git a/be/src/exprs/scalar-expr-evaluator.cc b/be/src/exprs/scalar-expr-evaluator.cc
index 149c6bf..40c971d 100644
--- a/be/src/exprs/scalar-expr-evaluator.cc
+++ b/be/src/exprs/scalar-expr-evaluator.cc
@@ -21,9 +21,8 @@
 
 #include "common/object-pool.h"
 #include "common/status.h"
-#include "exprs/anyval-util.h"
-#include "exprs/scalar-expr.h"
 #include "exprs/aggregate-functions.h"
+#include "exprs/anyval-util.h"
 #include "exprs/bit-byte-functions.h"
 #include "exprs/case-expr.h"
 #include "exprs/cast-functions.h"
@@ -41,6 +40,7 @@
 #include "exprs/null-literal.h"
 #include "exprs/operators.h"
 #include "exprs/scalar-expr-evaluator.h"
+#include "exprs/scalar-expr.inline.h"
 #include "exprs/scalar-fn-call.h"
 #include "exprs/slot-ref.h"
 #include "exprs/string-functions.h"
diff --git a/be/src/exprs/scalar-expr-ir.cc b/be/src/exprs/scalar-expr-ir.cc
index 8c698b1..97bc404 100644
--- a/be/src/exprs/scalar-expr-ir.cc
+++ b/be/src/exprs/scalar-expr-ir.cc
@@ -15,8 +15,8 @@
 // specific language governing permissions and limitations
 // under the License.
 
-#include "exprs/scalar-expr.h"
-#include "udf/udf.h"
+#include "exprs/scalar-expr.inline.h"
+#include "udf/udf-internal.h"
 
 #ifdef IR_COMPILE
 
@@ -30,7 +30,7 @@ void dummy(impala_udf::FunctionContext*, impala_udf::BooleanVal*, impala_udf::Ti
     impala_udf::SmallIntVal*, impala_udf::IntVal*, impala_udf::BigIntVal*,
     impala_udf::FloatVal*, impala_udf::DoubleVal*, impala_udf::StringVal*,
     impala_udf::TimestampVal*, impala_udf::DecimalVal*, impala_udf::DateVal*,
-    impala::ScalarExprEvaluator*) { }
+    impala_udf::CollectionVal*, impala::ScalarExprEvaluator*) { }
 #endif
 
 /// The following are compute functions that are cross-compiled to both native and IR
@@ -41,59 +41,59 @@ void dummy(impala_udf::FunctionContext*, impala_udf::BooleanVal*, impala_udf::Ti
 using namespace impala;
 using namespace impala_udf;
 
-/// Static wrappers around Get*Val() functions. We'd like to be able to call these from
-/// directly from native code as well as from generated IR functions.
-BooleanVal ScalarExpr::GetBooleanVal(
+/// Static wrappers around Get*ValInterpreted() functions. We'd like to be able to call
+/// these directly from native code as well as from generated IR functions.
+BooleanVal ScalarExpr::GetBooleanValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetBooleanVal(eval, row);
+  return expr->GetBooleanValInterpreted(eval, row);
 }
 
-TinyIntVal ScalarExpr::GetTinyIntVal(
+TinyIntVal ScalarExpr::GetTinyIntValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetTinyIntVal(eval, row);
+  return expr->GetTinyIntValInterpreted(eval, row);
 }
 
-SmallIntVal ScalarExpr::GetSmallIntVal(
+SmallIntVal ScalarExpr::GetSmallIntValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetSmallIntVal(eval, row);
+  return expr->GetSmallIntValInterpreted(eval, row);
 }
 
-IntVal ScalarExpr::GetIntVal(
+IntVal ScalarExpr::GetIntValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetIntVal(eval, row);
+  return expr->GetIntValInterpreted(eval, row);
 }
 
-BigIntVal ScalarExpr::GetBigIntVal(
+BigIntVal ScalarExpr::GetBigIntValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetBigIntVal(eval, row);
+  return expr->GetBigIntValInterpreted(eval, row);
 }
 
-FloatVal ScalarExpr::GetFloatVal(
+FloatVal ScalarExpr::GetFloatValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetFloatVal(eval, row);
+  return expr->GetFloatValInterpreted(eval, row);
 }
 
-DoubleVal ScalarExpr::GetDoubleVal(
+DoubleVal ScalarExpr::GetDoubleValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetDoubleVal(eval, row);
+  return expr->GetDoubleValInterpreted(eval, row);
 }
 
-StringVal ScalarExpr::GetStringVal(
+StringVal ScalarExpr::GetStringValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetStringVal(eval, row);
+  return expr->GetStringValInterpreted(eval, row);
 }
 
-TimestampVal ScalarExpr::GetTimestampVal(
+TimestampVal ScalarExpr::GetTimestampValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetTimestampVal(eval, row);
+  return expr->GetTimestampValInterpreted(eval, row);
 }
 
-DecimalVal ScalarExpr::GetDecimalVal(
+DecimalVal ScalarExpr::GetDecimalValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetDecimalVal(eval, row);
+  return expr->GetDecimalValInterpreted(eval, row);
 }
 
-DateVal ScalarExpr::GetDateVal(
+DateVal ScalarExpr::GetDateValInterpreted(
     ScalarExpr* expr, ScalarExprEvaluator* eval, const TupleRow* row) {
-  return expr->GetDateVal(eval, row);
+  return expr->GetDateValInterpreted(eval, row);
 }
diff --git a/be/src/exprs/scalar-expr.cc b/be/src/exprs/scalar-expr.cc
index f272992..ba5ccc8 100644
--- a/be/src/exprs/scalar-expr.cc
+++ b/be/src/exprs/scalar-expr.cc
@@ -15,7 +15,7 @@
 // specific language governing permissions and limitations
 // under the License.
 
-#include "exprs/scalar-expr.h"
+#include "exprs/scalar-expr.inline.h"
 
 #include <sstream>
 #include <thrift/protocol/TDebugProtocol.h>
@@ -48,6 +48,7 @@
 #include "runtime/runtime-state.h"
 #include "runtime/tuple-row.h"
 #include "runtime/tuple.h"
+#include "runtime/types.h"
 #include "udf/udf-internal.h"
 #include "udf/udf.h"
 
@@ -79,7 +80,12 @@ Status ScalarExpr::Create(const TExpr& texpr, const RowDescriptor& row_desc,
   ScalarExpr* root;
   RETURN_IF_ERROR(CreateNode(texpr.nodes[0], pool, &root));
   RETURN_IF_ERROR(Expr::CreateTree(texpr, pool, root));
-  Status status = root->Init(row_desc, state);
+  // Assume that the root is a potential entry point for interpreted callers.
+  // This is not always true but would require some work to determine for
+  // each of the callsites of Create().
+  // TODO: fix this - reducing the number of entry points would reduce codegen overhead
+  // somewhat.
+  Status status = root->Init(row_desc, /*is_entry_point*/ true, state);
   if (UNLIKELY(!status.ok())) {
     root->Close();
     return status;
@@ -275,10 +281,18 @@ int ScalarExpr::ComputeResultsLayout(const vector<ScalarExpr*>& exprs,
   return byte_offset;
 }
 
-Status ScalarExpr::Init(const RowDescriptor& row_desc, RuntimeState* state) {
+Status ScalarExpr::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
   DCHECK(type_.type != INVALID_TYPE);
   for (int i = 0; i < children_.size(); ++i) {
-    RETURN_IF_ERROR(children_[i]->Init(row_desc, state));
+    RETURN_IF_ERROR(children_[i]->Init(row_desc, false, state));
+  }
+  // Add the expression to the list of expressions to codegen in the codegen phase.
+  if (ShouldCodegen(state)) {
+    // If the expression is not interpretable, we need an entry point to evaluate
+    // the expression from interpreted code, e.g. GetConstValue().
+    bool is_codegen_entry_point = is_entry_point || !IsInterpretable();
+    state->AddScalarExprToCodegen(this, is_codegen_entry_point);
   }
   return Status::OK();
 }
@@ -303,6 +317,17 @@ string ScalarExpr::DebugString(const vector<ScalarExpr*>& exprs) {
   return out.str();
 }
 
+bool ScalarExpr::ShouldCodegen(const RuntimeState* state) const {
+  // Use the interpreted path and call the builtin without codegen if any of the
+  // following is true:
+  // 1. The expression does not have an associated RuntimeState, e.g. is a partition
+  //    key expression in a descriptor table.
+  // 2. codegen is disabled by query option.
+  // 3. there is an optimization hint to disable codegen and the expr can be interpreted.
+  return state != nullptr && !state->CodegenDisabledByQueryOption()
+      && !(state->CodegenHasDisableHint() && IsInterpretable());
+}
+
 int ScalarExpr::GetSlotIds(vector<SlotId>* slot_ids) const {
   int n = 0;
   for (int i = 0; i < children_.size(); ++i) {
@@ -358,106 +383,73 @@ llvm::Function* ScalarExpr::CreateIrFunctionPrototype(
   return function;
 }
 
-Status ScalarExpr::GetCodegendComputeFnWrapper(
-    LlvmCodeGen* codegen, llvm::Function** fn) {
+Status ScalarExpr::GetCodegendComputeFn(
+    LlvmCodeGen* codegen, bool is_codegen_entry_point, llvm::Function** fn) {
   if (ir_compute_fn_ != nullptr) {
     *fn = ir_compute_fn_;
-    return Status::OK();
+  } else {
+    RETURN_IF_ERROR(GetCodegendComputeFnImpl(codegen, fn));
+    ir_compute_fn_ = *fn;
   }
+  if (is_codegen_entry_point && !added_to_jit_) {
+    // Ensure Get*Val() is made callable if this function is called at least once
+    // with is_codegen_entry_point=true.
+    added_to_jit_ = true;
+    codegen->AddFunctionToJit(*fn, &codegend_compute_fn_);
+  }
+  return Status::OK();
+}
+
+Status ScalarExpr::GetCodegendComputeFnWrapper(
+    LlvmCodeGen* codegen, llvm::Function** fn) {
+  for (ScalarExpr* expr : children_) {
+    llvm::Function* dummy;
+    // The codegen'd function will call expr->Get*Val(). Ensure that the child expr
+    // is a codegen entry point so that expr->Get*Val() uses the fast codegen'd path.
+    RETURN_IF_ERROR(expr->GetCodegendComputeFn(codegen, true, &dummy));
+  }
+
   llvm::Function* static_getval_fn = GetStaticGetValWrapper(type(), codegen);
 
   // Call it passing this as the additional first argument.
   llvm::Value* args[2];
-  ir_compute_fn_ = CreateIrFunctionPrototype("CodegenComputeFnWrapper", codegen, &args);
+  *fn = CreateIrFunctionPrototype("CodegenComputeFnWrapper", codegen, &args);
   llvm::BasicBlock* entry_block =
-      llvm::BasicBlock::Create(codegen->context(), "entry", ir_compute_fn_);
+      llvm::BasicBlock::Create(codegen->context(), "entry", *fn);
   LlvmBuilder builder(entry_block);
-  llvm::Value* this_ptr = codegen->CastPtrToLlvmPtr(
-      codegen->GetStructPtrType<ScalarExpr>(), this);
+  llvm::Value* this_ptr =
+      codegen->CastPtrToLlvmPtr(codegen->GetStructPtrType<ScalarExpr>(), this);
   llvm::Value* compute_fn_args[] = {this_ptr, args[0], args[1]};
   llvm::Value* ret = CodegenAnyVal::CreateCall(
       codegen, &builder, static_getval_fn, compute_fn_args, "ret");
   builder.CreateRet(ret);
-  *fn = codegen->FinalizeFunction(ir_compute_fn_);
+  *fn = codegen->FinalizeFunction(*fn);
   if (UNLIKELY(*fn == nullptr)) {
     return Status(TErrorCode::IR_VERIFY_FAILED, "CodegendComputeFnWrapper");
   }
-  ir_compute_fn_ = *fn;
   return Status::OK();
 }
 
-// At least one of these should always be overridden.
-BooleanVal ScalarExpr::GetBooleanVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return BooleanVal::null();
-}
-
-TinyIntVal ScalarExpr::GetTinyIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return TinyIntVal::null();
-}
-
-SmallIntVal ScalarExpr::GetSmallIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return SmallIntVal::null();
-}
-
-IntVal ScalarExpr::GetIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return IntVal::null();
-}
-
-BigIntVal ScalarExpr::GetBigIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return BigIntVal::null();
-}
-
-FloatVal ScalarExpr::GetFloatVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return FloatVal::null();
-}
-
-DoubleVal ScalarExpr::GetDoubleVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return DoubleVal::null();
-}
-
-StringVal ScalarExpr::GetStringVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return StringVal::null();
-}
-
-CollectionVal ScalarExpr::GetCollectionVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return CollectionVal::null();
-}
-
-TimestampVal ScalarExpr::GetTimestampVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return TimestampVal::null();
-}
-
-DecimalVal ScalarExpr::GetDecimalVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return DecimalVal::null();
-}
+#define SCALAR_EXPR_GET_VAL_INTERPRETED(type)                 \
+  type ScalarExpr::Get##type##Interpreted(                    \
+      ScalarExprEvaluator* eval, const TupleRow* row) const { \
+    DCHECK(false) << DebugString();                           \
+    return type::null();                                      \
+  }
 
-DateVal ScalarExpr::GetDateVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(false) << DebugString();
-  return DateVal::null();
-}
+// At least one of these should always be overridden.
+SCALAR_EXPR_GET_VAL_INTERPRETED(BooleanVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(TinyIntVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(SmallIntVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(IntVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(BigIntVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(FloatVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(DoubleVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(StringVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(TimestampVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(DecimalVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(DateVal);
+SCALAR_EXPR_GET_VAL_INTERPRETED(CollectionVal);
 
 string ScalarExpr::DebugString(const string& expr_name) const {
   stringstream out;
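
Note on the scalar-expr.cc changes above: the decision made in ScalarExpr::Init() and ScalarExpr::ShouldCodegen() can be restated as two small predicates. The following is a minimal standalone sketch, not Impala code. The CodegenInputs struct and the free functions are hypothetical stand-ins that flatten the RuntimeState and ScalarExpr properties the patch consults.

```cpp
#include <cassert>

// Hypothetical flattened view of the inputs the patch reads from RuntimeState
// and from the ScalarExpr node itself. Not an Impala type.
struct CodegenInputs {
  bool has_runtime_state;         // state != nullptr
  bool disabled_by_query_option;  // state->CodegenDisabledByQueryOption()
  bool has_disable_hint;          // state->CodegenHasDisableHint()
  bool is_interpretable;          // IsInterpretable()
};

// Mirrors the boolean expression in ScalarExpr::ShouldCodegen().
bool ShouldCodegen(const CodegenInputs& in) {
  return in.has_runtime_state && !in.disabled_by_query_option
      && !(in.has_disable_hint && in.is_interpretable);
}

// Mirrors the entry-point decision in ScalarExpr::Init(): an expr that cannot
// be interpreted always needs an entry point so interpreted callers can reach it.
bool IsCodegenEntryPoint(const CodegenInputs& in, bool is_entry_point) {
  return ShouldCodegen(in) && (is_entry_point || !in.is_interpretable);
}
```

Under these assumptions, a non-root expr that is not interpretable still becomes an entry point, which matches the comment in Init() about GetConstValue().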
diff --git a/be/src/exprs/scalar-expr.h b/be/src/exprs/scalar-expr.h
index a6a55b8..64b56aa 100644
--- a/be/src/exprs/scalar-expr.h
+++ b/be/src/exprs/scalar-expr.h
@@ -85,8 +85,10 @@ class TupleRow;
 /// necessary on the arguments to generate the result. These compute functions have
 /// signature Get*Val(ScalarExprEvaluator*, const TupleRow*). One is implemented for each
 /// possible return type it supports (e.g. GetBooleanVal(), GetStringVal(), etc). The
-/// return type is a subclass of AnyVal (e.g. StringVal). One or more of these compute
-/// functions must be overridden by subclasses of ScalarExpr.
+/// return type is a subclass of AnyVal (e.g. StringVal). Get*Val() dispatches to either
+/// a codegen'd function pointer or to an interpreted implementation,
+/// Get*ValInterpreted(). These interpreted functions must be overridden by subclasses
+/// of ScalarExpr for every type that they may return.
 ///
 /// ScalarExpr contains query compile-time information about an expression (e.g.
 /// sub-expressions implicitly encoded in the tree structure) and the LLVM IR compute
@@ -96,14 +98,25 @@ class TupleRow;
 /// ScalarExpr's compute functions are codegend to replace calls to the generic compute
 /// function of child expressions with the exact compute functions based on the return
 /// types of the child expressions known at runtime. Subclasses should override
-/// GetCodegendComputeFn() to either generate custom IR compute functions using IRBuilder,
-/// which inline calls to child expressions' compute functions, or simply call
+/// GetCodegendComputeFnImpl() to either generate custom IR compute functions using
+/// IRBuilder, which inlines calls to child expressions' compute functions, or simply call
 /// GetCodegendComputeFnWrapper() to generate a wrapper function to call the interpreted
 /// compute function. Note that we do not need a separate GetCodegendComputeFn() for each
 /// type.
 ///
-/// TODO: Fix subclasses which call GetCodegendComputeFnWrapper() to not call interpreted
-/// functions.
+/// The two main usage patterns for ScalarExpr are:
+/// * The codegen'd expressions are called from other codegen'd functions, e.g. from a
+///   codegen'd join implementation
+/// * Get*Val() is called on the root of each expression subtree by interpreted code
+///   (e.g. an operator which doesn't support codegen yet).
+/// We can optimize for the second usage pattern by filling in the codegen'd function
+/// pointer (codegend_compute_fn_) in the root of each ScalarExpr tree. Individual
+/// callsites can disable this optimisation if it's not needed. Expr subtrees can be
+/// evaluated (e.g. by ScalarExprEvaluator::GetConstValue()) but may fall back to a
+/// slower interpreted implementation.
+///
+/// TODO: remove GetCodegendComputeFnWrapper(), which is a hack to enable "codegen"
+/// by generating LLVM functions that actually call into the interpreted path.
 ///
 class ScalarExpr : public Expr {
  public:
@@ -148,11 +161,21 @@ class ScalarExpr : public Expr {
 
   /// Returns an llvm::Function* with signature:
   /// <subclass of AnyVal> ComputeFn(ScalarExprEvaluator*, const TupleRow*)
-  //
+  ///
   /// The function should evaluate this expr over 'row' and return the result as the
-  /// appropriate type of AnyVal. Returns error status on failure.
-  virtual Status GetCodegendComputeFn(
-      LlvmCodeGen* codegen, llvm::Function** fn) WARN_UNUSED_RESULT = 0;
+  /// appropriate type of AnyVal. If 'is_codegen_entry_point' is true, then the
+  /// appropriate setup is performed to make the Get*Val() method on this expr call into
+  /// a codegen'd implementation of the expression tree. If this expr is only called from
+  /// codegen'd code via 'fn', 'is_codegen_entry_point' should be false to reduce the
+  /// number of entry points into codegen'd code and therefore the overhead of
+  /// compilation. Returns error status on failure.
+  ///
+  /// This function is invoked either by other codegen functions (e.g. the codegen code
+  /// of a join) or by RuntimeState::CodegenScalarExprs() which is called from
+  /// FragmentInstanceState::Open() before LLVM compilation. These two call sites
+  /// correspond to the two usage patterns in the class comment.
+  Status GetCodegendComputeFn(LlvmCodeGen* codegen, bool is_codegen_entry_point,
+      llvm::Function** fn);
 
   /// Simple debug string that provides no expr subclass-specific information
   virtual std::string DebugString() const;
@@ -210,9 +233,6 @@ class ScalarExpr : public Expr {
   friend class ExprCodegenTest;
   friend class HashTableTest;
 
-  /// Cached LLVM IR for the compute function. Set this in GetCodegendComputeFn().
-  llvm::Function* ir_compute_fn_ = nullptr;
-
   /// Assigns indices into the FunctionContext vector 'fn_ctxs_' in an evaluator to
   /// nodes which need FunctionContext in the tree. 'next_fn_ctx_idx' is the index
   /// of the next available entry in the vector. It's updated as this function is
@@ -229,28 +249,58 @@ class ScalarExpr : public Expr {
   ScalarExpr(const ColumnType& type, bool is_constant);
   ScalarExpr(const TExprNode& node);
 
+  /// Implementation of GetCodegendComputeFn() to be overridden by each subclass of
+  /// ScalarExpr.
+  virtual Status GetCodegendComputeFnImpl(
+      LlvmCodeGen* codegen, llvm::Function** fn) WARN_UNUSED_RESULT = 0;
+
+  /// Entry points for ScalarExprEvaluator when interpreting this ScalarExpr. These
+  /// dispatch to the codegen'd function pointer if present, or otherwise use the
+  /// Get*ValInterpreted() implementation.
+  /// These functions should be called by other ScalarExprs and ScalarExprEvaluator only.
+  BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const;
+  TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const;
+  SmallIntVal GetSmallIntVal(ScalarExprEvaluator*, const TupleRow*) const;
+  IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const;
+  BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const;
+  FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const;
+  DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const;
+  StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const;
+  CollectionVal GetCollectionVal(ScalarExprEvaluator*, const TupleRow*) const;
+  TimestampVal GetTimestampVal(ScalarExprEvaluator*, const TupleRow*) const;
+  DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const;
+  DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const;
+
   /// Virtual compute functions for each return type. Each subclass should override
   /// the functions for the return type(s) it supports. For example, a boolean function
-  /// will only override GetBooleanVal(). Some Exprs, like Literal, have many possible
-  /// return types and will override multiple Get*Val() functions. These functions should
-  /// be called by other ScalarExpr and ScalarExprEvaluator only.
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual SmallIntVal GetSmallIntVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual CollectionVal GetCollectionVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual TimestampVal GetTimestampVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const;
-
-  /// Initializes all nodes in the expr tree. Subclasses overriding this function should
-  /// call ScalarExpr::Init() to recursively call Init() on the expr tree.
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state)
-      WARN_UNUSED_RESULT;
+  /// will only override GetBooleanValInterpreted(). Some Exprs, like Literal, have many
+  /// possible return types and will override multiple Get*ValInterpreted() functions.
+  virtual BooleanVal GetBooleanValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual TinyIntVal GetTinyIntValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual SmallIntVal GetSmallIntValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual IntVal GetIntValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual BigIntVal GetBigIntValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual FloatVal GetFloatValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual DoubleVal GetDoubleValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual StringVal GetStringValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+  virtual CollectionVal GetCollectionValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual TimestampVal GetTimestampValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual DecimalVal GetDecimalValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const;
+  virtual DateVal GetDateValInterpreted(ScalarExprEvaluator*, const TupleRow*) const;
+
+  /// Initializes all nodes in the expr tree. Subclasses overriding this function must
+  /// call ScalarExpr::Init(). If 'is_entry_point' is true, this indicates that Get*Val()
+  /// may be called directly from interpreted code and that we should generate an entry
+  /// point into the codegen'd code. Currently we assume all roots of ScalarExpr trees
+  /// are potential entry points.
+  virtual Status Init(const RowDescriptor& row_desc, bool is_entry_point,
+      RuntimeState* state) WARN_UNUSED_RESULT;
 
   /// Initializes 'eval' for execution. If scope if FRAGMENT_LOCAL, both
   /// fragment-local and thread-local states should be initialized. If scope is
@@ -298,9 +348,21 @@ class ScalarExpr : public Expr {
       WARN_UNUSED_RESULT;
 
   /// Helper function for GetCodegendComputeFnWrapper(). Returns the cross-compiled IR
-  /// function of the static Get*Val wrapper function for return type 'type'.
+  /// function of the static Get*Val wrapper function for return type 'type'. Must not
+  /// be called for a type that doesn't have a wrapper function in the built-in
+  /// IR module. Never returns NULL.
   llvm::Function* GetStaticGetValWrapper(ColumnType type, LlvmCodeGen* codegen);
 
+ protected:
+  /// Return true if we should codegen this expression node, based on query options
+  /// and the properties of this ScalarExpr node.
+  bool ShouldCodegen(const RuntimeState* state) const;
+
+  /// Return true if it is possible to evaluate this expression node without codegen.
+  /// The vast majority of exprs support interpretation, so default to true. Expr
+  /// subclasses that are not always interpretable must override this function.
+  virtual bool IsInterpretable() const { return true; }
+
  private:
   /// 'fn_ctx_idx_' is the index into the FunctionContext vector in ScalarExprEvaluator
   /// for storing FunctionContext needed to evaluate this ScalarExprNode. It's -1 if this
@@ -318,22 +380,74 @@ class ScalarExpr : public Expr {
   /// * This expr is a constant literal created in the backend.
   const bool is_constant_;
 
+  /// Cached LLVM IR for the compute function. Set in GetCodegendComputeFn().
+  llvm::Function* ir_compute_fn_ = nullptr;
+
+  /// Function pointer to the JIT'd function produced by GetCodegendComputeFn().
+  /// Has signature *Val (ScalarExprEvaluator*, const TupleRow*), and calls the scalar
+  /// function with signature like *Val (FunctionContext*, const *Val& arg1, ...)
+  /// Non-NULL if this expr is codegen'd and the constructor of this Expr requested
+  /// that this expr should be an entry point from interpreted to codegen'd code.
+  /// (see class comment for explanation of usage patterns and motivation).
+  void* codegend_compute_fn_ = nullptr;
+
+  /// True if 'codegend_compute_fn_' is registered with LlvmCodeGen as an entry point
+  /// for codegen to fill in. If this is true, then 'codegend_compute_fn_' will be set
+  /// to the JIT'd function produced by GetCodegendComputeFn() after LLVM compilation.
+  /// Set in GetCodegendComputeFn().
+  bool added_to_jit_ = false;
+
   /// Static wrappers which call the compute function of the given ScalarExpr, passing
   /// it the ScalarExprEvaluator and TupleRow. These are cross-compiled and called by
-  /// the IR wrapper functions generated by GetCodegendComputeFnWrapper().
-  static BooleanVal GetBooleanVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static TinyIntVal GetTinyIntVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static SmallIntVal GetSmallIntVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static IntVal GetIntVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static BigIntVal GetBigIntVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static FloatVal GetFloatVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static DoubleVal GetDoubleVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static StringVal GetStringVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static TimestampVal GetTimestampVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static DecimalVal GetDecimalVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
-  static DateVal GetDateVal(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  /// the IR wrapper functions generated by GetCodegendComputeFnWrapper(). The wrapper
+  /// functions avoid the need for codegen'd callers to do a virtual function call on
+  /// the ScalarExpr.
+  /// Note: the ScalarExpr subclass is known at codegen time, so codegen *could* replace
+  /// the virtual function call with a direct function call. However, it would be better
+  /// to focus on removing GetCodegendComputeFnWrapper() instead of micro-optimising this
+  /// inefficient mechanism.
+  static BooleanVal GetBooleanValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static TinyIntVal GetTinyIntValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static SmallIntVal GetSmallIntValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static IntVal GetIntValInterpreted(ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static BigIntVal GetBigIntValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static FloatVal GetFloatValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static DoubleVal GetDoubleValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static StringVal GetStringValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static TimestampVal GetTimestampValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static DecimalVal GetDecimalValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
+  static DateVal GetDateValInterpreted(
+      ScalarExpr*, ScalarExprEvaluator*, const TupleRow*);
 };
 
-}
+// Helper to generate the declaration for an override of Get*ValInterpreted()
+#define GET_VAL_INTERPRETED_OVERRIDE(val_type)                                       \
+  virtual val_type Get##val_type##Interpreted(ScalarExprEvaluator*, const TupleRow*) \
+      const override;
+
+// Helper to stamp out Get*ValInterpreted() declarations for all scalar types.
+#define GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES \
+  GET_VAL_INTERPRETED_OVERRIDE(BooleanVal)                          \
+  GET_VAL_INTERPRETED_OVERRIDE(TinyIntVal)                          \
+  GET_VAL_INTERPRETED_OVERRIDE(SmallIntVal)                         \
+  GET_VAL_INTERPRETED_OVERRIDE(IntVal)                              \
+  GET_VAL_INTERPRETED_OVERRIDE(BigIntVal)                           \
+  GET_VAL_INTERPRETED_OVERRIDE(FloatVal)                            \
+  GET_VAL_INTERPRETED_OVERRIDE(DoubleVal)                           \
+  GET_VAL_INTERPRETED_OVERRIDE(StringVal)                           \
+  GET_VAL_INTERPRETED_OVERRIDE(TimestampVal)                        \
+  GET_VAL_INTERPRETED_OVERRIDE(DecimalVal)                          \
+  GET_VAL_INTERPRETED_OVERRIDE(DateVal)
+
+} // namespace impala
 
 #endif
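
Note on ir_compute_fn_ and added_to_jit_ above: GetCodegendComputeFn() builds the IR function at most once and registers it with the JIT at most once, the first time a caller passes is_codegen_entry_point=true. The following is a minimal standalone sketch of that memoize-and-register-once pattern. FakeCodegen and Node are hypothetical stand-ins, not LlvmCodeGen or ScalarExpr, and the "function" is reduced to a name.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for LlvmCodeGen: counts how many functions were built
// and records which ones were registered for JIT compilation.
struct FakeCodegen {
  int functions_built = 0;
  std::vector<std::string> jit_registrations;
};

class Node {
 public:
  explicit Node(std::string name) : name_(std::move(name)) {}

  // Builds the function on first use and caches it. Registers it with the JIT
  // only once, and only if some caller asks for an entry point.
  const std::string& GetComputeFn(FakeCodegen* codegen, bool is_entry_point) {
    if (!built_) {
      ++codegen->functions_built;  // stands in for GetCodegendComputeFnImpl()
      built_ = true;
    }
    if (is_entry_point && !added_to_jit_) {
      added_to_jit_ = true;
      codegen->jit_registrations.push_back(name_);  // stands in for AddFunctionToJit()
    }
    return name_;
  }

 private:
  std::string name_;
  bool built_ = false;         // plays the role of ir_compute_fn_ != nullptr
  bool added_to_jit_ = false;  // same meaning as in the patch
};
```

In this sketch, calling GetComputeFn() first with false and later with true builds once and registers once, which is the behaviour the comment in GetCodegendComputeFn() describes.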
diff --git a/be/src/exprs/scalar-expr.inline.h b/be/src/exprs/scalar-expr.inline.h
new file mode 100644
index 0000000..ff17d2d
--- /dev/null
+++ b/be/src/exprs/scalar-expr.inline.h
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "exprs/scalar-expr.h"
+
+namespace impala {
+
+/// Macro to generate implementations for the below functions. 'val_type' is
+/// a UDF type name, e.g. IntVal, and 'type_validation' is a DCHECK expression
+/// referencing 'type_' to assert that the function is only called on expressions
+/// of the appropriate type.
+/// * ScalarExpr::GetBooleanVal()
+/// * ScalarExpr::GetTinyIntVal()
+/// * ScalarExpr::GetSmallIntVal()
+/// * ScalarExpr::GetIntVal()
+/// * ScalarExpr::GetBigIntVal()
+/// * ScalarExpr::GetFloatVal()
+/// * ScalarExpr::GetDoubleVal()
+/// * ScalarExpr::GetTimestampVal()
+/// * ScalarExpr::GetDecimalVal()
+/// * ScalarExpr::GetStringVal()
+/// * ScalarExpr::GetDateVal()
+/// * ScalarExpr::GetCollectionVal()
+#pragma push_macro("SCALAR_EXPR_GET_VAL")
+#define SCALAR_EXPR_GET_VAL(val_type, type_validation)                                 \
+  typedef val_type (*val_type##Wrapper)(ScalarExprEvaluator*, const TupleRow*);        \
+  inline val_type ScalarExpr::Get##val_type(                                           \
+      ScalarExprEvaluator* eval, const TupleRow* row) const {                          \
+    DCHECK(type_validation) << type_.DebugString();                                    \
+    DCHECK(eval != nullptr);                                                           \
+    if (codegend_compute_fn_ == nullptr) return Get##val_type##Interpreted(eval, row); \
+    val_type##Wrapper fn = reinterpret_cast<val_type##Wrapper>(codegend_compute_fn_);  \
+    return fn(eval, row);                                                              \
+  }
+
+SCALAR_EXPR_GET_VAL(BooleanVal, type_.type == PrimitiveType::TYPE_BOOLEAN);
+SCALAR_EXPR_GET_VAL(TinyIntVal, type_.type == PrimitiveType::TYPE_TINYINT);
+SCALAR_EXPR_GET_VAL(SmallIntVal, type_.type == PrimitiveType::TYPE_SMALLINT);
+SCALAR_EXPR_GET_VAL(IntVal, type_.type == PrimitiveType::TYPE_INT);
+SCALAR_EXPR_GET_VAL(BigIntVal, type_.type == PrimitiveType::TYPE_BIGINT);
+SCALAR_EXPR_GET_VAL(FloatVal, type_.type == PrimitiveType::TYPE_FLOAT);
+SCALAR_EXPR_GET_VAL(DoubleVal, type_.type == PrimitiveType::TYPE_DOUBLE);
+SCALAR_EXPR_GET_VAL(TimestampVal, type_.type == PrimitiveType::TYPE_TIMESTAMP);
+SCALAR_EXPR_GET_VAL(DecimalVal, type_.type == PrimitiveType::TYPE_DECIMAL);
+SCALAR_EXPR_GET_VAL(StringVal, type_.IsStringType()
+    || type_.type == PrimitiveType::TYPE_FIXED_UDA_INTERMEDIATE);
+SCALAR_EXPR_GET_VAL(DateVal, type_.type == PrimitiveType::TYPE_DATE);
+SCALAR_EXPR_GET_VAL(CollectionVal, type_.IsCollectionType());
+#pragma pop_macro("SCALAR_EXPR_GET_VAL")
+
+}
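
Note on SCALAR_EXPR_GET_VAL above: each generated Get*Val() is a non-virtual inline function that calls the JIT'd function pointer when one is present and otherwise falls back to the virtual Get*ValInterpreted(). The following is a minimal standalone sketch of that dispatch for a single type. Evaluator, Row, IntVal, Expr, and PlusOne are hypothetical stand-ins rather than the Impala declarations, and the "compiled" function is an ordinary C++ function set by hand instead of LLVM output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for ScalarExprEvaluator, TupleRow and impala_udf::IntVal.
struct Evaluator {};
struct Row { int32_t value; };
struct IntVal {
  bool is_null;
  int32_t val;
};

class Expr {
 public:
  virtual ~Expr() = default;

  // Non-virtual entry point: use the compiled pointer when present, otherwise
  // fall back to the virtual interpreted implementation.
  IntVal GetIntVal(Evaluator* eval, const Row* row) const {
    if (compiled_fn_ == nullptr) return GetIntValInterpreted(eval, row);
    using Fn = IntVal (*)(Evaluator*, const Row*);
    return reinterpret_cast<Fn>(compiled_fn_)(eval, row);
  }

  // In the patch this pointer is filled in after LLVM compilation; here it is
  // set by hand for illustration.
  void set_compiled_fn(void* fn) { compiled_fn_ = fn; }

 protected:
  virtual IntVal GetIntValInterpreted(Evaluator* eval, const Row* row) const = 0;

 private:
  void* compiled_fn_ = nullptr;
};

class PlusOne : public Expr {
 protected:
  IntVal GetIntValInterpreted(Evaluator*, const Row* row) const override {
    return IntVal{false, row->value + 1};
  }
};

// Stands in for a JIT'd function. It deliberately returns a different value
// so the two code paths can be told apart.
IntVal CompiledPlusOne(Evaluator*, const Row* row) {
  return IntVal{false, row->value + 100};
}
```

The null check costs one predictable branch per call, and callers no longer go through a virtual call just to reach the function pointer, which is the motivation given in the commit message.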
diff --git a/be/src/exprs/scalar-fn-call.cc b/be/src/exprs/scalar-fn-call.cc
index 1f4a26d..ca283ba 100644
--- a/be/src/exprs/scalar-fn-call.cc
+++ b/be/src/exprs/scalar-fn-call.cc
@@ -52,7 +52,6 @@ using std::pair;
 ScalarFnCall::ScalarFnCall(const TExprNode& node)
   : ScalarExpr(node),
     vararg_start_idx_(node.__isset.vararg_start_idx ? node.vararg_start_idx : -1),
-    scalar_fn_wrapper_(NULL),
     prepare_fn_(NULL),
     close_fn_(NULL),
     scalar_fn_(NULL) {
@@ -71,9 +70,10 @@ Status ScalarFnCall::LoadPrepareAndCloseFn(LlvmCodeGen* codegen) {
   return Status::OK();
 }
 
-Status ScalarFnCall::Init(const RowDescriptor& desc, RuntimeState* state) {
+Status ScalarFnCall::Init(
+    const RowDescriptor& desc, bool is_entry_point, RuntimeState* state) {
   // Initialize children first.
-  RETURN_IF_ERROR(ScalarExpr::Init(desc, state));
+  RETURN_IF_ERROR(ScalarExpr::Init(desc, is_entry_point, state));
 
   if (fn_.scalar_fn.symbol.empty()) {
     // This path is intended to only be used during development to test FE
@@ -86,44 +86,32 @@ Status ScalarFnCall::Init(const RowDescriptor& desc, RuntimeState* state) {
     return Status(ss.str());
   }
 
-  // Check if the function takes CHAR as input or returns CHAR.
-  bool has_char_arg_or_result = type_.type == TYPE_CHAR;
-  for (int i = 0; !has_char_arg_or_result && i < children_.size(); ++i) {
-    has_char_arg_or_result |= children_[i]->type_.type == TYPE_CHAR;
-  }
-
-  // Use the interpreted path and call the builtin without codegen if any of the
-  // followings is true:
-  // 1. codegen is disabled by query option
-  // 2. there are char arguments (as they aren't supported yet)
-  // 3. there is an optimization hint to disable codegen and UDF can be interpreted.
-  //    IR UDF or UDF with more than MAX_INTERP_ARGS number of fixed arguments
-  //    cannot be interpreted.
-  //
-  // TODO: codegen for char arguments
   bool is_ir_udf = fn_.binary_type == TFunctionBinaryType::IR;
-  bool too_many_args_to_interp = NumFixedArgs() > MAX_INTERP_ARGS;
-  bool udf_interpretable = !is_ir_udf && !too_many_args_to_interp;
-  if (state->CodegenDisabledByQueryOption() || has_char_arg_or_result ||
-      (state->CodegenHasDisableHint() && udf_interpretable)) {
+  if (!ShouldCodegen(state)) {
+    // The interpreted code path must be handled in different ways depending on why
+    // codegen was disabled. It may not be possible to evaluate the expr without
+    // codegen or we may need to prepare the function for execution.
     if (is_ir_udf) {
-      // CHAR or VARCHAR are not supported as input arguments or return values for UDFs.
-      DCHECK(!has_char_arg_or_result && state->CodegenDisabledByQueryOption());
+      DCHECK(state->CodegenDisabledByQueryOption());
       return Status(Substitute("Cannot interpret LLVM IR UDF '$0': Codegen is needed. "
-          "Please set DISABLE_CODEGEN to false.", fn_.name.function_name));
+                               "Please set DISABLE_CODEGEN to false.",
+          fn_.name.function_name));
     }
 
     // The templates for builtin or native UDFs used in the interpretation path
     // support up to MAX_INTERP_ARGS number of arguments only.
-    if (too_many_args_to_interp) {
+    if (NumFixedArgs() > MAX_INTERP_ARGS) {
       DCHECK_EQ(fn_.binary_type, TFunctionBinaryType::NATIVE);
       // CHAR or VARCHAR are not supported as input arguments or return values for UDFs.
-      DCHECK(!has_char_arg_or_result && state->CodegenDisabledByQueryOption());
-      return Status(Substitute("Cannot interpret native UDF '$0': number of arguments is "
+      DCHECK(state->CodegenDisabledByQueryOption());
+      return Status(Substitute(
+          "Cannot interpret native UDF '$0': number of arguments is "
           "more than $1. Codegen is needed. Please set DISABLE_CODEGEN to false.",
           fn_.name.function_name, MAX_INTERP_ARGS));
     }
+  }
 
+  if (!is_ir_udf) {
     Status status = LibCache::instance()->GetSoFunctionPtr(fn_.hdfs_location,
         fn_.scalar_fn.symbol, fn_.last_modified_time, &scalar_fn_, &cache_entry_);
     if (!status.ok()) {
@@ -137,14 +125,10 @@ Status ScalarFnCall::Init(const RowDescriptor& desc, RuntimeState* state) {
             fn_.name.function_name, status.GetDetail()));
       }
     }
-  } else {
-    // Add the expression to the list of expressions to codegen in the codegen phase.
-    state->AddScalarFnToCodegen(this);
+    // For IR UDF, the loading of the Init() and CloseContext() functions is deferred
+    // until the first time GetCodegendComputeFn() is invoked.
+    RETURN_IF_ERROR(LoadPrepareAndCloseFn(nullptr));
   }
-
-  // For IR UDF, the loading of the Init() and CloseContext() functions is deferred until
-  // first time GetCodegendComputeFn() is invoked.
-  if (!is_ir_udf) RETURN_IF_ERROR(LoadPrepareAndCloseFn(NULL));
   return Status::OK();
 }
 
@@ -154,19 +138,19 @@ Status ScalarFnCall::OpenEvaluator(FunctionContext::FunctionStateScope scope,
   RETURN_IF_ERROR(ScalarExpr::OpenEvaluator(scope, state, eval));
   DCHECK_GE(fn_ctx_idx_, 0);
   FunctionContext* fn_ctx = eval->fn_context(fn_ctx_idx_);
-  bool is_interpreted = scalar_fn_wrapper_ == nullptr;
-
-  if (is_interpreted) {
-    // We're in the interpreted path (i.e. no JIT). Populate our FunctionContext's
-    // staging_input_vals, which will be reused across calls to scalar_fn_.
-    DCHECK(scalar_fn_ != nullptr);
-    vector<AnyVal*>* input_vals = fn_ctx->impl()->staging_input_vals();
-    for (int i = 0; i < NumFixedArgs(); ++i) {
-      AnyVal* input_val;
-      RETURN_IF_ERROR(AllocateAnyVal(state, eval->expr_perm_pool(), children_[i]->type(),
-          "Could not allocate expression value", &input_val));
-      input_vals->push_back(input_val);
-    }
+
+  // Prepare staging_input_vals in case the interpreted evaluation path of
+  // this function is invoked. The staging values are preallocated here
+  // so they can be reused across calls. If we have a codegen'd entry point
+  // for this expression, allocating these input values may be unnecessary,
+  // but they only add a small constant overhead on top of the ScalarExpr tree, so
+  // we always allocate them for simplicity.
+  vector<AnyVal*>* input_vals = fn_ctx->impl()->staging_input_vals();
+  for (int i = 0; i < NumFixedArgs(); ++i) {
+    AnyVal* input_val;
+    RETURN_IF_ERROR(AllocateAnyVal(state, eval->expr_perm_pool(), children_[i]->type(),
+        "Could not allocate expression value", &input_val));
+    input_vals->push_back(input_val);
   }
 
   // Only evaluate constant arguments at the top level of function contexts.
@@ -192,44 +176,42 @@ Status ScalarFnCall::OpenEvaluator(FunctionContext::FunctionStateScope scope,
     }
   }
 
-  if (is_interpreted) {
-    // Now we have the constant values, cache them so that the interpreted path can
-    // call the UDF without reevaluating the arguments. 'staging_input_vals' and
-    // 'varargs_buffer' in the FunctionContext are used to pass fixed and variable-length
-    // arguments respectively. 'non_constant_args()' in the FunctionContext will contain
-    // pointers to the remaining (non-constant) children that are evaluated for every row.
-    vector<pair<ScalarExpr*, AnyVal*>> non_constant_args;
-    uint8_t* varargs_buffer = fn_ctx->impl()->varargs_buffer();
-    for (int i = 0; i < children_.size(); ++i) {
-      AnyVal* input_arg;
-      int arg_bytes = AnyValUtil::AnyValSize(children_[i]->type());
-      if (i < NumFixedArgs()) {
-        input_arg = (*fn_ctx->impl()->staging_input_vals())[i];
-      } else {
-        input_arg = reinterpret_cast<AnyVal*>(varargs_buffer);
-        varargs_buffer += arg_bytes;
-      }
-      // IMPALA-4586: Cache constant arguments only if the frontend has rewritten them
-      // into literal expressions. This gives the frontend control over how expressions
-      // are evaluated. This means that setting enable_expr_rewrites=false will also
-      // disable caching of non-literal constant expressions, which gives the old
-      // behaviour (before this caching optimisation was added) of repeatedly evaluating
-      // exprs that are constant according to is_constant(). For exprs that are not truly
-      // constant (yet is_constant() returns true for) e.g. non-deterministic UDFs, this
-      // means that setting enable_expr_rewrites=false works as a safety valve to get
-      // back the old behaviour, before constant expr folding or caching was added.
-      // TODO: once we can annotate UDFs as non-deterministic (IMPALA-4606), we should
-      // be able to trust is_constant() and switch back to that.
-      if (children_[i]->IsLiteral()) {
-        const AnyVal* constant_arg = fn_ctx->impl()->constant_args()[i];
-        DCHECK(constant_arg != nullptr);
-        memcpy(input_arg, constant_arg, arg_bytes);
-      } else {
-        non_constant_args.emplace_back(children_[i], input_arg);
-      }
+  // Now we have the constant values, cache them so that the interpreted path can
+  // call the UDF without reevaluating the arguments. 'staging_input_vals' and
+  // 'varargs_buffer' in the FunctionContext are used to pass fixed and variable-length
+  // arguments respectively. 'non_constant_args()' in the FunctionContext will contain
+  // pointers to the remaining (non-constant) children that are evaluated for every row.
+  vector<pair<ScalarExpr*, AnyVal*>> non_constant_args;
+  uint8_t* varargs_buffer = fn_ctx->impl()->varargs_buffer();
+  for (int i = 0; i < children_.size(); ++i) {
+    AnyVal* input_arg;
+    int arg_bytes = AnyValUtil::AnyValSize(children_[i]->type());
+    if (i < NumFixedArgs()) {
+      input_arg = (*fn_ctx->impl()->staging_input_vals())[i];
+    } else {
+      input_arg = reinterpret_cast<AnyVal*>(varargs_buffer);
+      varargs_buffer += arg_bytes;
+    }
+    // IMPALA-4586: Cache constant arguments only if the frontend has rewritten them
+    // into literal expressions. This gives the frontend control over how expressions
+    // are evaluated. This means that setting enable_expr_rewrites=false will also
+    // disable caching of non-literal constant expressions, which gives the old
+    // behaviour (before this caching optimisation was added) of repeatedly evaluating
+    // exprs that are constant according to is_constant(). For exprs that are not truly
+    // constant (yet is_constant() returns true for) e.g. non-deterministic UDFs, this
+    // means that setting enable_expr_rewrites=false works as a safety valve to get
+    // back the old behaviour, before constant expr folding or caching was added.
+    // TODO: once we can annotate UDFs as non-deterministic (IMPALA-4606), we should
+    // be able to trust is_constant() and switch back to that.
+    if (children_[i]->IsLiteral()) {
+      const AnyVal* constant_arg = fn_ctx->impl()->constant_args()[i];
+      DCHECK(constant_arg != nullptr);
+      memcpy(input_arg, constant_arg, arg_bytes);
+    } else {
+      non_constant_args.emplace_back(children_[i], input_arg);
     }
-    fn_ctx->impl()->SetNonConstantArgs(move(non_constant_args));
   }
+  fn_ctx->impl()->SetNonConstantArgs(move(non_constant_args));
 
   if (prepare_fn_ != nullptr) {
     if (scope == FunctionContext::FRAGMENT_LOCAL) {
@@ -283,23 +265,9 @@ void ScalarFnCall::CloseEvaluator(FunctionContext::FunctionStateScope scope,
 //        i32 4,
 //        i64* inttoptr (i64 89111072 to i64*))
 //   ret { i8, double } %result
-Status ScalarFnCall::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != NULL) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-  if (type_.type == TYPE_CHAR) {
-    return Status::Expected("ScalarFnCall Codegen not supported for CHAR");
-  }
-  for (int i = 0; i < GetNumChildren(); ++i) {
-    if (children_[i]->type().type == TYPE_CHAR) {
-      *fn = NULL;
-      return Status::Expected("ScalarFnCall Codegen not supported for CHAR");
-    }
-  }
-
+Status ScalarFnCall::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   vector<ColumnType> arg_types;
-  for (const Expr* child : children_) arg_types.push_back(child->type());
+  for (ScalarExpr* child : children_) arg_types.push_back(child->type());
   llvm::Function* udf;
   RETURN_IF_ERROR(codegen->LoadFunction(fn_, fn_.scalar_fn.symbol, &type_, arg_types,
       NumFixedArgs(), vararg_start_idx_ != -1, &udf, &cache_entry_));
@@ -361,7 +329,7 @@ Status ScalarFnCall::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function**
     llvm::Function* child_fn = NULL;
     vector<llvm::Value*> child_fn_args;
     // Set 'child_fn' to the codegen'd function, sets child_fn == NULL if codegen fails
-    Status status = children_[i]->GetCodegendComputeFn(codegen, &child_fn);
+    Status status = children_[i]->GetCodegendComputeFn(codegen, false, &child_fn);
     if (UNLIKELY(!status.ok())) {
       DCHECK(child_fn == NULL);
       // Set 'child_fn' to the interpreted function
@@ -417,9 +385,6 @@ Status ScalarFnCall::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function**
     return Status(
         TErrorCode::UDF_VERIFY_FAILED, fn_.scalar_fn.symbol, fn_.hdfs_location);
   }
-  ir_compute_fn_ = *fn;
-  // TODO: don't do this for child exprs
-  codegen->AddFunctionToJit(ir_compute_fn_, &scalar_fn_wrapper_);
   return Status::OK();
 }
 
@@ -517,116 +482,43 @@ RETURN_TYPE ScalarFnCall::InterpretEval(ScalarExprEvaluator* eval,
   return RETURN_TYPE::null();
 }
 
-typedef BooleanVal (*BooleanWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef TinyIntVal (*TinyIntWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef SmallIntVal (*SmallIntWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef IntVal (*IntWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef BigIntVal (*BigIntWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef FloatVal (*FloatWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef DoubleVal (*DoubleWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef StringVal (*StringWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef TimestampVal (*TimestampWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef DecimalVal (*DecimalWrapper)(ScalarExprEvaluator*, const TupleRow*);
-typedef DateVal (*DateWrapper)(ScalarExprEvaluator*, const TupleRow*);
-
-// TODO: macroify this?
-BooleanVal ScalarFnCall::GetBooleanVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_BOOLEAN);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<BooleanVal>(eval, row);
-  BooleanWrapper fn = reinterpret_cast<BooleanWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-TinyIntVal ScalarFnCall::GetTinyIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_TINYINT);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<TinyIntVal>(eval, row);
-  TinyIntWrapper fn = reinterpret_cast<TinyIntWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-SmallIntVal ScalarFnCall::GetSmallIntVal(
-     ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_SMALLINT);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<SmallIntVal>(eval, row);
-  SmallIntWrapper fn = reinterpret_cast<SmallIntWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-IntVal ScalarFnCall::GetIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_INT);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<IntVal>(eval, row);
-  IntWrapper fn = reinterpret_cast<IntWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-BigIntVal ScalarFnCall::GetBigIntVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_BIGINT);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<BigIntVal>(eval, row);
-  BigIntWrapper fn = reinterpret_cast<BigIntWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-FloatVal ScalarFnCall::GetFloatVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_FLOAT);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<FloatVal>(eval, row);
-  FloatWrapper fn = reinterpret_cast<FloatWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-DoubleVal ScalarFnCall::GetDoubleVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_DOUBLE);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<DoubleVal>(eval, row);
-  DoubleWrapper fn = reinterpret_cast<DoubleWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-StringVal ScalarFnCall::GetStringVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(type_.IsStringType());
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<StringVal>(eval, row);
-  StringWrapper fn = reinterpret_cast<StringWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-TimestampVal ScalarFnCall::GetTimestampVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_TIMESTAMP);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<TimestampVal>(eval, row);
-  TimestampWrapper fn = reinterpret_cast<TimestampWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
-
-DecimalVal ScalarFnCall::GetDecimalVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_DECIMAL);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<DecimalVal>(eval, row);
-  DecimalWrapper fn = reinterpret_cast<DecimalWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
+// Macro to generate implementations for the below functions. 'val_type' is
+// a UDF type name, e.g. IntVal and 'type_validation' is a DCHECK expression
+// referencing 'type_' to assert that the function is only called on expressions
+// of the appropriate type.
+// * ScalarFnCall::GetBooleanValInterpreted()
+// * ScalarFnCall::GetTinyIntValInterpreted()
+// * ScalarFnCall::GetSmallIntValInterpreted()
+// * ScalarFnCall::GetIntValInterpreted()
+// * ScalarFnCall::GetBigIntValInterpreted()
+// * ScalarFnCall::GetFloatValInterpreted()
+// * ScalarFnCall::GetDoubleValInterpreted()
+// * ScalarFnCall::GetStringValInterpreted()
+// * ScalarFnCall::GetTimestampValInterpreted()
+// * ScalarFnCall::GetDecimalValInterpreted()
+// * ScalarFnCall::GetDateValInterpreted()
+#pragma push_macro("GET_VAL_INTERPRETED")
+#define GET_VAL_INTERPRETED(val_type, type_validation)        \
+  val_type ScalarFnCall::Get##val_type##Interpreted(          \
+      ScalarExprEvaluator* eval, const TupleRow* row) const { \
+    DCHECK(type_validation) << type_.DebugString();           \
+    DCHECK(eval != nullptr);                                  \
+    return InterpretEval<val_type>(eval, row);                \
+  }
 
-DateVal ScalarFnCall::GetDateVal(ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK_EQ(type_.type, TYPE_DATE);
-  DCHECK(eval != NULL);
-  if (scalar_fn_wrapper_ == NULL) return InterpretEval<DateVal>(eval, row);
-  DateWrapper fn = reinterpret_cast<DateWrapper>(scalar_fn_wrapper_);
-  return fn(eval, row);
-}
+GET_VAL_INTERPRETED(BooleanVal, type_.type == PrimitiveType::TYPE_BOOLEAN);
+GET_VAL_INTERPRETED(TinyIntVal, type_.type == PrimitiveType::TYPE_TINYINT);
+GET_VAL_INTERPRETED(SmallIntVal, type_.type == PrimitiveType::TYPE_SMALLINT);
+GET_VAL_INTERPRETED(IntVal, type_.type == PrimitiveType::TYPE_INT);
+GET_VAL_INTERPRETED(BigIntVal, type_.type == PrimitiveType::TYPE_BIGINT);
+GET_VAL_INTERPRETED(FloatVal, type_.type == PrimitiveType::TYPE_FLOAT);
+GET_VAL_INTERPRETED(DoubleVal, type_.type == PrimitiveType::TYPE_DOUBLE);
+GET_VAL_INTERPRETED(StringVal,
+    type_.IsStringType() || type_.type == PrimitiveType::TYPE_FIXED_UDA_INTERMEDIATE);
+GET_VAL_INTERPRETED(TimestampVal, type_.type == PrimitiveType::TYPE_TIMESTAMP);
+GET_VAL_INTERPRETED(DecimalVal, type_.type == PrimitiveType::TYPE_DECIMAL);
+GET_VAL_INTERPRETED(DateVal, type_.type == PrimitiveType::TYPE_DATE);
+#pragma pop_macro("GET_VAL_INTERPRETED")
 
 string ScalarFnCall::DebugString() const {
   stringstream out;
@@ -635,6 +527,10 @@ string ScalarFnCall::DebugString() const {
   return out.str();
 }
 
+bool ScalarFnCall::IsInterpretable() const {
+  return fn_.binary_type != TFunctionBinaryType::IR && NumFixedArgs() <= MAX_INTERP_ARGS;
+}
+
 int ScalarFnCall::ComputeVarArgsBufferSize() const {
   for (int i = NumFixedArgs(); i < children_.size(); ++i) {
     // All varargs should have same type.
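
Editorial sketch (not part of the patch): the hunks above replace eleven hand-written ScalarFnCall::Get*Val() bodies with a GET_VAL_INTERPRETED macro, and the commit message says the choice between the codegen'd function pointer and the interpreted path now happens in a non-virtual ScalarExpr::Get*Val(). The following is a minimal, simplified illustration of that pattern. All names here (ExprSketch, SlotSketch, Row, CompiledDouble, the single-field IntVal and BigIntVal structs) are hypothetical stand-ins, not Impala's real types, and the real compiled functions also take a ScalarExprEvaluator*.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for impala_udf::IntVal / BigIntVal and TupleRow.
struct IntVal { int32_t val; };
struct BigIntVal { int64_t val; };
struct Row { int32_t small; int64_t big; };

class ExprSketch {
 public:
  using IntFn = IntVal (*)(const Row*);
  virtual ~ExprSketch() = default;

  // Non-virtual dispatcher: use the "compiled" pointer if one was installed,
  // otherwise fall back to the virtual interpreted implementation.
  IntVal GetIntVal(const Row* row) const {
    if (compiled_int_fn_ != nullptr) return compiled_int_fn_(row);
    return GetIntValInterpreted(row);
  }
  void set_compiled_int_fn(IntFn fn) { compiled_int_fn_ = fn; }

  virtual IntVal GetIntValInterpreted(const Row* row) const = 0;
  virtual BigIntVal GetBigIntValInterpreted(const Row* row) const = 0;

 private:
  IntFn compiled_int_fn_ = nullptr;
};

class SlotSketch : public ExprSketch {
 public:
  IntVal GetIntValInterpreted(const Row* row) const override;
  BigIntVal GetBigIntValInterpreted(const Row* row) const override;
};

// One macro stamps out each typed accessor. push_macro/pop_macro keep the
// helper macro from leaking into the rest of the translation unit.
#pragma push_macro("GET_VAL_INTERPRETED")
#define GET_VAL_INTERPRETED(val_type, field)                              \
  val_type SlotSketch::Get##val_type##Interpreted(const Row* row) const { \
    assert(row != nullptr);                                               \
    return val_type{row->field};                                          \
  }
GET_VAL_INTERPRETED(IntVal, small)
GET_VAL_INTERPRETED(BigIntVal, big)
#pragma pop_macro("GET_VAL_INTERPRETED")

// Stand-in for a JIT-compiled function; it doubles the value so the two
// paths are distinguishable in a test.
inline IntVal CompiledDouble(const Row* row) { return IntVal{row->small * 2}; }
```

With no pointer installed, GetIntVal() takes the interpreted path; after set_compiled_int_fn(&CompiledDouble) the same call goes through the function pointer without a virtual call.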
diff --git a/be/src/exprs/scalar-fn-call.h b/be/src/exprs/scalar-fn-call.h
index e1c33b4..82f687b 100644
--- a/be/src/exprs/scalar-fn-call.h
+++ b/be/src/exprs/scalar-fn-call.h
@@ -43,13 +43,11 @@ using impala_udf::DateVal;
 class ScalarExprEvaluator;
 class TExprNode;
 
-///
 /// Expr for evaluating a pre-compiled native or LLVM IR function that uses the UDF
-/// interface (i.e. a scalar function). This class overrides GetCodegendComputeFn() to
-/// return a function that calls any child exprs and passes the results as arguments to the
-/// specified scalar function. If codegen is enabled, ScalarFnCall's Get*Val() compute
-/// functions are wrappers around this codegen'd function.
-//
+/// interface (i.e. a scalar function). This class overrides GetCodegendComputeFnImpl() to
+/// return a function that calls any child exprs and passes the results as arguments to
+/// the specified scalar function.
+///
 /// If codegen is disabled, some native functions can be called without codegen, depending
 /// on the native function's signature. However, since we can't write static code to call
 /// every possible function signature, codegen may be required to generate the call to the
@@ -69,7 +67,7 @@ class TExprNode;
 ///    - Allow more functions to be NULL in UDA test harness
 class ScalarFnCall : public ScalarExpr {
  public:
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
+  virtual Status GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn)
       override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
@@ -80,28 +78,17 @@ class ScalarFnCall : public ScalarExpr {
   virtual bool HasFnCtx() const override { return true; }
 
   ScalarFnCall(const TExprNode& node);
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state)
-      override WARN_UNUSED_RESULT;
+  virtual Status Init(const RowDescriptor& row_desc, bool is_entry_point,
+      RuntimeState* state) override WARN_UNUSED_RESULT;
   virtual Status OpenEvaluator(FunctionContext::FunctionStateScope scope,
-      RuntimeState* state, ScalarExprEvaluator* eval) const override
-      WARN_UNUSED_RESULT;
+      RuntimeState* state, ScalarExprEvaluator* eval) const override WARN_UNUSED_RESULT;
   virtual void CloseEvaluator(FunctionContext::FunctionStateScope scope,
       RuntimeState* state, ScalarExprEvaluator* eval) const override;
   virtual int ComputeVarArgsBufferSize() const override;
+  /// Not all scalar functions are interpretable - see class comment.
+  virtual bool IsInterpretable() const override;
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
 
  private:
   /// If this function has var args, children()[vararg_start_idx_] is the first vararg
@@ -114,11 +101,6 @@ class ScalarFnCall : public ScalarExpr {
   /// second element in the value it must be evaluated into.
   std::vector<std::pair<Expr*, impala_udf::AnyVal*>> non_constant_children_;
 
-  /// Function pointer to the JIT'd function produced by GetCodegendComputeFn().
-  /// Has signature *Val (ScalarExprEvaluator*, const TupleRow*), and calls the scalar
-  /// function with signature like *Val (FunctionContext*, const *Val& arg1, ...)
-  void* scalar_fn_wrapper_;
-
   /// The UDF's prepare function, if specified. This is initialized in Prepare() and
   /// called in Open() (since we may have needed to codegen the function if it's from an
   /// IR module).
@@ -128,8 +110,8 @@ class ScalarFnCall : public ScalarExpr {
   /// in Close().
   impala_udf::UdfClose close_fn_;
 
-  /// If running with codegen disabled, scalar_fn_ will be a pointer to the non-JIT'd
-  /// scalar function.
+  /// A pointer to the function implementation, used by the interpreted code path. Set in
+  /// Init() for BUILTIN and NATIVE functions. Not set for IR UDFs.
   void* scalar_fn_;
 
   /// Returns the number of non-vararg arguments
@@ -146,9 +128,9 @@ class ScalarFnCall : public ScalarExpr {
 
   /// Loads the native or IR function 'symbol' from HDFS and puts the result in *fn.
   /// If the function is loaded from an IR module, it cannot be called until the module
-  /// has been JIT'd (i.e. after GetCodegendComputeFn() has been called).
-  Status GetFunction(LlvmCodeGen* codegen, const std::string& symbol, void** fn)
-      WARN_UNUSED_RESULT;
+  /// has been JIT'd (i.e. after GetCodegendComputeFnImpl() has been called).
+  Status GetFunction(
+      LlvmCodeGen* codegen, const std::string& symbol, void** fn) WARN_UNUSED_RESULT;
 
   /// Loads the Prepare() and Close() functions for this ScalarFnCall. They could be
   /// native or IR functions. To load IR functions, the codegen object must have
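
Editorial sketch (not part of the patch): the class comment above notes that static code cannot call every possible function signature, which is why IsInterpretable() rejects IR functions and functions with more than MAX_INTERP_ARGS fixed arguments. The example below is a rough illustration of that constraint under simplifying assumptions: arguments are plain ints, and kMaxInterpArgs, BinaryType, IsInterpretableSketch, InterpretCall and Add are hypothetical names (the value 2 is chosen only to keep the switch short and is not Impala's limit).

```cpp
#include <cassert>
#include <vector>

// Hypothetical limit; the patch refers to MAX_INTERP_ARGS.
constexpr int kMaxInterpArgs = 2;

enum class BinaryType { BUILTIN, NATIVE, IR };

// Mirrors the shape of the check in the patch: IR functions have no native
// symbol to call, and only a fixed set of arities has a static call site.
bool IsInterpretableSketch(BinaryType type, int num_fixed_args) {
  return type != BinaryType::IR && num_fixed_args <= kMaxInterpArgs;
}

using Fn0 = int (*)();
using Fn1 = int (*)(int);
using Fn2 = int (*)(int, int);

// Calls an untyped function pointer by switching on arity. Each supported
// arity needs its own hand-written case; anything else would need codegen.
bool InterpretCall(void* fn, const std::vector<int>& args, int* result) {
  assert(result != nullptr);
  switch (args.size()) {
    case 0: *result = reinterpret_cast<Fn0>(fn)(); return true;
    case 1: *result = reinterpret_cast<Fn1>(fn)(args[0]); return true;
    case 2: *result = reinterpret_cast<Fn2>(fn)(args[0], args[1]); return true;
    default: return false;
  }
}

inline int Add(int a, int b) { return a + b; }
```

The caller must pass an argument list whose length matches the real arity of the pointed-to function; the sketch does not (and cannot) verify that.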
diff --git a/be/src/exprs/slot-ref-ir.cc b/be/src/exprs/slot-ref-ir.cc
deleted file mode 100644
index c9bdd4f..0000000
--- a/be/src/exprs/slot-ref-ir.cc
+++ /dev/null
@@ -1,34 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#include "exprs/slot-ref.h"
-#include "runtime/collection-value.h"
-#include "runtime/tuple-row.h"
-
-namespace impala {
-
-CollectionVal SlotRef::GetCollectionVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(type_.IsCollectionType());
-  Tuple* t = row->GetTuple(tuple_idx_);
-  if (t == NULL || t->IsNull(null_indicator_offset_)) return CollectionVal::null();
-  CollectionValue* coll_value =
-      reinterpret_cast<CollectionValue*>(t->GetSlot(slot_offset_));
-  return CollectionVal(coll_value->ptr, coll_value->num_tuples);
-}
-
-} // namespace impala
\ No newline at end of file
diff --git a/be/src/exprs/slot-ref.cc b/be/src/exprs/slot-ref.cc
index aca236d..816e0bc 100644
--- a/be/src/exprs/slot-ref.cc
+++ b/be/src/exprs/slot-ref.cc
@@ -69,11 +69,12 @@ SlotRef::SlotRef(const ColumnType& type, int offset, const bool nullable /* = fa
     tuple_idx_(0),
     slot_offset_(offset),
     null_indicator_offset_(0, nullable ? offset : -1),
-    slot_id_(-1) {
-}
+    slot_id_(-1) {}
 
-Status SlotRef::Init(const RowDescriptor& row_desc, RuntimeState* state) {
+Status SlotRef::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
   DCHECK_EQ(children_.size(), 0);
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
   if (slot_id_ != -1) {
     const SlotDescriptor* slot_desc = state->desc_tbl().GetSlotDescriptor(slot_id_);
     if (slot_desc == NULL) {
@@ -158,16 +159,7 @@ string SlotRef::DebugString() const {
 //
 // TODO: We could generate a typed struct (and not a char*) for Tuple for llvm.  We know
 // the types from the TupleDesc.  It will likely make this code simpler to reason about.
-Status SlotRef::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (type_.type == TYPE_CHAR) {
-    *fn = NULL;
-    return Status("Codegen for Char not supported.");
-  }
-  if (ir_compute_fn_ != NULL) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
+Status SlotRef::GetCodegendComputeFnImpl(LlvmCodeGen* codegen, llvm::Function** fn) {
   DCHECK_EQ(GetNumChildren(), 0);
   // SlotRefs are based on the slot_id and tuple_idx.  Combine them to make a
   // query-wide unique id. We also need to combine whether the tuple is nullable. For
@@ -250,12 +242,12 @@ Status SlotRef::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
   llvm::Value* len = NULL;
   llvm::Value* time_of_day = NULL;
   llvm::Value* date = NULL;
-  if (type_.IsStringType()) {
+  if (type_.IsVarLenStringType() || type_.IsCollectionType()) {
     llvm::Value* ptr_ptr = builder.CreateStructGEP(NULL, val_ptr, 0, "ptr_ptr");
     ptr = builder.CreateLoad(ptr_ptr, "ptr");
     llvm::Value* len_ptr = builder.CreateStructGEP(NULL, val_ptr, 1, "len_ptr");
     len = builder.CreateLoad(len_ptr, "len");
-  } else if (type_.type == TYPE_FIXED_UDA_INTERMEDIATE) {
+  } else if (type_.type == TYPE_CHAR || type_.type == TYPE_FIXED_UDA_INTERMEDIATE) {
     // ptr and len are the slot and its fixed length.
     ptr = builder.CreateBitCast(val_ptr, codegen->ptr_type());
     len = codegen->GetI32Constant(type_.len);
@@ -291,7 +283,8 @@ Status SlotRef::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
   // *Val. The optimizer does a better job when there is a phi node for each value, rather
   // than having get_slot_block generate an AnyVal and having a single phi node over that.
   // TODO: revisit this code, can possibly be simplified
-  if (type_.IsVarLenStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE) {
+  if (type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE
+      || type_.IsCollectionType()) {
     DCHECK(ptr != NULL);
     DCHECK(len != NULL);
     llvm::PHINode* ptr_phi = builder.CreatePHI(ptr->getType(), 2, "ptr_phi");
@@ -371,13 +364,12 @@ Status SlotRef::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
 
   *fn = codegen->FinalizeFunction(*fn);
   if (UNLIKELY(*fn == NULL)) return Status(TErrorCode::IR_VERIFY_FAILED, "SlotRef");
-  ir_compute_fn_ = *fn;
   codegen->RegisterExprFn(unique_slot_id, *fn);
   return Status::OK();
 }
 
 #define SLOT_REF_GET_FUNCTION(type_lit, type_val, type_c) \
-    type_val SlotRef::Get##type_val( \
+    type_val SlotRef::Get##type_val##Interpreted( \
         ScalarExprEvaluator* eval, const TupleRow* row) const { \
       DCHECK_EQ(type_.type, type_lit); \
       Tuple* t = row->GetTuple(tuple_idx_); \
@@ -393,7 +385,7 @@ SLOT_REF_GET_FUNCTION(TYPE_BIGINT, BigIntVal, int64_t);
 SLOT_REF_GET_FUNCTION(TYPE_FLOAT, FloatVal, float);
 SLOT_REF_GET_FUNCTION(TYPE_DOUBLE, DoubleVal, double);
 
-StringVal SlotRef::GetStringVal(
+StringVal SlotRef::GetStringValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK(type_.IsStringType() || type_.type == TYPE_FIXED_UDA_INTERMEDIATE);
   Tuple* t = row->GetTuple(tuple_idx_);
@@ -409,7 +401,7 @@ StringVal SlotRef::GetStringVal(
   return result;
 }
 
-TimestampVal SlotRef::GetTimestampVal(
+TimestampVal SlotRef::GetTimestampValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_TIMESTAMP);
   Tuple* t = row->GetTuple(tuple_idx_);
@@ -420,7 +412,7 @@ TimestampVal SlotRef::GetTimestampVal(
   return result;
 }
 
-DecimalVal SlotRef::GetDecimalVal(
+DecimalVal SlotRef::GetDecimalValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DECIMAL);
   Tuple* t = row->GetTuple(tuple_idx_);
@@ -438,7 +430,7 @@ DecimalVal SlotRef::GetDecimalVal(
   }
 }
 
-DateVal SlotRef::GetDateVal(
+DateVal SlotRef::GetDateValInterpreted(
     ScalarExprEvaluator* eval, const TupleRow* row) const {
   DCHECK_EQ(type_.type, TYPE_DATE);
   Tuple* t = row->GetTuple(tuple_idx_);
@@ -447,4 +439,14 @@ DateVal SlotRef::GetDateVal(
   return dv.ToDateVal();
 }
 
+CollectionVal SlotRef::GetCollectionValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
+  DCHECK(type_.IsCollectionType());
+  Tuple* t = row->GetTuple(tuple_idx_);
+  if (t == nullptr || t->IsNull(null_indicator_offset_)) return CollectionVal::null();
+  CollectionValue* coll_value =
+      reinterpret_cast<CollectionValue*>(t->GetSlot(slot_offset_));
+  return CollectionVal(coll_value->ptr, coll_value->num_tuples);
+}
+
 } // namespace impala
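
Editorial sketch (not part of the patch): the SlotRef::Get*ValInterpreted() functions above all follow the same steps: fetch the tuple, return a null value if the tuple pointer is null or the slot's null-indicator bit is set, and otherwise read the slot at its byte offset. The example below shows that sequence on a raw byte buffer. NullIndicator, IntValSketch and ReadIntSlot are hypothetical names, and the layout (one null byte followed by padding and a 4-byte slot) is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical description of where a slot's null bit lives in the tuple.
struct NullIndicator {
  int byte_offset;
  uint8_t bit_mask;
};

// Hypothetical stand-in for impala_udf::IntVal.
struct IntValSketch {
  bool is_null;
  int32_t val;
};

// Reads a 4-byte integer slot, returning a null value when the tuple is
// missing or the slot's null bit is set.
IntValSketch ReadIntSlot(const uint8_t* tuple, NullIndicator ni, int slot_offset) {
  assert(slot_offset >= 0);
  if (tuple == nullptr) return {true, 0};
  if ((tuple[ni.byte_offset] & ni.bit_mask) != 0) return {true, 0};
  int32_t v;
  std::memcpy(&v, tuple + slot_offset, sizeof(v));  // avoids unaligned reads
  return {false, v};
}
```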
diff --git a/be/src/exprs/slot-ref.h b/be/src/exprs/slot-ref.h
index e0c89fc..e9bd4e8 100644
--- a/be/src/exprs/slot-ref.h
+++ b/be/src/exprs/slot-ref.h
@@ -49,11 +49,11 @@ class SlotRef : public ScalarExpr {
   SlotRef(const ColumnType& type, int offset, const bool nullable = false);
 
   /// Exposed as public so AGG node can initialize its build expressions.
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state)
-      override WARN_UNUSED_RESULT;
+  virtual Status Init(const RowDescriptor& row_desc, bool is_entry_point,
+      RuntimeState* state) override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
-      override WARN_UNUSED_RESULT;
+  virtual Status GetCodegendComputeFnImpl(
+      LlvmCodeGen* codegen, llvm::Function** fn) override WARN_UNUSED_RESULT;
   virtual bool IsSlotRef() const override { return true; }
   virtual int GetSlotIds(std::vector<SlotId>* slot_ids) const override;
   const SlotId& slot_id() const { return slot_id_; }
@@ -63,20 +63,8 @@ class SlotRef : public ScalarExpr {
   friend class ScalarExpr;
   friend class ScalarExprEvaluator;
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TinyIntVal GetTinyIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual SmallIntVal GetSmallIntVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual BigIntVal GetBigIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual FloatVal GetFloatVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DoubleVal GetDoubleVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual StringVal GetStringVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual TimestampVal GetTimestampVal(
-      ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual DateVal GetDateVal(ScalarExprEvaluator*, const TupleRow*) const override;
-  virtual CollectionVal GetCollectionVal(
+  GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES
+  virtual CollectionVal GetCollectionValInterpreted(
       ScalarExprEvaluator*, const TupleRow*) const override;
 
  private:
diff --git a/be/src/exprs/tuple-is-null-predicate.cc b/be/src/exprs/tuple-is-null-predicate.cc
index bb532b2..769ca33 100644
--- a/be/src/exprs/tuple-is-null-predicate.cc
+++ b/be/src/exprs/tuple-is-null-predicate.cc
@@ -27,7 +27,7 @@
 
 namespace impala {
 
-BooleanVal TupleIsNullPredicate::GetBooleanVal(
+BooleanVal TupleIsNullPredicate::GetBooleanValInterpreted(
     ScalarExprEvaluator* evaluator, const TupleRow* row) const {
   int count = 0;
   for (int i = 0; i < tuple_idxs_.size(); ++i) {
@@ -41,11 +41,11 @@ BooleanVal TupleIsNullPredicate::GetBooleanVal(
 TupleIsNullPredicate::TupleIsNullPredicate(const TExprNode& node)
   : Predicate(node),
     tuple_ids_(node.tuple_is_null_pred.tuple_ids.begin(),
-               node.tuple_is_null_pred.tuple_ids.end()) {
-}
+        node.tuple_is_null_pred.tuple_ids.end()) {}
 
-Status TupleIsNullPredicate::Init(const RowDescriptor& row_desc, RuntimeState* state) {
-  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, state));
+Status TupleIsNullPredicate::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
   DCHECK_EQ(0, children_.size());
   // Resolve tuple ids to tuple indexes.
   for (int i = 0; i < tuple_ids_.size(); ++i) {
@@ -61,7 +61,7 @@ Status TupleIsNullPredicate::Init(const RowDescriptor& row_desc, RuntimeState* s
   return Status::OK();
 }
 
-Status TupleIsNullPredicate::GetCodegendComputeFn(LlvmCodeGen* codegen,
+Status TupleIsNullPredicate::GetCodegendComputeFnImpl(LlvmCodeGen* codegen,
     llvm::Function** fn) {
   return GetCodegendComputeFnWrapper(codegen, fn);
 }
diff --git a/be/src/exprs/tuple-is-null-predicate.h b/be/src/exprs/tuple-is-null-predicate.h
index ea631c0..552b325 100644
--- a/be/src/exprs/tuple-is-null-predicate.h
+++ b/be/src/exprs/tuple-is-null-predicate.h
@@ -39,12 +39,14 @@ class TupleIsNullPredicate: public Predicate {
 
   TupleIsNullPredicate(const TExprNode& node);
 
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state) override;
-  virtual Status GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn)
-      override WARN_UNUSED_RESULT;
+  virtual Status Init(
+      const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) override;
+  virtual Status GetCodegendComputeFnImpl(
+      LlvmCodeGen* codegen, llvm::Function** fn) override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
-  virtual BooleanVal GetBooleanVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  virtual BooleanVal GetBooleanValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const override;
 
  private:
   /// Tuple ids to check for NULL. May contain ids of nullable and non-nullable tuples.
diff --git a/be/src/exprs/valid-tuple-id.cc b/be/src/exprs/valid-tuple-id.cc
index d450aad..2884dcb 100644
--- a/be/src/exprs/valid-tuple-id.cc
+++ b/be/src/exprs/valid-tuple-id.cc
@@ -31,8 +31,9 @@ const char* ValidTupleIdExpr::LLVM_CLASS_NAME = "class.impala::ValidTupleIdExpr"
 
 ValidTupleIdExpr::ValidTupleIdExpr(const TExprNode& node) : ScalarExpr(node) {}
 
-Status ValidTupleIdExpr::Init(const RowDescriptor& row_desc, RuntimeState* state) {
-  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, state));
+Status ValidTupleIdExpr::Init(
+    const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) {
+  RETURN_IF_ERROR(ScalarExpr::Init(row_desc, is_entry_point, state));
   DCHECK_EQ(0, children_.size());
   tuple_ids_.reserve(row_desc.tuple_descriptors().size());
   for (TupleDescriptor* tuple_desc : row_desc.tuple_descriptors()) {
@@ -47,7 +48,8 @@ int ValidTupleIdExpr::ComputeNonNullCount(const TupleRow* row, int num_tuples) {
   return non_null_count;
 }
 
-IntVal ValidTupleIdExpr::GetIntVal(ScalarExprEvaluator* eval, const TupleRow* row) const {
+IntVal ValidTupleIdExpr::GetIntValInterpreted(
+    ScalarExprEvaluator* eval, const TupleRow* row) const {
   // Validate that exactly one tuple is non-NULL.
   int num_tuples = tuple_ids_.size();
   DCHECK_EQ(1, ComputeNonNullCount(row, num_tuples));
@@ -113,12 +115,8 @@ IntVal ValidTupleIdExpr::GetIntVal(ScalarExprEvaluator* eval, const TupleRow* ro
 //   ret i64 %ret17
 // }
 //
-Status ValidTupleIdExpr::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Function** fn) {
-  if (ir_compute_fn_ != nullptr) {
-    *fn = ir_compute_fn_;
-    return Status::OK();
-  }
-
+Status ValidTupleIdExpr::GetCodegendComputeFnImpl(
+    LlvmCodeGen* codegen, llvm::Function** fn) {
   // Create a method with the expected signature.
   llvm::Value* args[2];
   llvm::Function* new_fn = CreateIrFunctionPrototype("ValidTupleId", codegen, &args);
@@ -165,9 +163,6 @@ Status ValidTupleIdExpr::GetCodegendComputeFn(LlvmCodeGen* codegen, llvm::Functi
   if (UNLIKELY(*fn == nullptr)) {
     return Status(TErrorCode::IR_VERIFY_FAILED, "ValidTupleId");
   }
-
-  ir_compute_fn_ = *fn;
-
   return Status::OK();
 }
 
diff --git a/be/src/exprs/valid-tuple-id.h b/be/src/exprs/valid-tuple-id.h
index 7de6af0..342baf8 100644
--- a/be/src/exprs/valid-tuple-id.h
+++ b/be/src/exprs/valid-tuple-id.h
@@ -35,12 +35,14 @@ class ValidTupleIdExpr : public ScalarExpr {
 
   explicit ValidTupleIdExpr(const TExprNode& node);
 
-  virtual Status Init(const RowDescriptor& row_desc, RuntimeState* state) override;
-  virtual Status GetCodegendComputeFn(
+  virtual Status Init(
+      const RowDescriptor& row_desc, bool is_entry_point, RuntimeState* state) override;
+  virtual Status GetCodegendComputeFnImpl(
       LlvmCodeGen* codegen, llvm::Function** fn) override WARN_UNUSED_RESULT;
   virtual std::string DebugString() const override;
 
-  virtual IntVal GetIntVal(ScalarExprEvaluator*, const TupleRow*) const override;
+  virtual IntVal GetIntValInterpreted(
+      ScalarExprEvaluator*, const TupleRow*) const override;
 
  private:
   /// Maps from tuple index in the row to its corresponding tuple id.
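Note on the valid-tuple-id changes above: the per-class caching of ir_compute_fn_ is removed from ValidTupleIdExpr, consistent with the commit message's statement that boilerplate moves into ScalarExpr::GetCodegendComputeFn(), which calls a virtual *Impl() method. The following is only a rough sketch of that pattern under simplifying assumptions. FakeFunction, ExprSketch, ValidTupleIdSketch and impl_calls() are hypothetical names that do not exist in Impala; the real method also takes an LlvmCodeGen* and an entry-point flag and returns a Status.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::Function; only its identity matters here.
struct FakeFunction {
  std::string name;
};

// Hypothetical base class sketching the "non-virtual wrapper + virtual Impl" split.
class ExprSketch {
 public:
  virtual ~ExprSketch() = default;

  // Non-virtual wrapper: returns the cached function if one exists, otherwise
  // asks the subclass to build it and caches the result. Returns false on failure.
  bool GetCodegendComputeFn(FakeFunction** fn) {
    assert(fn != nullptr);
    if (ir_compute_fn_ != nullptr) {
      *fn = ir_compute_fn_;
      return true;
    }
    if (!GetCodegendComputeFnImpl(fn)) return false;
    ir_compute_fn_ = *fn;
    return true;
  }

  // Number of times the subclass hook ran (for illustration only).
  int impl_calls() const { return impl_calls_; }

 protected:
  // Subclasses only generate the function; they no longer cache it themselves.
  virtual bool GetCodegendComputeFnImpl(FakeFunction** fn) = 0;
  int impl_calls_ = 0;

 private:
  FakeFunction* ir_compute_fn_ = nullptr;
};

// Hypothetical subclass playing the role of ValidTupleIdExpr.
class ValidTupleIdSketch : public ExprSketch {
 protected:
  bool GetCodegendComputeFnImpl(FakeFunction** fn) override {
    ++impl_calls_;
    *fn = &generated_;
    return true;
  }

 private:
  FakeFunction generated_{"ValidTupleId"};
};
```

Calling GetCodegendComputeFn() twice on one ValidTupleIdSketch returns the same pointer both times and runs the Impl hook once, which is the behaviour each subclass previously had to hand-code.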
diff --git a/be/src/runtime/CMakeLists.txt b/be/src/runtime/CMakeLists.txt
index a8c9b8b..ff0e43d 100644
--- a/be/src/runtime/CMakeLists.txt
+++ b/be/src/runtime/CMakeLists.txt
@@ -30,6 +30,7 @@ set_source_files_properties(${ROW_BATCH_PROTO_SRCS} PROPERTIES GENERATED TRUE)
 add_library(Runtime
   buffered-tuple-stream.cc
   client-cache.cc
+  collection-value.cc
   coordinator.cc
   coordinator-backend-state.cc
   datetime-parse-util.cc
diff --git a/be/src/exprs/null-literal-ir.cc b/be/src/runtime/collection-value.cc
similarity index 77%
rename from be/src/exprs/null-literal-ir.cc
rename to be/src/runtime/collection-value.cc
index 7c5fd86..4e6e8ed 100644
--- a/be/src/exprs/null-literal-ir.cc
+++ b/be/src/runtime/collection-value.cc
@@ -15,15 +15,10 @@
 // specific language governing permissions and limitations
 // under the License.
 
-#include "null-literal.h"
-#include "udf/udf.h"
+#include "runtime/collection-value.h"
 
 namespace impala {
 
-CollectionVal NullLiteral::GetCollectionVal(
-    ScalarExprEvaluator* eval, const TupleRow* row) const {
-  DCHECK(type_.IsCollectionType());
-  return CollectionVal::null();
-}
+const char* CollectionValue::LLVM_CLASS_NAME = "struct.impala::CollectionValue";
 
-} // namespace impala
\ No newline at end of file
+}
diff --git a/be/src/runtime/collection-value.h b/be/src/runtime/collection-value.h
index a71eff8..eea9c26 100644
--- a/be/src/runtime/collection-value.h
+++ b/be/src/runtime/collection-value.h
@@ -41,6 +41,9 @@ struct __attribute__((__packed__)) CollectionValue {
   inline int64_t ByteSize(const TupleDescriptor& item_tuple_desc) const {
     return static_cast<int64_t>(num_tuples) * item_tuple_desc.byte_size();
   }
+
+  /// For C++/IR interop, we need to be able to look up types by name.
+  static const char* LLVM_CLASS_NAME;
 };
 
 }
diff --git a/be/src/runtime/data-stream-test.cc b/be/src/runtime/data-stream-test.cc
index 648356c..764070d 100644
--- a/be/src/runtime/data-stream-test.cc
+++ b/be/src/runtime/data-stream-test.cc
@@ -338,7 +338,7 @@ class DataStreamTest : public testing::Test {
   // Create a tuple comparator to sort in ascending order on the single bigint column.
   void CreateTupleComparator() {
     SlotRef* lhs_slot = obj_pool_.Add(new SlotRef(TYPE_BIGINT, 0));
-    ASSERT_OK(lhs_slot->Init(RowDescriptor(), runtime_state_.get()));
+    ASSERT_OK(lhs_slot->Init(RowDescriptor(), true, runtime_state_.get()));
     ordering_exprs_.push_back(lhs_slot);
     less_than_ = obj_pool_.Add(new TupleRowComparator(ordering_exprs_,
         is_asc_, nulls_first_));
diff --git a/be/src/runtime/descriptors.cc b/be/src/runtime/descriptors.cc
index 5ee81e4..d3ea12c 100644
--- a/be/src/runtime/descriptors.cc
+++ b/be/src/runtime/descriptors.cc
@@ -716,8 +716,6 @@ llvm::StructType* TupleDescriptor::GetLlvmStruct(LlvmCodeGen* codegen) const {
   vector<llvm::Type*> struct_fields;
   int curr_struct_offset = 0;
   for (SlotDescriptor* slot: sorted_slots) {
-    // IMPALA-3207: Codegen for CHAR is not yet implemented: bail out of codegen here.
-    if (slot->type().type == TYPE_CHAR) return nullptr;
     DCHECK_EQ(curr_struct_offset, slot->tuple_offset());
     slot->llvm_field_idx_ = struct_fields.size();
     struct_fields.push_back(codegen->GetSlotType(slot->type()));
diff --git a/be/src/runtime/fragment-instance-state.cc b/be/src/runtime/fragment-instance-state.cc
index 749e135..5d5e3c4 100644
--- a/be/src/runtime/fragment-instance-state.cc
+++ b/be/src/runtime/fragment-instance-state.cc
@@ -331,7 +331,7 @@ Status FragmentInstanceState::Open() {
       // It shouldn't be fatal to fail codegen. However, until IMPALA-4233 is fixed,
       // ScalarFnCall has no fall back to interpretation when codegen fails so propagates
       // the error status for now.
-      RETURN_IF_ERROR(runtime_state_->CodegenScalarFns());
+      RETURN_IF_ERROR(runtime_state_->CodegenScalarExprs());
     }
 
     LlvmCodeGen* codegen = runtime_state_->codegen();
diff --git a/be/src/runtime/krpc-data-stream-sender.cc b/be/src/runtime/krpc-data-stream-sender.cc
index 123f302..156017e 100644
--- a/be/src/runtime/krpc-data-stream-sender.cc
+++ b/be/src/runtime/krpc-data-stream-sender.cc
@@ -722,7 +722,8 @@ Status KrpcDataStreamSender::CodegenHashRow(LlvmCodeGen* codegen, llvm::Function
   // Unroll the loop and codegen each of the partition expressions
   for (int i = 0; i < partition_exprs_.size(); ++i) {
     llvm::Function* compute_fn;
-    RETURN_IF_ERROR(partition_exprs_[i]->GetCodegendComputeFn(codegen, &compute_fn));
+    RETURN_IF_ERROR(
+        partition_exprs_[i]->GetCodegendComputeFn(codegen, false, &compute_fn));
 
     // Load the expression evaluator for the i-th partition expression
     llvm::Function* get_expr_eval_fn =
diff --git a/be/src/runtime/runtime-state.cc b/be/src/runtime/runtime-state.cc
index 09dbd60..1443ae1 100644
--- a/be/src/runtime/runtime-state.cc
+++ b/be/src/runtime/runtime-state.cc
@@ -174,10 +174,10 @@ Status RuntimeState::CreateCodegen() {
   return Status::OK();
 }
 
-Status RuntimeState::CodegenScalarFns() {
-  for (ScalarFnCall* scalar_fn : scalar_fns_to_codegen_) {
+Status RuntimeState::CodegenScalarExprs() {
+  for (auto& item : scalar_exprs_to_codegen_) {
     llvm::Function* fn;
-    RETURN_IF_ERROR(scalar_fn->GetCodegendComputeFn(codegen_.get(), &fn));
+    RETURN_IF_ERROR(item.first->GetCodegendComputeFn(codegen_.get(), item.second, &fn));
   }
   return Status::OK();
 }
diff --git a/be/src/runtime/runtime-state.h b/be/src/runtime/runtime-state.h
index d619bb0..3c8c321 100644
--- a/be/src/runtime/runtime-state.h
+++ b/be/src/runtime/runtime-state.h
@@ -20,6 +20,7 @@
 #define IMPALA_RUNTIME_RUNTIME_STATE_H
 
 #include <boost/scoped_ptr.hpp>
+#include <utility>
 #include <vector>
 #include <string>
 
@@ -43,7 +44,7 @@ class MemTracker;
 class ObjectPool;
 class ReservationTracker;
 class RuntimeFilterBank;
-class ScalarFnCall;
+class ScalarExpr;
 class Status;
 class TimestampValue;
 class ThreadResourcePool;
@@ -135,15 +136,20 @@ class RuntimeState {
 
   const std::string& GetEffectiveUser() const;
 
-  /// Add ScalarFnCall expression 'udf' to be codegen'd later if it's not disabled by
-  /// query option. This is for cases in which the UDF cannot be interpreted or if the
-  /// plan fragment doesn't contain any codegen enabled operator.
-  void AddScalarFnToCodegen(ScalarFnCall* udf) { scalar_fns_to_codegen_.push_back(udf); }
+  /// Add ScalarExpr expression 'expr' to be codegen'd later if it's not disabled by
+  /// query option. If 'is_codegen_entry_point' is true, 'expr' will be an entry
+  /// point into codegen'd evaluation (i.e. it will have a function pointer populated).
+  /// Adding an expr here ensures that it will be codegen'd (i.e. fragment execution
+  /// will fail with an error if the expr cannot be codegen'd).
+  void AddScalarExprToCodegen(ScalarExpr* expr, bool is_codegen_entry_point) {
+    scalar_exprs_to_codegen_.push_back({expr, is_codegen_entry_point});
+  }
 
-  /// Returns true if there are ScalarFnCall expressions in the fragments which can't be
-  /// interpreted. This should only be used after the Prepare() phase in which all
-  /// expressions' Prepare() are invoked.
-  bool ScalarFnNeedsCodegen() const { return !scalar_fns_to_codegen_.empty(); }
+  /// Returns true if there are ScalarExpr expressions in the fragments that we want
+  /// to codegen (because they can't be interpreted or based on options/hints).
+  /// This should only be used after the Prepare() phase in which all expressions'
+  /// Prepare() are invoked.
+  bool ScalarExprNeedsCodegen() const { return !scalar_exprs_to_codegen_.empty(); }
 
   /// Check if codegen was disabled and if so, add a message to the runtime profile.
   void CheckAndAddCodegenDisabledMessage(RuntimeProfile* profile) {
@@ -157,7 +163,7 @@ class RuntimeState {
   /// Returns true if there is a hint to disable codegen. This can be true for single node
   /// optimization or expression evaluation request from FE to BE (see fe-support.cc).
   /// Note that this internal flag is advisory and it may be ignored if the fragment has
-  /// any UDF which cannot be interpreted. See ScalarFnCall::Prepare() for details.
+  /// any UDF which cannot be interpreted. See ScalarExpr::Prepare() for details.
   inline bool CodegenHasDisableHint() const {
     return query_ctx().disable_codegen_hint;
   }
@@ -166,7 +172,7 @@ class RuntimeState {
   /// fragment can be interpreted. This should only be used after the Prepare() phase
   /// in which all expressions' Prepare() are invoked.
   inline bool CodegenDisabledByHint() const {
-    return CodegenHasDisableHint() && !ScalarFnNeedsCodegen();
+    return CodegenHasDisableHint() && !ScalarExprNeedsCodegen();
   }
 
   /// Returns true if codegen is disabled by query option.
@@ -283,11 +289,11 @@ class RuntimeState {
   /// Create a codegen object accessible via codegen() if it doesn't exist already.
   Status CreateCodegen();
 
-  /// Codegen all ScalarFnCall expressions in 'scalar_fns_to_codegen_'. If codegen fails
+  /// Codegen all ScalarExpr expressions in 'scalar_exprs_to_codegen_'. If codegen fails
   /// for any expressions, return immediately with the error status. Once IMPALA-4233 is
   /// fixed, it's not fatal to fail codegen if the expression can be interpreted.
   /// TODO: Fix IMPALA-4233
-  Status CodegenScalarFns();
+  Status CodegenScalarExprs();
 
   /// Helper to call QueryState::StartSpilling().
   Status StartSpilling(MemTracker* mem_tracker);
@@ -334,8 +340,9 @@ class RuntimeState {
 
   boost::scoped_ptr<LlvmCodeGen> codegen_;
 
-  /// Contains all ScalarFnCall expressions which need to be codegen'd.
-  vector<ScalarFnCall*> scalar_fns_to_codegen_;
+  /// Contains all ScalarExpr expressions which need to be codegen'd. The second element
+  /// is true if we want to generate a codegen entry point for this expr.
+  std::vector<std::pair<ScalarExpr*, bool>> scalar_exprs_to_codegen_;
 
   /// Thread resource management object for this fragment's execution.  The runtime
   /// state is responsible for returning this pool to the thread mgr.
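Note on the runtime-state changes above: the list of expressions to codegen now stores (expr, is_codegen_entry_point) pairs, and CodegenScalarExprs() stops at the first failure. A minimal sketch of that bookkeeping follows, with heavy simplifications. ExprSketch and RuntimeStateSketch are hypothetical stand-ins; the real code passes an LlvmCodeGen* and returns Status rather than bool.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for ScalarExpr; tracks only what the sketch needs.
struct ExprSketch {
  bool can_codegen = true;       // whether codegen would succeed for this expr
  bool codegend = false;         // set once a compute function was "generated"
  bool has_entry_point = false;  // set if a callable function pointer is wanted

  // Mirrors the shape of GetCodegendComputeFn(codegen, is_entry_point, &fn).
  bool GetCodegendComputeFn(bool is_codegen_entry_point) {
    if (!can_codegen) return false;
    codegend = true;
    if (is_codegen_entry_point) has_entry_point = true;
    return true;
  }
};

// Hypothetical stand-in for the relevant part of RuntimeState.
class RuntimeStateSketch {
 public:
  void AddScalarExprToCodegen(ExprSketch* expr, bool is_codegen_entry_point) {
    assert(expr != nullptr);
    scalar_exprs_to_codegen_.push_back({expr, is_codegen_entry_point});
  }

  bool ScalarExprNeedsCodegen() const { return !scalar_exprs_to_codegen_.empty(); }

  // Returns false at the first expr that fails, like RETURN_IF_ERROR in the diff.
  bool CodegenScalarExprs() {
    for (auto& item : scalar_exprs_to_codegen_) {
      if (!item.first->GetCodegendComputeFn(item.second)) return false;
    }
    return true;
  }

 private:
  // The second element is true if a codegen entry point is wanted for the expr.
  std::vector<std::pair<ExprSketch*, bool>> scalar_exprs_to_codegen_;
};
```

In this sketch, an expr registered with false is still codegen'd but gets no entry point, which matches the doc comment on AddScalarExprToCodegen().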
diff --git a/be/src/runtime/tuple.cc b/be/src/runtime/tuple.cc
index 4f9448a..916ae6f 100644
--- a/be/src/runtime/tuple.cc
+++ b/be/src/runtime/tuple.cc
@@ -328,7 +328,7 @@ Status Tuple::CodegenMaterializeExprs(LlvmCodeGen* codegen, bool collect_string_
   llvm::Function* materialize_expr_fns[slot_materialize_exprs.size()];
   for (int i = 0; i < slot_materialize_exprs.size(); ++i) {
     Status status = slot_materialize_exprs[i]->GetCodegendComputeFn(
-        codegen, &materialize_expr_fns[i]);
+        codegen, false, &materialize_expr_fns[i]);
     if (!status.ok()) {
       return Status::Expected(Substitute("Could not codegen CodegenMaterializeExprs: $0",
             status.GetDetail()));
diff --git a/be/src/service/fe-support.cc b/be/src/service/fe-support.cc
index 07e5c01..830169f 100644
--- a/be/src/service/fe-support.cc
+++ b/be/src/service/fe-support.cc
@@ -225,12 +225,12 @@ Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow(
   }
 
   // UDFs which cannot be interpreted need to be handled by codegen.
-  if (state.ScalarFnNeedsCodegen()) {
+  if (state.ScalarExprNeedsCodegen()) {
     status = state.CreateCodegen();
     if (!status.ok()) goto error;
     LlvmCodeGen* codegen = state.codegen();
     DCHECK(codegen != NULL);
-    status = state.CodegenScalarFns();
+    status = state.CodegenScalarExprs();
     if (!status.ok()) goto error;
     codegen->EnableOptimizations(false);
     status = codegen->FinalizeModule();
diff --git a/be/src/udf/udf-internal.h b/be/src/udf/udf-internal.h
index 840efa9..41f8562 100644
--- a/be/src/udf/udf-internal.h
+++ b/be/src/udf/udf-internal.h
@@ -296,13 +296,16 @@ namespace impala_udf {
 /// ready for public consumption because users must have access to our internal tuple
 /// layout.
 struct CollectionVal : public AnyVal {
-  uint8_t* ptr;
+  // Put num_tuples before ptr so that 'AnyVal::is_null', 'num_tuples' and 'ptr' can be
+  // packed into 16 bytes. This matches the memory layout of StringVal, which allows
+  // sharing of support in CodegenAnyval.
   int num_tuples;
+  uint8_t* ptr;
 
   /// Construct an CollectionVal from ptr/num_tuples. Note: this does not make a copy of
   /// ptr so the buffer must exist as long as this CollectionVal does.
   CollectionVal(uint8_t* ptr = NULL, int num_tuples = 0)
-      : ptr(ptr), num_tuples(num_tuples) {}
+      : num_tuples(num_tuples), ptr(ptr) {}
 
   static CollectionVal null() {
     CollectionVal cv;
@@ -311,6 +314,11 @@ struct CollectionVal : public AnyVal {
   }
 };
 
-}
+#pragma GCC diagnostic ignored "-Winvalid-offsetof"
+static_assert(sizeof(CollectionVal) == sizeof(StringVal), "Wrong size.");
+static_assert(
+    offsetof(CollectionVal, num_tuples) == offsetof(StringVal, len), "Wrong offset.");
+static_assert(offsetof(CollectionVal, ptr) == offsetof(StringVal, ptr), "Wrong offset.");
+} // namespace impala_udf
 
 #endif
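Note on the udf-internal.h change above: num_tuples is moved ahead of ptr so that CollectionVal has the same field layout as StringVal, and static_asserts pin that down. The idea can be reproduced in isolation roughly as follows. The *Sketch structs and MakeCollection() are hypothetical simplifications, not the real impala_udf types, and the GCC pragma is carried over from the diff because offsetof on a derived struct is only conditionally supported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// As in the diff: silence the warning for offsetof on non-standard-layout types.
#pragma GCC diagnostic ignored "-Winvalid-offsetof"

// Hypothetical stand-ins for AnyVal, StringVal and CollectionVal.
struct AnyValSketch {
  bool is_null = false;
};

struct StringValSketch : AnyValSketch {
  int len = 0;
  std::uint8_t* ptr = nullptr;
};

struct CollectionValSketch : AnyValSketch {
  // The int comes before the pointer so it can share the padding after is_null.
  int num_tuples = 0;
  std::uint8_t* ptr = nullptr;
};

// Compile-time checks that both structs can be handled by the same layout-aware code.
static_assert(sizeof(CollectionValSketch) == sizeof(StringValSketch), "Wrong size.");
static_assert(offsetof(CollectionValSketch, num_tuples) == offsetof(StringValSketch, len),
    "Wrong offset.");
static_assert(offsetof(CollectionValSketch, ptr) == offsetof(StringValSketch, ptr),
    "Wrong offset.");

// Hypothetical helper: builds a non-null collection value without copying the buffer.
inline CollectionValSketch MakeCollection(std::uint8_t* ptr, int num_tuples) {
  assert(num_tuples >= 0);
  CollectionValSketch cv;
  cv.num_tuples = num_tuples;
  cv.ptr = ptr;
  return cv;
}
```

If the fields were declared in the old order (ptr first), the offset assertions would fail at compile time on common 64-bit ABIs, which is the regression the static_asserts in the diff are meant to catch.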
diff --git a/be/src/util/tuple-row-compare.cc b/be/src/util/tuple-row-compare.cc
index f05a88e..f424eb4 100644
--- a/be/src/util/tuple-row-compare.cc
+++ b/be/src/util/tuple-row-compare.cc
@@ -207,7 +207,7 @@ Status TupleRowComparator::CodegenCompare(LlvmCodeGen* codegen, llvm::Function**
   const vector<ScalarExpr*>& ordering_exprs = ordering_exprs_;
   llvm::Function* key_fns[ordering_exprs.size()];
   for (int i = 0; i < ordering_exprs.size(); ++i) {
-    Status status = ordering_exprs[i]->GetCodegendComputeFn(codegen, &key_fns[i]);
+    Status status = ordering_exprs[i]->GetCodegendComputeFn(codegen, false, &key_fns[i]);
     if (!status.ok()) {
       return Status::Expected(Substitute(
             "Could not codegen TupleRowComparator::Compare(): $0", status.GetDetail()));
diff --git a/testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test b/testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test
index ad396ad..07c15c2 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test
@@ -36,6 +36,8 @@ select count(*) from chars_tiny t1
 ---- TYPES
 bigint
 ---- RUNTIME_PROFILE
-# Verify that codegen was disabled
-row_regex: .*Hash Partitioned Sender Codegen Disabled: Codegen for Char not supported.*
+# Verify that CHAR codegen was enabled for hash partitioning even though CHAR
+# codegen isn't supported everywhere.
+row_regex: .*Hash Partitioned Sender Codegen Enabled.*
+row_regex: .*Char isn't supported for CodegenWriteSlot.*
 ====
diff --git a/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test b/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
index 600ca45..bb94195 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
@@ -29,8 +29,7 @@ bigint
 row_regex: .*Codegen Disabled: disabled due to optimization hints.*
 ====
 ---- QUERY
-# IMPALA-6435: We do not codegen char columns. This fix checks for a
-# CHAR type literal in the expr and disables codegen. This query will crash
+# IMPALA-6435: codegen for NULL CHAR literals was broken. This query crashed
 # impala without the fix.
 set disable_codegen_rows_threshold=0;
 select count(*) from (
diff --git a/testdata/workloads/functional-query/queries/QueryTest/udf.test b/testdata/workloads/functional-query/queries/QueryTest/udf.test
index 7e304f4..7a9a31f 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/udf.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/udf.test
@@ -111,6 +111,9 @@ date
 2013-10-09
 ====
 ---- QUERY
+# This provides coverage for ScalarExprEvaluator::GetConstValue(), which will interpret
+# constant_timestamp(). This means that for both native and IR UDFs, constant_timestamp()
+# needs to support evaluation from interpreted code.
 select from_utc_timestamp(constant_timestamp(), "UTC");
 ---- TYPES
 timestamp
diff --git a/testdata/workloads/tpcds-insert/queries/expr-insert.test b/testdata/workloads/tpcds-insert/queries/expr-insert.test
new file mode 100644
index 0000000..1084029
--- /dev/null
+++ b/testdata/workloads/tpcds-insert/queries/expr-insert.test
@@ -0,0 +1,21 @@
+====
+---- QUERY: TPDCS-STR-INSERT-DROP
+DROP TABLE IF EXISTS str_insert
+====
+---- QUERY: TPDCS-STR-INSERT-SETUP
+CREATE TABLE str_insert (s string) STORED AS PARQUET
+---- RESULTS
+'Table has been created.'
+====
+---- QUERY: TPDCS-STR-INSERT-CASE
+INSERT INTO str_insert
+SELECT case when ss_promo_sk % 2 = 0 then 'even' else 'odd' end
+FROM store_sales
+---- RESULTS
+: 2880404
+====
+---- QUERY: TPCDS-STR-INSERT-CASE
+SELECT COUNT(*) FROM str_insert
+---- RESULTS
+2880404
+====
diff --git a/tests/query_test/test_codegen.py b/tests/query_test/test_codegen.py
index 7436c13..af723c8 100644
--- a/tests/query_test/test_codegen.py
+++ b/tests/query_test/test_codegen.py
@@ -58,17 +58,38 @@ class TestCodegen(ImpalaTestSuite):
 
   def test_codegen_failure_for_char_type(self, vector):
     """IMPALA-7288: Regression tests for the codegen failure path when working with a
-    CHAR column type"""
-    # Test failure path in HashTableCtx::CodegenEquals().
+    CHAR column type. Until IMPALA-3207 is completely fixed there are various paths where
+    we need to bail out of codegen."""
+    # Previously codegen for this join failed in HashTableCtx::CodegenEquals() because of
+    # missing ScalarFnCall codegen support, which was added in IMPALA-7331.
     result = self.execute_query("select 1 from functional.chars_tiny t1, "
                                 "functional.chars_tiny t2 "
-                                "where t1.cs = cast(t2.cs as string)");
-    assert "Codegen Disabled: Problem with HashTableCtx::CodegenEquals: ScalarFnCall" \
-           " Codegen not supported for CHAR" in str(result.runtime_profile)
+                                "where t1.cs = cast(t2.cs as string)")
+    profile_str = str(result.runtime_profile)
+    assert "Probe Side Codegen Enabled" in profile_str, profile_str
+    assert "Build Side Codegen Enabled" in profile_str, profile_str
+    assert ("TextConverter::CodegenWriteSlot(): Char isn't supported for CodegenWriteSlot"
+            in profile_str), profile_str
 
-    # Test failure path in HashTableCtx::CodegenEvalRow().
+    # Codegen for this join fails because it is joining two CHAR exprs.
+    result = self.execute_query("select 1 from functional.chars_tiny t1, "
+                                "functional.chars_tiny t2 "
+                                "where t1.cs = t2.cs")
+    profile_str = str(result.runtime_profile)
+    assert ("Probe Side Codegen Disabled: HashTableCtx::CodegenHashRow(): CHAR NYI"
+            in profile_str), profile_str
+    assert ("Build Side Codegen Disabled: HashTableCtx::CodegenHashRow(): CHAR NYI"
+            in profile_str), profile_str
+    assert ("TextConverter::CodegenWriteSlot(): Char isn't supported for CodegenWriteSlot"
+            in profile_str), profile_str
+
+    # Previously codegen for this join failed in HashTableCtx::CodegenEvalRow() because of
+    # missing ScalarFnCall codegen support, which was added in IMPALA-7331.
     result = self.execute_query("select 1 from functional.chars_tiny t1, "
                                 "functional.chars_tiny t2 where t1.cs = "
-                                "FROM_TIMESTAMP(cast(t2.cs as string), 'yyyyMMdd')");
-    assert "Codegen Disabled: Problem with HashTableCtx::CodegenEvalRow(): ScalarFnCall" \
-           " Codegen not supported for CHAR" in str(result.runtime_profile)
+                                "FROM_TIMESTAMP(cast(t2.cs as string), 'yyyyMMdd')")
+    profile_str = str(result.runtime_profile)
+    assert "Probe Side Codegen Enabled" in profile_str, profile_str
+    assert "Build Side Codegen Enabled" in profile_str, profile_str
+    assert ("TextConverter::CodegenWriteSlot(): Char isn't supported for CodegenWriteSlot"
+            in profile_str), profile_str
diff --git a/tests/query_test/test_tpcds_queries.py b/tests/query_test/test_tpcds_queries.py
index 72b0224..2d26309 100644
--- a/tests/query_test/test_tpcds_queries.py
+++ b/tests/query_test/test_tpcds_queries.py
@@ -520,6 +520,9 @@ class TestTpcdsInsert(ImpalaTestSuite):
   def test_tpcds_partitioned_insert(self, vector):
     self.run_test_case('partitioned-insert', vector)
 
+  def test_expr_insert(self, vector):
+    self.run_test_case('expr-insert', vector)
+
 
 class TestTpcdsUnmodified(ImpalaTestSuite):
   @classmethod