Posted to commits@arrow.apache.org by we...@apache.org on 2019/03/08 18:38:30 UTC

[arrow] branch master updated: ARROW-4782: [C++] Prototype array and scalar expression types to help with building a deferred compute graph

This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 08ca13f  ARROW-4782: [C++] Prototype array and scalar expression types to help with building a deferred compute graph
08ca13f is described below

commit 08ca13f83f3d6dbd818c4280d619dae306aa9de5
Author: Wes McKinney <we...@apache.org>
AuthorDate: Fri Mar 8 12:38:21 2019 -0600

    ARROW-4782: [C++] Prototype array and scalar expression types to help with building a deferred compute graph
    
    Basic ideas
    
    * `Expr` is the base class for the "edges" of the graph, i.e. data dependencies
    * `Operation` is the base class for nodes in the graph. An Operation takes input Expr dependencies, plus any static / non-Expr arguments, and produces an output Expr. Type checking and validation are performed during this output resolution
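
The node/edge split described above can be illustrated with a minimal, self-contained sketch. It is not the code added by this patch: the names `Literal`, `Cast`, `BuildExample`, and the `type_name` field are hypothetical stand-ins, a plain `bool` replaces `arrow::Status`, and the validation rule in `Cast` is a toy assumption made only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Operation;
using OpPtr = std::shared_ptr<const Operation>;

// An "edge": a typed value that remembers which operation produces it.
struct Expr {
  OpPtr op;
  std::string type_name;  // hypothetical simplification, e.g. "int32"
};
using ExprPtr = std::shared_ptr<const Expr>;

// A "node": holds input Exprs plus static arguments and resolves an output.
struct Operation : std::enable_shared_from_this<Operation> {
  virtual ~Operation() = default;
  // Returns false when validation fails; otherwise fills *out.
  virtual bool ToExpr(ExprPtr* out) const = 0;
};

// No Expr inputs, one static argument (the type of the literal).
struct Literal : Operation {
  explicit Literal(std::string type_name) : type_name(std::move(type_name)) {}
  bool ToExpr(ExprPtr* out) const override {
    *out = std::make_shared<Expr>(Expr{shared_from_this(), type_name});
    return true;
  }
  std::string type_name;
};

// One Expr input, one static argument (the target type).
struct Cast : Operation {
  Cast(ExprPtr input, std::string target)
      : input(std::move(input)), target(std::move(target)) {}
  bool ToExpr(ExprPtr* out) const override {
    // Toy validation rule (an assumption): refuse to cast binary input.
    if (input == nullptr || input->type_name == "binary") return false;
    *out = std::make_shared<Expr>(Expr{shared_from_this(), target});
    return true;
  }
  ExprPtr input;
  std::string target;
};

// Builds literal(<literal_type>) -> cast(float64) and returns the resulting
// type name, or an empty string if output resolution failed.
std::string BuildExample(const std::string& literal_type) {
  ExprPtr lit, casted;
  if (!std::make_shared<Literal>(literal_type)->ToExpr(&lit)) return "";
  if (!std::make_shared<Cast>(lit, "float64")->ToExpr(&casted)) return "";
  return casted->type_name;
}
```

Nothing is computed while the graph is built: each output Expr only records its producing Operation and its resolved type, which is what makes the graph "deferred".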
    
    This patch does not get into various other necessary expression types, particularly table-level operations like projections, filters, aggregations, and joins. I'll look at those in follow-up patches.
    
    I also added `arrow::compute::LogicalType`, which provides the notion of an "unbound" / non-concrete type. Its purpose is to permit instances of "type classes" like "number" or "integer" in addition to concrete data types like "int32". I think we need to have type objects that are decoupled from the metadata used for the Arrow columnar format, even though in some cases there is a 1-to-1 mapping. There are some other things to contemplate in future patches like "unbound [...]
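
As a rough illustration of how a "type class" can match several concrete types, the sketch below checks membership with a `dynamic_cast` against a mixin hierarchy, in the spirit of the `value::` mixins and `InheritsFrom` helper that appear in the diff. The class names here (`TypeClass`, `Int32Scalar`, `Float64Array`, `BoolScalar`) are hypothetical simplifications and not the declarations from this patch.

```cpp
#include <cassert>

// Mixin hierarchy: concrete value kinds derive from their type classes.
struct Number {};
struct Integer : Number {};
struct Int32 : Integer {};
struct Floating : Number {};
struct Float64 : Floating {};
struct Bool {};

// Expr must be polymorphic so that dynamic_cast can cross-cast to a mixin.
struct Expr {
  virtual ~Expr() = default;
};
struct Int32Scalar : Expr, Int32 {};
struct Float64Array : Expr, Float64 {};
struct BoolScalar : Expr, Bool {};

// True if the dynamic type of `e` also derives from mixin T.
template <typename T>
bool InheritsFrom(const Expr& e) {
  return dynamic_cast<const T*>(&e) != nullptr;
}

struct LogicalType {
  virtual ~LogicalType() = default;
  virtual bool IsInstance(const Expr& e) const = 0;
};

// A logical type, concrete or not, defined by the mixin it matches.
template <typename Mixin>
struct TypeClass : LogicalType {
  bool IsInstance(const Expr& e) const override {
    return InheritsFrom<Mixin>(e);
  }
};
```

With this layout, a non-concrete type such as `TypeClass<Number>` accepts both integer and floating expressions, while `TypeClass<Integer>` rejects the floating one, and neither accepts a boolean.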
    
    The general approach here is inspired by Ibis, a pure Python expression algebra system I designed in 2015:
    
    https://github.com/ibis-project/ibis/tree/master/ibis/expr
    
    With this system I was able to accurately model a superset of SQL and, with the help of some other open source developers, create compilers from the expression algebra to SQL or other backend execution.
    
    Author: Wes McKinney <we...@apache.org>
    Author: Krisztián Szűcs <sz...@gmail.com>
    
    Closes #3820 from wesm/compute-expr-prototyping and squashes the following commits:
    
    3a837351 <Wes McKinney> Code review comments, add some type convenience aliases
    1571b3b8 <Krisztián Szűcs> virtual destructor for LogicalType
    0142cfeb <Wes McKinney> Fix issues on Windows
    7a6b1ed8 <Wes McKinney> Smoke tests for example ops
    86283288 <Wes McKinney> More expr boxing scaffolding and logical type tests
    c00b6299 <Wes McKinney> Add some basic expression type factories
    a6651678 <Wes McKinney> Get some very basic unit tests passing
    03a3ed83 <Wes McKinney> Remove superfluous file
    a56b81cb <Wes McKinney> More scaffolding, a bit cleaner API, factory methods for expr types
    8eaadd93 <Wes McKinney> More boilerplate
    9859982e <Wes McKinney> Prototyping
    65dbdcbf <Wes McKinney> Prototyping
    77303f0d <Wes McKinney> Prototype
    86ec1521 <Wes McKinney> More prototyping / scaffolding
    8d424aae <Wes McKinney> Prototyping
    510bb7d5 <Wes McKinney> Prototyping
    c73f420b <Wes McKinney> Prototyping
    1fff4b81 <Wes McKinney> seed
---
 cpp/build-support/run_cpplint.py                   |   6 +
 cpp/src/arrow/CMakeLists.txt                       |   7 +-
 cpp/src/arrow/compute/CMakeLists.txt               |   2 +
 cpp/src/arrow/compute/expression-test.cc           | 151 ++++++++++
 cpp/src/arrow/compute/expression.cc                | 256 +++++++++++++++++
 cpp/src/arrow/compute/expression.h                 | 261 +++++++++++++++++
 cpp/src/arrow/compute/logical_type.cc              | 148 ++++++++++
 cpp/src/arrow/compute/logical_type.h               | 308 +++++++++++++++++++++
 cpp/src/arrow/compute/operation.cc                 |  31 +++
 cpp/src/arrow/compute/operation.h                  |  52 ++++
 cpp/src/arrow/compute/operations/cast.cc           |  53 ++++
 cpp/src/arrow/compute/operations/cast.h            |  46 +++
 cpp/src/arrow/compute/operations/literal.cc        |  40 +++
 cpp/src/arrow/compute/operations/literal.h         |  45 +++
 .../arrow/compute/operations/operations-test.cc    |  64 +++++
 cpp/src/arrow/compute/type_fwd.h                   |  38 +++
 cpp/src/arrow/memory_pool.h                        |   3 +-
 17 files changed, 1508 insertions(+), 3 deletions(-)

diff --git a/cpp/build-support/run_cpplint.py b/cpp/build-support/run_cpplint.py
index d291b61..9ee28af 100755
--- a/cpp/build-support/run_cpplint.py
+++ b/cpp/build-support/run_cpplint.py
@@ -26,8 +26,14 @@ import platform
 from functools import partial
 
 
+# NOTE(wesm):
+#
+# * readability/casting is disabled as it aggressively warns about functions
+#   with names like "int32", so "int32(x)", where int32 is a function name,
+#   warns with
 _filters = '''
 -whitespace/comments
+-readability/casting
 -readability/todo
 -build/header_guard
 -build/c++11
diff --git a/cpp/src/arrow/CMakeLists.txt b/cpp/src/arrow/CMakeLists.txt
index 2c3f00d..b541579 100644
--- a/cpp/src/arrow/CMakeLists.txt
+++ b/cpp/src/arrow/CMakeLists.txt
@@ -143,6 +143,9 @@ if(ARROW_COMPUTE)
   set(ARROW_SRCS
       ${ARROW_SRCS}
       compute/context.cc
+      compute/expression.cc
+      compute/logical_type.cc
+      compute/operation.cc
       compute/kernels/aggregate.cc
       compute/kernels/boolean.cc
       compute/kernels/cast.cc
@@ -150,7 +153,9 @@ if(ARROW_COMPUTE)
       compute/kernels/hash.cc
       compute/kernels/mean.cc
       compute/kernels/sum.cc
-      compute/kernels/util-internal.cc)
+      compute/kernels/util-internal.cc
+      compute/operations/cast.cc
+      compute/operations/literal.cc)
 endif()
 
 if(ARROW_CUDA)
diff --git a/cpp/src/arrow/compute/CMakeLists.txt b/cpp/src/arrow/compute/CMakeLists.txt
index 68ebf44..ae5a936 100644
--- a/cpp/src/arrow/compute/CMakeLists.txt
+++ b/cpp/src/arrow/compute/CMakeLists.txt
@@ -25,6 +25,8 @@ arrow_add_pkg_config("arrow-compute")
 #
 
 add_arrow_test(compute-test)
+add_arrow_test(expression-test PREFIX "arrow-compute")
+add_arrow_test(operations/operations-test PREFIX "arrow-compute")
 add_arrow_benchmark(compute-benchmark)
 
 add_subdirectory(kernels)
diff --git a/cpp/src/arrow/compute/expression-test.cc b/cpp/src/arrow/compute/expression-test.cc
new file mode 100644
index 0000000..31cfd5b
--- /dev/null
+++ b/cpp/src/arrow/compute/expression-test.cc
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include <gtest/gtest.h>
+
+#include "arrow/status.h"
+#include "arrow/table.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/type.h"
+
+#include "arrow/compute/expression.h"
+#include "arrow/compute/logical_type.h"
+#include "arrow/compute/operation.h"
+
+namespace arrow {
+namespace compute {
+
+// A placeholder operator implementation to use for testing various Expr
+// behavior
+class DummyOp : public Operation {
+ public:
+  Status ToExpr(ExprPtr* out) const override { return Status::NotImplemented("NYI"); }
+};
+
+TEST(TestLogicalType, NonNestedToString) {
+  std::vector<std::pair<LogicalTypePtr, std::string>> type_to_name = {
+      {type::any(), "Any"},
+      {type::null(), "Null"},
+      {type::boolean(), "Bool"},
+      {type::number(), "Number"},
+      {type::floating(), "Floating"},
+      {type::integer(), "Integer"},
+      {type::signed_integer(), "SignedInteger"},
+      {type::unsigned_integer(), "UnsignedInteger"},
+      {type::int8(), "Int8"},
+      {type::int16(), "Int16"},
+      {type::int32(), "Int32"},
+      {type::int64(), "Int64"},
+      {type::uint8(), "UInt8"},
+      {type::uint16(), "UInt16"},
+      {type::uint32(), "UInt32"},
+      {type::uint64(), "UInt64"},
+      {type::float16(), "Float16"},
+      {type::float32(), "Float32"},
+      {type::float64(), "Float64"},
+      {type::binary(), "Binary"},
+      {type::utf8(), "Utf8"}};
+
+  for (auto& entry : type_to_name) {
+    ASSERT_EQ(entry.second, entry.first->ToString());
+  }
+}
+
+class DummyExpr : public Expr {
+ public:
+  using Expr::Expr;
+  std::string kind() const override { return "dummy"; }
+};
+
+TEST(TestLogicalType, Any) {
+  auto op = std::make_shared<DummyOp>();
+  auto t = type::any();
+  ASSERT_TRUE(t->IsInstance(*scalar::int32(op)));
+  ASSERT_TRUE(t->IsInstance(*array::binary(op)));
+  ASSERT_FALSE(t->IsInstance(*std::make_shared<DummyExpr>(op)));
+}
+
+TEST(TestLogicalType, Number) {
+  auto op = std::make_shared<DummyOp>();
+  auto t = type::number();
+
+  ASSERT_TRUE(t->IsInstance(*scalar::int32(op)));
+  ASSERT_TRUE(t->IsInstance(*scalar::float64(op)));
+  ASSERT_FALSE(t->IsInstance(*scalar::boolean(op)));
+  ASSERT_FALSE(t->IsInstance(*scalar::null(op)));
+  ASSERT_FALSE(t->IsInstance(*scalar::binary(op)));
+}
+
+TEST(TestLogicalType, IntegerBaseTypes) {
+  auto op = std::make_shared<DummyOp>();
+  auto all_ty = type::integer();
+  auto signed_ty = type::signed_integer();
+  auto unsigned_ty = type::unsigned_integer();
+
+  ASSERT_TRUE(all_ty->IsInstance(*scalar::int32(op)));
+  ASSERT_TRUE(all_ty->IsInstance(*scalar::uint32(op)));
+  ASSERT_FALSE(all_ty->IsInstance(*array::float64(op)));
+  ASSERT_FALSE(all_ty->IsInstance(*array::binary(op)));
+
+  ASSERT_TRUE(signed_ty->IsInstance(*array::int32(op)));
+  ASSERT_FALSE(signed_ty->IsInstance(*scalar::uint32(op)));
+
+  ASSERT_TRUE(unsigned_ty->IsInstance(*scalar::uint32(op)));
+  ASSERT_TRUE(unsigned_ty->IsInstance(*array::uint32(op)));
+  ASSERT_FALSE(unsigned_ty->IsInstance(*array::int8(op)));
+}
+
+TEST(TestLogicalType, NumberConcreteIsinstance) {
+  auto op = std::make_shared<DummyOp>();
+
+  std::vector<LogicalTypePtr> types = {
+      type::null(),    type::boolean(), type::int8(),    type::int16(),  type::int32(),
+      type::int64(),   type::uint8(),   type::uint16(),  type::uint32(), type::uint64(),
+      type::float16(), type::float32(), type::float64(), type::binary(), type::utf8()};
+
+  std::vector<ExprPtr> exprs = {
+      scalar::null(op),    array::null(op),    scalar::boolean(op), array::boolean(op),
+      scalar::int8(op),    array::int8(op),    scalar::int16(op),   array::int16(op),
+      scalar::int32(op),   array::int32(op),   scalar::int64(op),   array::int64(op),
+      scalar::uint8(op),   array::uint8(op),   scalar::uint16(op),  array::uint16(op),
+      scalar::uint32(op),  array::uint32(op),  scalar::uint64(op),  array::uint64(op),
+      scalar::float16(op), array::float16(op), scalar::float32(op), array::float32(op),
+      scalar::float64(op), array::float64(op)};
+
+  for (auto ty : types) {
+    int num_matches = 0;
+    for (auto expr : exprs) {
+      const auto& v_expr = static_cast<const ValueExpr&>(*expr);
+      const bool ty_matches = v_expr.type()->id() == ty->id();
+      ASSERT_EQ(ty_matches, ty->IsInstance(v_expr))
+          << "Expr: " << expr->kind() << " Type: " << ty->ToString();
+      num_matches += ty_matches;
+    }
+    // Each logical type is represented twice in the list of exprs, once in
+    // array form, the other in scalar form
+    ASSERT_LE(num_matches, 2);
+  }
+}
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/expression.cc b/cpp/src/arrow/compute/expression.cc
new file mode 100644
index 0000000..6ca583c
--- /dev/null
+++ b/cpp/src/arrow/compute/expression.cc
@@ -0,0 +1,256 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/expression.h"
+
+#include <memory>
+#include <sstream>
+#include <utility>
+
+#include "arrow/compute/logical_type.h"
+#include "arrow/compute/operation.h"
+#include "arrow/status.h"
+
+namespace arrow {
+namespace compute {
+
+Expr::Expr(ConstOpPtr op) : op_(std::move(op)) {}
+
+ValueExpr::ValueExpr(ConstOpPtr op, LogicalTypePtr type)
+    : Expr(std::move(op)), type_(std::move(type)) {}
+
+LogicalTypePtr ValueExpr::type() const { return type_; }
+
+std::string ArrayExpr::kind() const {
+  std::stringstream ss;
+  ss << "array[" << type_->ToString() << "]";
+  return ss.str();
+}
+
+ValueRank ArrayExpr::rank() const { return ValueRank::ARRAY; }
+
+std::string ScalarExpr::kind() const {
+  std::stringstream ss;
+  ss << "scalar[" << type_->ToString() << "]";
+  return ss.str();
+}
+
+ValueRank ScalarExpr::rank() const { return ValueRank::SCALAR; }
+
+// ----------------------------------------------------------------------
+
+#define SIMPLE_EXPR_FACTORY(NAME, TYPE) \
+  ExprPtr NAME(ConstOpPtr op) { return std::make_shared<TYPE>(std::move(op)); }
+
+namespace scalar {
+
+#define SCALAR_EXPR_METHODS(NAME) \
+  NAME::NAME(ConstOpPtr op) : ScalarExpr(std::move(op), std::make_shared<type::NAME>()) {}
+
+SCALAR_EXPR_METHODS(Null)
+SCALAR_EXPR_METHODS(Bool)
+SCALAR_EXPR_METHODS(Int8)
+SCALAR_EXPR_METHODS(Int16)
+SCALAR_EXPR_METHODS(Int32)
+SCALAR_EXPR_METHODS(Int64)
+SCALAR_EXPR_METHODS(UInt8)
+SCALAR_EXPR_METHODS(UInt16)
+SCALAR_EXPR_METHODS(UInt32)
+SCALAR_EXPR_METHODS(UInt64)
+SCALAR_EXPR_METHODS(Float16)
+SCALAR_EXPR_METHODS(Float32)
+SCALAR_EXPR_METHODS(Float64)
+SCALAR_EXPR_METHODS(Binary)
+SCALAR_EXPR_METHODS(Utf8)
+
+SIMPLE_EXPR_FACTORY(null, Null);
+SIMPLE_EXPR_FACTORY(boolean, Bool);
+SIMPLE_EXPR_FACTORY(int8, Int8);
+SIMPLE_EXPR_FACTORY(int16, Int16);
+SIMPLE_EXPR_FACTORY(int32, Int32);
+SIMPLE_EXPR_FACTORY(int64, Int64);
+SIMPLE_EXPR_FACTORY(uint8, UInt8);
+SIMPLE_EXPR_FACTORY(uint16, UInt16);
+SIMPLE_EXPR_FACTORY(uint32, UInt32);
+SIMPLE_EXPR_FACTORY(uint64, UInt64);
+SIMPLE_EXPR_FACTORY(float16, Float16);
+SIMPLE_EXPR_FACTORY(float32, Float32);
+SIMPLE_EXPR_FACTORY(float64, Float64);
+SIMPLE_EXPR_FACTORY(binary, Binary);
+SIMPLE_EXPR_FACTORY(utf8, Utf8);
+
+List::List(ConstOpPtr op, LogicalTypePtr type)
+    : ScalarExpr(std::move(op), std::move(type)) {}
+
+Struct::Struct(ConstOpPtr op, LogicalTypePtr type)
+    : ScalarExpr(std::move(op), std::move(type)) {}
+
+}  // namespace scalar
+
+namespace array {
+
+#define ARRAY_EXPR_METHODS(NAME) \
+  NAME::NAME(ConstOpPtr op) : ArrayExpr(std::move(op), std::make_shared<type::NAME>()) {}
+
+ARRAY_EXPR_METHODS(Null)
+ARRAY_EXPR_METHODS(Bool)
+ARRAY_EXPR_METHODS(Int8)
+ARRAY_EXPR_METHODS(Int16)
+ARRAY_EXPR_METHODS(Int32)
+ARRAY_EXPR_METHODS(Int64)
+ARRAY_EXPR_METHODS(UInt8)
+ARRAY_EXPR_METHODS(UInt16)
+ARRAY_EXPR_METHODS(UInt32)
+ARRAY_EXPR_METHODS(UInt64)
+ARRAY_EXPR_METHODS(Float16)
+ARRAY_EXPR_METHODS(Float32)
+ARRAY_EXPR_METHODS(Float64)
+ARRAY_EXPR_METHODS(Binary)
+ARRAY_EXPR_METHODS(Utf8)
+
+SIMPLE_EXPR_FACTORY(null, Null);
+SIMPLE_EXPR_FACTORY(boolean, Bool);
+SIMPLE_EXPR_FACTORY(int8, Int8);
+SIMPLE_EXPR_FACTORY(int16, Int16);
+SIMPLE_EXPR_FACTORY(int32, Int32);
+SIMPLE_EXPR_FACTORY(int64, Int64);
+SIMPLE_EXPR_FACTORY(uint8, UInt8);
+SIMPLE_EXPR_FACTORY(uint16, UInt16);
+SIMPLE_EXPR_FACTORY(uint32, UInt32);
+SIMPLE_EXPR_FACTORY(uint64, UInt64);
+SIMPLE_EXPR_FACTORY(float16, Float16);
+SIMPLE_EXPR_FACTORY(float32, Float32);
+SIMPLE_EXPR_FACTORY(float64, Float64);
+SIMPLE_EXPR_FACTORY(binary, Binary);
+SIMPLE_EXPR_FACTORY(utf8, Utf8);
+
+List::List(ConstOpPtr op, LogicalTypePtr type)
+    : ArrayExpr(std::move(op), std::move(type)) {}
+
+Struct::Struct(ConstOpPtr op, LogicalTypePtr type)
+    : ArrayExpr(std::move(op), std::move(type)) {}
+
+}  // namespace array
+
+Status GetScalarExpr(ConstOpPtr op, LogicalTypePtr ty, ExprPtr* out) {
+  switch (ty->id()) {
+    case LogicalType::NULL_:
+      *out = scalar::null(op);
+      break;
+    case LogicalType::BOOL:
+      *out = scalar::boolean(op);
+      break;
+    case LogicalType::UINT8:
+      *out = scalar::uint8(op);
+      break;
+    case LogicalType::INT8:
+      *out = scalar::int8(op);
+      break;
+    case LogicalType::UINT16:
+      *out = scalar::uint16(op);
+      break;
+    case LogicalType::INT16:
+      *out = scalar::int16(op);
+      break;
+    case LogicalType::UINT32:
+      *out = scalar::uint32(op);
+      break;
+    case LogicalType::INT32:
+      *out = scalar::int32(op);
+      break;
+    case LogicalType::UINT64:
+      *out = scalar::uint64(op);
+      break;
+    case LogicalType::INT64:
+      *out = scalar::int64(op);
+      break;
+    case LogicalType::FLOAT16:
+      *out = scalar::float16(op);
+      break;
+    case LogicalType::FLOAT32:
+      *out = scalar::float32(op);
+      break;
+    case LogicalType::FLOAT64:
+      *out = scalar::float64(op);
+      break;
+    case LogicalType::UTF8:
+      *out = scalar::utf8(op);
+      break;
+    case LogicalType::BINARY:
+      *out = scalar::binary(op);
+      break;
+    default:
+      return Status::NotImplemented("Scalar expr for ", ty->ToString());
+  }
+  return Status::OK();
+}
+
+Status GetArrayExpr(ConstOpPtr op, LogicalTypePtr ty, ExprPtr* out) {
+  switch (ty->id()) {
+    case LogicalType::NULL_:
+      *out = array::null(op);
+      break;
+    case LogicalType::BOOL:
+      *out = array::boolean(op);
+      break;
+    case LogicalType::UINT8:
+      *out = array::uint8(op);
+      break;
+    case LogicalType::INT8:
+      *out = array::int8(op);
+      break;
+    case LogicalType::UINT16:
+      *out = array::uint16(op);
+      break;
+    case LogicalType::INT16:
+      *out = array::int16(op);
+      break;
+    case LogicalType::UINT32:
+      *out = array::uint32(op);
+      break;
+    case LogicalType::INT32:
+      *out = array::int32(op);
+      break;
+    case LogicalType::UINT64:
+      *out = array::uint64(op);
+      break;
+    case LogicalType::INT64:
+      *out = array::int64(op);
+      break;
+    case LogicalType::FLOAT16:
+      *out = array::float16(op);
+      break;
+    case LogicalType::FLOAT32:
+      *out = array::float32(op);
+      break;
+    case LogicalType::FLOAT64:
+      *out = array::float64(op);
+      break;
+    case LogicalType::UTF8:
+      *out = array::utf8(op);
+      break;
+    case LogicalType::BINARY:
+      *out = array::binary(op);
+      break;
+    default:
+      return Status::NotImplemented("Array expr for ", ty->ToString());
+  }
+  return Status::OK();
+}
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/expression.h b/cpp/src/arrow/compute/expression.h
new file mode 100644
index 0000000..cc55814
--- /dev/null
+++ b/cpp/src/arrow/compute/expression.h
@@ -0,0 +1,261 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+#include <string>
+
+#include "arrow/compute/type_fwd.h"
+#include "arrow/status.h"
+#include "arrow/util/macros.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+namespace compute {
+
+class LogicalType;
+class ExprVisitor;
+class Operation;
+
+/// \brief Base class for all analytic expressions. Expressions may represent
+/// data values (scalars, arrays, tables)
+class ARROW_EXPORT Expr {
+ public:
+  /// \brief Instantiate expression from an abstract operation
+  /// \param[in] op the operation that generates the expression
+  explicit Expr(ConstOpPtr op);
+
+  virtual ~Expr() = default;
+
+  /// \brief A unique string identifier for the kind of expression
+  virtual std::string kind() const = 0;
+
+  /// \brief Accept expression visitor
+  /// TODO(wesm)
+  // virtual Status Accept(ExprVisitor* visitor) const = 0;
+
+  /// \brief The underlying operation
+  ConstOpPtr op() const { return op_; }
+
+ protected:
+  ConstOpPtr op_;
+};
+
+/// The value cardinality: one or many. These correspond to the arrow::Scalar
+/// and arrow::Array types
+enum class ValueRank { SCALAR, ARRAY };
+
+/// \brief Base class for a data-generated expression with a fixed and known
+/// type. This includes arrays and scalars
+class ARROW_EXPORT ValueExpr : public Expr {
+ public:
+  /// \brief The name of the expression, if any. The default is unnamed
+  // virtual const ExprName& name() const;
+  LogicalTypePtr type() const;
+
+  /// \brief The value cardinality (scalar or array) of the expression
+  virtual ValueRank rank() const = 0;
+
+ protected:
+  ValueExpr(ConstOpPtr op, LogicalTypePtr type);
+
+  /// \brief The semantic data type of the expression
+  LogicalTypePtr type_;
+};
+
+class ARROW_EXPORT ArrayExpr : public ValueExpr {
+ protected:
+  using ValueExpr::ValueExpr;
+  std::string kind() const override;
+  ValueRank rank() const override;
+};
+
+class ARROW_EXPORT ScalarExpr : public ValueExpr {
+ protected:
+  using ValueExpr::ValueExpr;
+  std::string kind() const override;
+  ValueRank rank() const override;
+};
+
+namespace value {
+
+// These are mixin classes that provide a type hierarchy for identifying values
+class ValueMixin {};
+class Null : public ValueMixin {};
+class Bool : public ValueMixin {};
+class Number : public ValueMixin {};
+class Integer : public Number {};
+class SignedInteger : public Integer {};
+class Int8 : public SignedInteger {};
+class Int16 : public SignedInteger {};
+class Int32 : public SignedInteger {};
+class Int64 : public SignedInteger {};
+class UnsignedInteger : public Integer {};
+class UInt8 : public UnsignedInteger {};
+class UInt16 : public UnsignedInteger {};
+class UInt32 : public UnsignedInteger {};
+class UInt64 : public UnsignedInteger {};
+class Floating : public Number {};
+class Float16 : public Floating {};
+class Float32 : public Floating {};
+class Float64 : public Floating {};
+class Binary : public ValueMixin {};
+class Utf8 : public Binary {};
+class List : public ValueMixin {};
+class Struct : public ValueMixin {};
+
+}  // namespace value
+
+#define SIMPLE_EXPR_FACTORY(NAME) ARROW_EXPORT ExprPtr NAME(ConstOpPtr op);
+
+namespace scalar {
+
+#define DECLARE_SCALAR_EXPR(TYPE)                                   \
+  class ARROW_EXPORT TYPE : public ScalarExpr, public value::TYPE { \
+   public:                                                          \
+    explicit TYPE(ConstOpPtr op);                                   \
+    using ScalarExpr::kind;                                         \
+  };
+
+DECLARE_SCALAR_EXPR(Null)
+DECLARE_SCALAR_EXPR(Bool)
+DECLARE_SCALAR_EXPR(Int8)
+DECLARE_SCALAR_EXPR(Int16)
+DECLARE_SCALAR_EXPR(Int32)
+DECLARE_SCALAR_EXPR(Int64)
+DECLARE_SCALAR_EXPR(UInt8)
+DECLARE_SCALAR_EXPR(UInt16)
+DECLARE_SCALAR_EXPR(UInt32)
+DECLARE_SCALAR_EXPR(UInt64)
+DECLARE_SCALAR_EXPR(Float16)
+DECLARE_SCALAR_EXPR(Float32)
+DECLARE_SCALAR_EXPR(Float64)
+DECLARE_SCALAR_EXPR(Binary)
+DECLARE_SCALAR_EXPR(Utf8)
+
+#undef DECLARE_SCALAR_EXPR
+
+SIMPLE_EXPR_FACTORY(null);
+SIMPLE_EXPR_FACTORY(boolean);
+SIMPLE_EXPR_FACTORY(int8);
+SIMPLE_EXPR_FACTORY(int16);
+SIMPLE_EXPR_FACTORY(int32);
+SIMPLE_EXPR_FACTORY(int64);
+SIMPLE_EXPR_FACTORY(uint8);
+SIMPLE_EXPR_FACTORY(uint16);
+SIMPLE_EXPR_FACTORY(uint32);
+SIMPLE_EXPR_FACTORY(uint64);
+SIMPLE_EXPR_FACTORY(float16);
+SIMPLE_EXPR_FACTORY(float32);
+SIMPLE_EXPR_FACTORY(float64);
+SIMPLE_EXPR_FACTORY(binary);
+SIMPLE_EXPR_FACTORY(utf8);
+
+class ARROW_EXPORT List : public ScalarExpr, public value::List {
+ public:
+  List(ConstOpPtr op, LogicalTypePtr type);
+  using ScalarExpr::kind;
+};
+
+class ARROW_EXPORT Struct : public ScalarExpr, public value::Struct {
+ public:
+  Struct(ConstOpPtr op, LogicalTypePtr type);
+  using ScalarExpr::kind;
+};
+
+}  // namespace scalar
+
+namespace array {
+
+#define DECLARE_ARRAY_EXPR(TYPE)                                   \
+  class ARROW_EXPORT TYPE : public ArrayExpr, public value::TYPE { \
+   public:                                                         \
+    explicit TYPE(ConstOpPtr op);                                  \
+    using ArrayExpr::kind;                                         \
+  };
+
+DECLARE_ARRAY_EXPR(Null)
+DECLARE_ARRAY_EXPR(Bool)
+DECLARE_ARRAY_EXPR(Int8)
+DECLARE_ARRAY_EXPR(Int16)
+DECLARE_ARRAY_EXPR(Int32)
+DECLARE_ARRAY_EXPR(Int64)
+DECLARE_ARRAY_EXPR(UInt8)
+DECLARE_ARRAY_EXPR(UInt16)
+DECLARE_ARRAY_EXPR(UInt32)
+DECLARE_ARRAY_EXPR(UInt64)
+DECLARE_ARRAY_EXPR(Float16)
+DECLARE_ARRAY_EXPR(Float32)
+DECLARE_ARRAY_EXPR(Float64)
+DECLARE_ARRAY_EXPR(Binary)
+DECLARE_ARRAY_EXPR(Utf8)
+
+#undef DECLARE_ARRAY_EXPR
+
+SIMPLE_EXPR_FACTORY(null);
+SIMPLE_EXPR_FACTORY(boolean);
+SIMPLE_EXPR_FACTORY(int8);
+SIMPLE_EXPR_FACTORY(int16);
+SIMPLE_EXPR_FACTORY(int32);
+SIMPLE_EXPR_FACTORY(int64);
+SIMPLE_EXPR_FACTORY(uint8);
+SIMPLE_EXPR_FACTORY(uint16);
+SIMPLE_EXPR_FACTORY(uint32);
+SIMPLE_EXPR_FACTORY(uint64);
+SIMPLE_EXPR_FACTORY(float16);
+SIMPLE_EXPR_FACTORY(float32);
+SIMPLE_EXPR_FACTORY(float64);
+SIMPLE_EXPR_FACTORY(binary);
+SIMPLE_EXPR_FACTORY(utf8);
+
+class ARROW_EXPORT List : public ArrayExpr, public value::List {
+ public:
+  List(ConstOpPtr op, LogicalTypePtr type);
+  using ArrayExpr::kind;
+};
+
+class ARROW_EXPORT Struct : public ArrayExpr, public value::Struct {
+ public:
+  Struct(ConstOpPtr op, LogicalTypePtr type);
+  using ArrayExpr::kind;
+};
+
+}  // namespace array
+
+#undef SIMPLE_EXPR_FACTORY
+
+template <typename T, typename ObjectType>
+inline bool InheritsFrom(const ObjectType* obj) {
+  return dynamic_cast<const T*>(obj) != NULLPTR;
+}
+
+template <typename T, typename ObjectType>
+inline bool InheritsFrom(const ObjectType& obj) {
+  return dynamic_cast<const T*>(&obj) != NULLPTR;
+}
+
+/// \brief Construct a ScalarExpr containing an Operation given a logical type
+ARROW_EXPORT
+Status GetScalarExpr(ConstOpPtr op, LogicalTypePtr ty, ExprPtr* out);
+
+/// \brief Construct an ArrayExpr containing an Operation given a logical type
+ARROW_EXPORT
+Status GetArrayExpr(ConstOpPtr op, LogicalTypePtr ty, ExprPtr* out);
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/logical_type.cc b/cpp/src/arrow/compute/logical_type.cc
new file mode 100644
index 0000000..5563393
--- /dev/null
+++ b/cpp/src/arrow/compute/logical_type.cc
@@ -0,0 +1,148 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// Metadata objects for creating well-typed expressions. These are distinct
+// from (and higher level than) arrow::DataType as some type parameters (like
+// decimal scale and precision) may not be known at expression build time, and
+// these are resolved later on evaluation
+
+#include "arrow/compute/logical_type.h"
+
+#include <string>
+
+#include "arrow/compute/expression.h"
+#include "arrow/status.h"
+#include "arrow/type.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+namespace compute {
+
+Status LogicalType::FromArrow(const ::arrow::DataType& type, LogicalTypePtr* out) {
+  switch (type.id()) {
+    case Type::NA:
+      *out = type::null();
+      break;
+    case Type::BOOL:
+      *out = type::boolean();
+      break;
+    case Type::UINT8:
+      *out = type::uint8();
+      break;
+    case Type::INT8:
+      *out = type::int8();
+      break;
+    case Type::UINT16:
+      *out = type::uint16();
+      break;
+    case Type::INT16:
+      *out = type::int16();
+      break;
+    case Type::UINT32:
+      *out = type::uint32();
+      break;
+    case Type::INT32:
+      *out = type::int32();
+      break;
+    case Type::UINT64:
+      *out = type::uint64();
+      break;
+    case Type::INT64:
+      *out = type::int64();
+      break;
+    case Type::HALF_FLOAT:
+      *out = type::float16();
+      break;
+    case Type::FLOAT:
+      *out = type::float32();
+      break;
+    case Type::DOUBLE:
+      *out = type::float64();
+      break;
+    case Type::STRING:
+      *out = type::utf8();
+      break;
+    case Type::BINARY:
+      *out = type::binary();
+      break;
+    default:
+      return Status::NotImplemented("Logical expr for ", type.ToString());
+  }
+  return Status::OK();
+}
+
+namespace type {
+
+bool Any::IsInstance(const Expr& expr) const { return InheritsFrom<ValueExpr>(expr); }
+
+std::string Any::ToString() const { return "Any"; }
+
+#define SIMPLE_LOGICAL_TYPE(NAME)                 \
+  bool NAME::IsInstance(const Expr& expr) const { \
+    return InheritsFrom<value::NAME>(expr);       \
+  }                                               \
+  std::string NAME::ToString() const { return "" #NAME; }
+
+SIMPLE_LOGICAL_TYPE(Null)
+SIMPLE_LOGICAL_TYPE(Bool)
+SIMPLE_LOGICAL_TYPE(Number)
+SIMPLE_LOGICAL_TYPE(Integer)
+SIMPLE_LOGICAL_TYPE(Floating)
+SIMPLE_LOGICAL_TYPE(SignedInteger)
+SIMPLE_LOGICAL_TYPE(UnsignedInteger)
+SIMPLE_LOGICAL_TYPE(Int8)
+SIMPLE_LOGICAL_TYPE(Int16)
+SIMPLE_LOGICAL_TYPE(Int32)
+SIMPLE_LOGICAL_TYPE(Int64)
+SIMPLE_LOGICAL_TYPE(UInt8)
+SIMPLE_LOGICAL_TYPE(UInt16)
+SIMPLE_LOGICAL_TYPE(UInt32)
+SIMPLE_LOGICAL_TYPE(UInt64)
+SIMPLE_LOGICAL_TYPE(Float16)
+SIMPLE_LOGICAL_TYPE(Float32)
+SIMPLE_LOGICAL_TYPE(Float64)
+SIMPLE_LOGICAL_TYPE(Binary)
+SIMPLE_LOGICAL_TYPE(Utf8)
+
+#define SIMPLE_TYPE_FACTORY(NAME, TYPE) \
+  LogicalTypePtr NAME() { return std::make_shared<TYPE>(); }
+
+SIMPLE_TYPE_FACTORY(any, Any);
+SIMPLE_TYPE_FACTORY(null, Null);
+SIMPLE_TYPE_FACTORY(boolean, Bool);
+SIMPLE_TYPE_FACTORY(number, Number);
+SIMPLE_TYPE_FACTORY(integer, Integer);
+SIMPLE_TYPE_FACTORY(signed_integer, SignedInteger);
+SIMPLE_TYPE_FACTORY(unsigned_integer, UnsignedInteger);
+SIMPLE_TYPE_FACTORY(floating, Floating);
+SIMPLE_TYPE_FACTORY(int8, Int8);
+SIMPLE_TYPE_FACTORY(int16, Int16);
+SIMPLE_TYPE_FACTORY(int32, Int32);
+SIMPLE_TYPE_FACTORY(int64, Int64);
+SIMPLE_TYPE_FACTORY(uint8, UInt8);
+SIMPLE_TYPE_FACTORY(uint16, UInt16);
+SIMPLE_TYPE_FACTORY(uint32, UInt32);
+SIMPLE_TYPE_FACTORY(uint64, UInt64);
+SIMPLE_TYPE_FACTORY(float16, Float16);
+SIMPLE_TYPE_FACTORY(float32, Float32);
+SIMPLE_TYPE_FACTORY(float64, Float64);
+SIMPLE_TYPE_FACTORY(binary, Binary);
+SIMPLE_TYPE_FACTORY(utf8, Utf8);
+
+}  // namespace type
+}  // namespace compute
+}  // namespace arrow
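
The `SIMPLE_LOGICAL_TYPE` macro above reduces each `IsInstance` check to a single `InheritsFrom<value::NAME>(expr)` call. `InheritsFrom` is defined elsewhere in the patch, so the following is only a minimal standalone sketch that assumes it behaves like a `dynamic_cast` test. The `Expr`, `ValueExpr` and `value::*` classes here are simplified stand-ins, not the ones from arrow/compute/expression.h, and the macro is condensed to define each class in one step.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the expression hierarchy (assumption: the real
// classes in arrow/compute/expression.h carry more state than this).
struct Expr {
  virtual ~Expr() = default;
};
struct ValueExpr : Expr {};

namespace value {
struct Number : ValueExpr {};
struct Integer : Number {};
struct Int32 : Integer {};
struct Utf8 : ValueExpr {};
}  // namespace value

// Assumed behaviour of InheritsFrom<T>: true when the dynamic type of expr
// is T or a class derived from T.
template <typename T>
bool InheritsFrom(const Expr& expr) {
  return dynamic_cast<const T*>(&expr) != nullptr;
}

struct LogicalType {
  virtual ~LogicalType() = default;
  virtual bool IsInstance(const Expr& expr) const = 0;
  virtual std::string ToString() const = 0;
};

namespace type {

// Condensed variant of the macro: declares and defines each type in one go.
// #NAME stringifies the macro argument, so ToString() returns the class name.
#define SIMPLE_LOGICAL_TYPE(NAME)                           \
  struct NAME : LogicalType {                               \
    bool IsInstance(const Expr& expr) const override {      \
      return InheritsFrom<value::NAME>(expr);               \
    }                                                       \
    std::string ToString() const override { return #NAME; } \
  };

SIMPLE_LOGICAL_TYPE(Number)
SIMPLE_LOGICAL_TYPE(Integer)
SIMPLE_LOGICAL_TYPE(Int32)
SIMPLE_LOGICAL_TYPE(Utf8)

#undef SIMPLE_LOGICAL_TYPE

}  // namespace type

void Example() {
  value::Int32 int_expr;
  value::Utf8 str_expr;
  // An int32 value expression satisfies every type class above it.
  assert(type::Number().IsInstance(int_expr));
  assert(type::Integer().IsInstance(int_expr));
  assert(type::Int32().IsInstance(int_expr));
  // A string expression is a value, but not a number.
  assert(!type::Number().IsInstance(str_expr));
  assert(type::Utf8().ToString() == "Utf8");
}
```

Because the check follows the C++ inheritance chain of the expression classes, a type class such as `Number` accepts any concrete numeric expression without listing the concrete types explicitly.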
diff --git a/cpp/src/arrow/compute/logical_type.h b/cpp/src/arrow/compute/logical_type.h
new file mode 100644
index 0000000..7acbeef
--- /dev/null
+++ b/cpp/src/arrow/compute/logical_type.h
@@ -0,0 +1,308 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// Metadata objects for creating well-typed expressions. These are distinct
+// from (and higher level than) arrow::DataType because some type parameters
+// (like decimal scale and precision) may not be known at expression build
+// time; they are resolved later, at evaluation time
+
+#pragma once
+
+#include <memory>
+#include <string>
+
+#include "arrow/compute/type_fwd.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+
+class Status;
+
+namespace compute {
+
+class Expr;
+
+/// \brief An object that represents either a single concrete value type or a
+/// group of related types, to help with expression type validation and other
+/// purposes
+class ARROW_EXPORT LogicalType {
+ public:
+  enum Id {
+    ANY,
+    NUMBER,
+    INTEGER,
+    SIGNED_INTEGER,
+    UNSIGNED_INTEGER,
+    FLOATING,
+    NULL_,
+    BOOL,
+    UINT8,
+    INT8,
+    UINT16,
+    INT16,
+    UINT32,
+    INT32,
+    UINT64,
+    INT64,
+    FLOAT16,
+    FLOAT32,
+    FLOAT64,
+    BINARY,
+    UTF8,
+    DATE,
+    TIME,
+    TIMESTAMP,
+    DECIMAL,
+    LIST,
+    STRUCT
+  };
+
+  Id id() const { return id_; }
+
+  virtual ~LogicalType() = default;
+
+  virtual std::string ToString() const = 0;
+
+  /// \brief Check if expression is an instance of this type class
+  virtual bool IsInstance(const Expr& expr) const = 0;
+
+  /// \brief Get a logical expression type from a concrete Arrow in-memory
+  /// array type
+  static Status FromArrow(const ::arrow::DataType& type, LogicalTypePtr* out);
+
+ protected:
+  explicit LogicalType(Id id) : id_(id) {}
+  Id id_;
+};
+
+namespace type {
+
+/// \brief Logical type for any value type
+class ARROW_EXPORT Any : public LogicalType {
+ public:
+  Any() : LogicalType(LogicalType::ANY) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for null
+class ARROW_EXPORT Null : public LogicalType {
+ public:
+  Null() : LogicalType(LogicalType::NULL_) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for concrete boolean
+class ARROW_EXPORT Bool : public LogicalType {
+ public:
+  Bool() : LogicalType(LogicalType::BOOL) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for any number (integer or floating point)
+class ARROW_EXPORT Number : public LogicalType {
+ public:
+  Number() : Number(LogicalType::NUMBER) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit Number(Id type_id) : LogicalType(type_id) {}
+};
+
+/// \brief Logical type for any integer
+class ARROW_EXPORT Integer : public Number {
+ public:
+  Integer() : Integer(LogicalType::INTEGER) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit Integer(Id type_id) : Number(type_id) {}
+};
+
+/// \brief Logical type for any floating point number
+class ARROW_EXPORT Floating : public Number {
+ public:
+  Floating() : Floating(LogicalType::FLOATING) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit Floating(Id type_id) : Number(type_id) {}
+};
+
+/// \brief Logical type for any signed integer
+class ARROW_EXPORT SignedInteger : public Integer {
+ public:
+  SignedInteger() : SignedInteger(LogicalType::SIGNED_INTEGER) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit SignedInteger(Id type_id) : Integer(type_id) {}
+};
+
+/// \brief Logical type for any unsigned integer
+class ARROW_EXPORT UnsignedInteger : public Integer {
+ public:
+  UnsignedInteger() : UnsignedInteger(LogicalType::UNSIGNED_INTEGER) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit UnsignedInteger(Id type_id) : Integer(type_id) {}
+};
+
+/// \brief Logical type for int8
+class ARROW_EXPORT Int8 : public SignedInteger {
+ public:
+  Int8() : SignedInteger(LogicalType::INT8) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for int16
+class ARROW_EXPORT Int16 : public SignedInteger {
+ public:
+  Int16() : SignedInteger(LogicalType::INT16) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for int32
+class ARROW_EXPORT Int32 : public SignedInteger {
+ public:
+  Int32() : SignedInteger(LogicalType::INT32) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for int64
+class ARROW_EXPORT Int64 : public SignedInteger {
+ public:
+  Int64() : SignedInteger(LogicalType::INT64) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for uint8
+class ARROW_EXPORT UInt8 : public UnsignedInteger {
+ public:
+  UInt8() : UnsignedInteger(LogicalType::UINT8) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for uint16
+class ARROW_EXPORT UInt16 : public UnsignedInteger {
+ public:
+  UInt16() : UnsignedInteger(LogicalType::UINT16) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for uint32
+class ARROW_EXPORT UInt32 : public UnsignedInteger {
+ public:
+  UInt32() : UnsignedInteger(LogicalType::UINT32) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for uint64
+class ARROW_EXPORT UInt64 : public UnsignedInteger {
+ public:
+  UInt64() : UnsignedInteger(LogicalType::UINT64) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for 16-bit floating point
+class ARROW_EXPORT Float16 : public Floating {
+ public:
+  Float16() : Floating(LogicalType::FLOAT16) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for 32-bit floating point
+class ARROW_EXPORT Float32 : public Floating {
+ public:
+  Float32() : Floating(LogicalType::FLOAT32) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for 64-bit floating point
+class ARROW_EXPORT Float64 : public Floating {
+ public:
+  Float64() : Floating(LogicalType::FLOAT64) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+/// \brief Logical type for variable-size binary
+class ARROW_EXPORT Binary : public LogicalType {
+ public:
+  Binary() : Binary(LogicalType::BINARY) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+
+ protected:
+  explicit Binary(Id type_id) : LogicalType(type_id) {}
+};
+
+/// \brief Logical type for variable-size UTF-8 strings
+class ARROW_EXPORT Utf8 : public Binary {
+ public:
+  Utf8() : Binary(LogicalType::UTF8) {}
+  bool IsInstance(const Expr& expr) const override;
+  std::string ToString() const override;
+};
+
+#define SIMPLE_TYPE_FACTORY(NAME) ARROW_EXPORT LogicalTypePtr NAME();
+
+SIMPLE_TYPE_FACTORY(any);
+SIMPLE_TYPE_FACTORY(null);
+SIMPLE_TYPE_FACTORY(boolean);
+SIMPLE_TYPE_FACTORY(number);
+SIMPLE_TYPE_FACTORY(integer);
+SIMPLE_TYPE_FACTORY(signed_integer);
+SIMPLE_TYPE_FACTORY(unsigned_integer);
+SIMPLE_TYPE_FACTORY(floating);
+SIMPLE_TYPE_FACTORY(int8);
+SIMPLE_TYPE_FACTORY(int16);
+SIMPLE_TYPE_FACTORY(int32);
+SIMPLE_TYPE_FACTORY(int64);
+SIMPLE_TYPE_FACTORY(uint8);
+SIMPLE_TYPE_FACTORY(uint16);
+SIMPLE_TYPE_FACTORY(uint32);
+SIMPLE_TYPE_FACTORY(uint64);
+SIMPLE_TYPE_FACTORY(float16);
+SIMPLE_TYPE_FACTORY(float32);
+SIMPLE_TYPE_FACTORY(float64);
+SIMPLE_TYPE_FACTORY(binary);
+SIMPLE_TYPE_FACTORY(utf8);
+
+#undef SIMPLE_TYPE_FACTORY
+
+}  // namespace type
+}  // namespace compute
+}  // namespace arrow
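
The header above gives each type class a public default constructor that delegates to a protected constructor taking an `Id`, so that a concrete subclass such as `Int8` can pass its own id up the chain. The sketch below reproduces that pattern in isolation with a reduced enum. The `IsA` helper is hypothetical and not part of the patch; it only illustrates that the class hierarchy itself encodes the "int8 is a signed integer is a number" relationship.

```cpp
#include <cassert>

class LogicalType {
 public:
  enum Id { NUMBER, INTEGER, SIGNED_INTEGER, INT8 };
  virtual ~LogicalType() = default;
  Id id() const { return id_; }

 protected:
  explicit LogicalType(Id id) : id_(id) {}
  Id id_;
};

class Number : public LogicalType {
 public:
  // Public constructor: the generic "any number" type class.
  Number() : Number(LogicalType::NUMBER) {}

 protected:
  // Protected constructor: lets subclasses supply a more specific id.
  explicit Number(Id id) : LogicalType(id) {}
};

class Integer : public Number {
 public:
  Integer() : Integer(LogicalType::INTEGER) {}

 protected:
  explicit Integer(Id id) : Number(id) {}
};

class SignedInteger : public Integer {
 public:
  SignedInteger() : SignedInteger(LogicalType::SIGNED_INTEGER) {}

 protected:
  explicit SignedInteger(Id id) : Integer(id) {}
};

class Int8 : public SignedInteger {
 public:
  Int8() : SignedInteger(LogicalType::INT8) {}
};

// Hypothetical helper (not in the patch): checks whether a type object
// belongs to the type class T.
template <typename T>
bool IsA(const LogicalType& ty) {
  return dynamic_cast<const T*>(&ty) != nullptr;
}

void Example() {
  Int8 i8;
  assert(i8.id() == LogicalType::INT8);  // the most specific id wins
  assert(IsA<Number>(i8));               // but it is still a Number
  Integer any_integer;
  assert(any_integer.id() == LogicalType::INTEGER);
  assert(!IsA<SignedInteger>(any_integer));
}
```

Keeping the id-taking constructors protected means callers can only create the ids that each class is meant to represent, while `id()` still permits a cheap `switch` where virtual dispatch is not wanted.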
diff --git a/cpp/src/arrow/compute/operation.cc b/cpp/src/arrow/compute/operation.cc
new file mode 100644
index 0000000..6458afb
--- /dev/null
+++ b/cpp/src/arrow/compute/operation.cc
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/operation.h"
+
+#include <memory>
+#include <vector>
+
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+namespace compute {
+
+std::vector<ExprPtr> Operation::input_args() const { return {}; }
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/operation.h b/cpp/src/arrow/compute/operation.h
new file mode 100644
index 0000000..c06f8c3
--- /dev/null
+++ b/cpp/src/arrow/compute/operation.h
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+#include <vector>
+
+#include "arrow/compute/type_fwd.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+
+class Status;
+
+namespace compute {
+
+/// \brief An operation is a node in a computation graph, taking input data
+/// expression dependencies and emitting an output expression
+class ARROW_EXPORT Operation : public std::enable_shared_from_this<Operation> {
+ public:
+  virtual ~Operation() = default;
+
+  /// \brief Check the input expression arguments and produce the well-typed
+  /// output expression for this operation. If the input arguments are
+  /// invalid, an error Status is returned
+  /// \param[out] out the returned well-typed expression
+  /// \return success or failure
+  virtual Status ToExpr(ExprPtr* out) const = 0;
+
+  /// \brief Return the input expressions used to instantiate the
+  /// operation. The default implementation returns an empty vector
+  /// \return a vector of expressions
+  virtual std::vector<ExprPtr> input_args() const;
+};
+
+}  // namespace compute
+}  // namespace arrow
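
`Operation` derives from `std::enable_shared_from_this` so that `ToExpr` can hand the resulting expression a shared reference to the node that produced it. The following standalone sketch shows that relationship under simplifying assumptions: `Status` and `Expr` are reduced stand-ins for the Arrow classes, and `Negate` is a hypothetical operation that does not exist in the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Reduced stand-in for arrow::Status (assumption: only ok/invalid matter here).
struct Status {
  bool ok_flag;
  std::string message;
  static Status OK() { return {true, ""}; }
  static Status Invalid(std::string msg) { return {false, std::move(msg)}; }
  bool ok() const { return ok_flag; }
};

class Operation;

// Reduced stand-in for Expr: an edge that remembers the node producing it.
struct Expr {
  std::shared_ptr<const Operation> op;
};
using ExprPtr = std::shared_ptr<Expr>;

class Operation : public std::enable_shared_from_this<Operation> {
 public:
  virtual ~Operation() = default;
  virtual Status ToExpr(ExprPtr* out) const = 0;
  virtual std::vector<ExprPtr> input_args() const { return {}; }
};

// Hypothetical unary operation used only for illustration.
class Negate : public Operation {
 public:
  explicit Negate(ExprPtr arg) : arg_(std::move(arg)) {}

  Status ToExpr(ExprPtr* out) const override {
    if (arg_ == nullptr) {
      return Status::Invalid("Negate requires an input expression");
    }
    // shared_from_this() is only valid when *this is owned by a shared_ptr,
    // which is why operations are created with std::make_shared.
    *out = std::make_shared<Expr>(Expr{shared_from_this()});
    return Status::OK();
  }

  std::vector<ExprPtr> input_args() const override { return {arg_}; }

 private:
  ExprPtr arg_;
};

void Example() {
  auto input = std::make_shared<Expr>();  // an input with no producing node
  auto op = std::make_shared<Negate>(input);
  ExprPtr result;
  assert(op->ToExpr(&result).ok());
  assert(result->op == op);              // the edge points back at its node
  assert(op->input_args().size() == 1);  // and the node lists its inputs
}
```

Calling `ToExpr` on an operation that lives on the stack would throw `std::bad_weak_ptr` (C++17) or be undefined (earlier standards), which is consistent with the tests in this patch always constructing operations through `std::make_shared`.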
diff --git a/cpp/src/arrow/compute/operations/cast.cc b/cpp/src/arrow/compute/operations/cast.cc
new file mode 100644
index 0000000..ac2e662
--- /dev/null
+++ b/cpp/src/arrow/compute/operations/cast.cc
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/operations/cast.h"
+
+#include <memory>
+#include <utility>
+
+#include "arrow/compute/expression.h"
+#include "arrow/compute/logical_type.h"
+#include "arrow/status.h"
+
+namespace arrow {
+namespace compute {
+namespace ops {
+
+Cast::Cast(std::shared_ptr<Expr> value, std::shared_ptr<LogicalType> out_type)
+    : value_(std::move(value)), out_type_(std::move(out_type)) {}
+
+Status Cast::ToExpr(std::shared_ptr<Expr>* out) const {
+  // TODO(wesm): Add reusable type-checking rules
+  auto value_ty = type::any();
+  if (!value_ty->IsInstance(*value_)) {
+    return Status::Invalid("Cast only applies to value expressions");
+  }
+
+  // TODO(wesm): implement "shaped like" output type rule like Ibis
+  auto op = shared_from_this();
+  const auto& value_expr = static_cast<const ValueExpr&>(*value_);
+  if (value_expr.rank() == ValueRank::SCALAR) {
+    return GetScalarExpr(op, out_type_, out);
+  } else {
+    return GetArrayExpr(op, out_type_, out);
+  }
+}
+
+}  // namespace ops
+}  // namespace compute
+}  // namespace arrow
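
`Cast::ToExpr` above applies two rules: the input must be a value expression, and the output keeps the input's rank (scalar in, scalar out; array in, array out). A minimal standalone sketch of those two rules follows. The types are simplified stand-ins, `TableExpr` is a hypothetical non-value expression, and the logical type is reduced to a string, so this is an illustration of the rule rather than the patch's implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

enum class ValueRank { SCALAR, ARRAY };

struct Expr {
  virtual ~Expr() = default;
};

// Hypothetical non-value expression, used to show the rejection path.
struct TableExpr : Expr {};

struct ValueExpr : Expr {
  ValueExpr(ValueRank rank, std::string type)
      : rank(rank), type(std::move(type)) {}
  ValueRank rank;
  std::string type;  // stand-in for a LogicalType
};

// Returns nullptr when the input is not a value expression; otherwise returns
// a value expression of the requested type with the same rank as the input.
std::shared_ptr<ValueExpr> CastExpr(const Expr& input,
                                    const std::string& out_type) {
  const auto* value = dynamic_cast<const ValueExpr*>(&input);
  if (value == nullptr) {
    return nullptr;  // "Cast only applies to value expressions"
  }
  return std::make_shared<ValueExpr>(value->rank, out_type);
}

void Example() {
  ValueExpr int_array(ValueRank::ARRAY, "int32");
  auto casted = CastExpr(int_array, "float64");
  assert(casted != nullptr);
  assert(casted->rank == ValueRank::ARRAY);  // rank is preserved
  assert(casted->type == "float64");         // type is replaced

  TableExpr table;
  assert(CastExpr(table, "float64") == nullptr);  // not a value expression
}
```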
diff --git a/cpp/src/arrow/compute/operations/cast.h b/cpp/src/arrow/compute/operations/cast.h
new file mode 100644
index 0000000..0052ebb
--- /dev/null
+++ b/cpp/src/arrow/compute/operations/cast.h
@@ -0,0 +1,46 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+
+#include "arrow/compute/operation.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+namespace compute {
+
+class LogicalType;
+
+namespace ops {
+
+/// \brief A cast operation converts a value expression to the given output
+/// logical type, preserving its scalar or array rank
+class ARROW_EXPORT Cast : public Operation {
+ public:
+  Cast(std::shared_ptr<Expr> value, std::shared_ptr<LogicalType> out_type);
+  Status ToExpr(std::shared_ptr<Expr>* out) const override;
+
+ private:
+  std::shared_ptr<Expr> value_;
+  std::shared_ptr<LogicalType> out_type_;
+};
+
+}  // namespace ops
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/operations/literal.cc b/cpp/src/arrow/compute/operations/literal.cc
new file mode 100644
index 0000000..d4934f1
--- /dev/null
+++ b/cpp/src/arrow/compute/operations/literal.cc
@@ -0,0 +1,40 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/operations/literal.h"
+
+#include <memory>
+
+#include "arrow/compute/expression.h"
+#include "arrow/compute/logical_type.h"
+#include "arrow/scalar.h"
+
+namespace arrow {
+namespace compute {
+namespace ops {
+
+Literal::Literal(const std::shared_ptr<Scalar>& value) : value_(value) {}
+
+Status Literal::ToExpr(std::shared_ptr<Expr>* out) const {
+  std::shared_ptr<LogicalType> ty;
+  RETURN_NOT_OK(LogicalType::FromArrow(*value_->type, &ty));
+  return GetScalarExpr(shared_from_this(), ty, out);
+}
+
+}  // namespace ops
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/operations/literal.h b/cpp/src/arrow/compute/operations/literal.h
new file mode 100644
index 0000000..b596b33
--- /dev/null
+++ b/cpp/src/arrow/compute/operations/literal.h
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+
+#include "arrow/compute/operation.h"
+#include "arrow/util/visibility.h"
+
+namespace arrow {
+
+struct Scalar;
+
+namespace compute {
+namespace ops {
+
+/// \brief A literal operation creates an expression from a known constant
+/// scalar value
+class ARROW_EXPORT Literal : public Operation {
+ public:
+  explicit Literal(const std::shared_ptr<Scalar>& value);
+  Status ToExpr(std::shared_ptr<Expr>* out) const override;
+
+ private:
+  std::shared_ptr<Scalar> value_;
+};
+
+}  // namespace ops
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/operations/operations-test.cc b/cpp/src/arrow/compute/operations/operations-test.cc
new file mode 100644
index 0000000..1616e84
--- /dev/null
+++ b/cpp/src/arrow/compute/operations/operations-test.cc
@@ -0,0 +1,64 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <cstdint>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include <gtest/gtest.h>
+
+#include "arrow/scalar.h"
+#include "arrow/status.h"
+#include "arrow/table.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/type.h"
+
+#include "arrow/compute/expression.h"
+#include "arrow/compute/logical_type.h"
+#include "arrow/compute/operation.h"
+#include "arrow/compute/operations/cast.h"
+#include "arrow/compute/operations/literal.h"
+
+namespace arrow {
+namespace compute {
+
+class DummyOp : public Operation {
+ public:
+  Status ToExpr(std::shared_ptr<Expr>* out) const override {
+    return Status::NotImplemented("NYI");
+  }
+};
+
+TEST(Literal, Basics) {
+  auto val = std::make_shared<DoubleScalar>(3.14159);
+  std::shared_ptr<Expr> expr;
+  ASSERT_OK(std::make_shared<ops::Literal>(val)->ToExpr(&expr));
+  ASSERT_TRUE(InheritsFrom<scalar::Float64>(*expr));
+}
+
+TEST(Cast, Basics) {
+  auto dummy_op = std::make_shared<DummyOp>();
+  auto expr = array::int32(dummy_op);
+  std::shared_ptr<Expr> out_expr;
+  ASSERT_OK(std::make_shared<ops::Cast>(expr, type::float64())->ToExpr(&out_expr));
+  ASSERT_TRUE(InheritsFrom<array::Float64>(*out_expr));
+}
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/compute/type_fwd.h b/cpp/src/arrow/compute/type_fwd.h
new file mode 100644
index 0000000..48d45ec
--- /dev/null
+++ b/cpp/src/arrow/compute/type_fwd.h
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include <memory>
+
+#include "arrow/type_fwd.h"
+
+namespace arrow {
+namespace compute {
+
+class Expr;
+class LogicalType;
+class Operation;
+
+using ArrowTypePtr = std::shared_ptr<::arrow::DataType>;
+using ExprPtr = std::shared_ptr<Expr>;
+using ConstOpPtr = std::shared_ptr<const Operation>;
+using OpPtr = std::shared_ptr<Operation>;
+using LogicalTypePtr = std::shared_ptr<LogicalType>;
+
+}  // namespace compute
+}  // namespace arrow
diff --git a/cpp/src/arrow/memory_pool.h b/cpp/src/arrow/memory_pool.h
index 8499b6f..60643c3 100644
--- a/cpp/src/arrow/memory_pool.h
+++ b/cpp/src/arrow/memory_pool.h
@@ -22,6 +22,7 @@
 #include <cstdint>
 #include <memory>
 
+#include "arrow/status.h"
 #include "arrow/util/visibility.h"
 
 namespace arrow {
@@ -55,8 +56,6 @@ class MemoryPoolStats {
 
 }  // namespace internal
 
-class Status;
-
 /// Base class for memory allocation.
 ///
 /// Besides tracking the number of allocated bytes, the allocator also should