Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/10/29 14:30:44 UTC

[GitHub] [tvm] Lunderberg commented on a change in pull request #9313: Adds SEScope (Storage/Execution Scope) for use as new unit of planning in 'device' planning.

Lunderberg commented on a change in pull request #9313:
URL: https://github.com/apache/tvm/pull/9313#discussion_r739258382



##########
File path: include/tvm/target/se_scope.h
##########
@@ -0,0 +1,330 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file tvm/target/se_scope.h
+ * \brief A compile time representation for a Storage or Execution Scope.
+ */
+
+#ifndef TVM_TARGET_SE_SCOPE_H_
+#define TVM_TARGET_SE_SCOPE_H_
+
+#include <tvm/ir/transform.h>
+#include <tvm/target/target.h>
+
+#include <string>
+#include <unordered_map>
+#include <utility>
+
+namespace tvm {
+
+/*!
+ * Abstract label for an area of memory.
+ *
+ * Currently uninterpreted and arbitrary. Likely to be replaced by a structured representation
+ * of a memory pool in the future. Please try to use this alias instead of String to aid future
+ * code migration.
+ */
+using MemoryScope = String;
+
+/*!
+ * \brief Describes at compile time where data is to be stored down to the device and memory
+ * scope level, or where execution is to take place, down to the device level. It is a quadruple of:
+ * - A \p device_type (\p DLDeviceType).
+ * - An uninterpreted \p virtual_device_id (\p int) distinguishing the intended device from all
+ *   other devices (either of the same \p device_type, or across all available devices in the
+ *   system). The virtual device id need not correspond to any physical device id, see
+ *   "Virtual Devices" below.
+ * - A \p target (\p Target) describing how to compile code for the intended device.
+ * - A \p memory_scope (MemoryScope, which is currently just \p String) describing which memory
+ *   area is to be used to hold data. The area should be reachable from the device but need not be
+ *   'on' the device, see "Memory Scopes and Devices" below.
+ *
+ * All of these fields may be 'unconstrained' (ie null, -1 or ""), signaling that device planning
+ * is free to choose a value consistent with the whole program. However if a \p target is given
+ * then the \p device_type must equal \p target->kind->device_type.
+ *
+ * Note that currently we assume if a function returns its result on a particular device
+ * then the function body is also executed on that device. See the overview comment in
+ * src/relay/transforms/device_planner.cc for more details.
+ *
+ * By 'data' we include both tensors and additional supporting datastructures such as shapes,
+ * Relay AST items, Relay tuples, and Relay references. Typically non-tensor data must reside
+ * on a 'CPU'-like device with good support for scalars.
+ *
+ * By 'execution' we include both (fused) primitive operators, and all the Relay expressions
+ * surrounding them which coordinates data and control flow. Again, typically non-primitive
+ * operators must be executed on a 'CPU'-like device with good support for control flow.
+ *
+ * Targets vs Devices
+ * ------------------
+ * Generally \p Targets (a compile-time only datastructure) describe compiler options for a specific
+ * microarchitecture and toolchain, while \p Devices (a runtime datastructure also available at
+ * compile time) describe a physical device on the target system. Obviously the target must agree
+ * with the device's microarchitecture, but we otherwise don't impose any constraints between them:
+ *  - It's ok to use different \p Targets for the same \p Device, eg to squeeze some extra perf
+ *    out of a particular primitive.
+ *  - It's ok to use the same \p Target for multiple \p Devices, eg if we have multiple CPUs.
+ *
+ * Traditionally TVM assumes at most one \p Target per \p DLDeviceType. We are moving away from that
+ * assumption.
+ *
+ * Virtual vs Physical Devices

Review comment:
       Can we also document what problem the virtual device ids are introduced to solve?  From the description here and the rest of the RFC, it looks like the goal is to distinguish between multiple devices that have the same device type, but different `Target` parameters, but that would be good to include here as well.

##########
File path: include/tvm/target/se_scope.h
##########
@@ -0,0 +1,330 @@
+/*!
+ * \file tvm/target/se_scope.h
+ * \brief A compile time representation for a Storage or Execution Scope.
+ */
+
+#ifndef TVM_TARGET_SE_SCOPE_H_
+#define TVM_TARGET_SE_SCOPE_H_
+
+#include <tvm/ir/transform.h>
+#include <tvm/target/target.h>
+
+#include <string>
+#include <unordered_map>
+#include <utility>
+
+namespace tvm {
+
+/*!
+ * Abstract label for an area of memory.
+ *
+ * Currently uninterpreted and arbitrary. Likely to be replaced by a structured representation
+ * of a memory pool in the future. Please try to use this alias instead of String to aid future
+ * code migration.
+ */
+using MemoryScope = String;
+
+/*!
+ * \brief Describes at compile time where data is to be stored down to the device and memory
+ * scope level, or where execution is to take place, down to the device level. It is a quadruple of:
+ * - A \p device_type (\p DLDeviceType).
+ * - An uninterpreted \p virtual_device_id (\p int) distinguishing the intended device from all
+ *   other devices (either of the same \p device_type, or across all available devices in the
+ *   system). The virtual device id need not correspond to any physical device id, see
+ *   "Virtual Devices" below.
+ * - A \p target (\p Target) describing how to compile code for the intended device.
+ * - A \p memory_scope (MemoryScope, which is currently just \p String) describing which memory
+ *   area is to be used to hold data. The area should be reachable from the device but need not be
+ *   'on' the device, see "Memory Scopes and Devices" below.
+ *
+ * All of these fields may be 'unconstrained' (ie null, -1 or ""), signaling that device planning
+ * is free to choose a value consistent with the whole program. However if a \p target is given
+ * then the \p device_type must equal \p target->kind->device_type.
+ *
+ * Note that currently we assume if a function returns its result on a particular device
+ * then the function body is also executed on that device. See the overview comment in
+ * src/relay/transforms/device_planner.cc for more details.
+ *
+ * By 'data' we include both tensors and additional supporting datastructures such as shapes,
+ * Relay AST items, Relay tuples, and Relay references. Typically non-tensor data must reside
+ * on a 'CPU'-like device with good support for scalars.
+ *
+ * By 'execution' we include both (fused) primitive operators, and all the Relay expressions
+ * surrounding them which coordinates data and control flow. Again, typically non-primitive
+ * operators must be executed on a 'CPU'-like device with good support for control flow.
+ *
+ * Targets vs Devices
+ * ------------------
+ * Generally \p Targets (a compile-time only datastructure) describe compiler options for a specific
+ * microarchitecture and toolchain, while \p Devices (a runtime datastructure also available at
+ * compile time) describe a physical device on the target system. Obviously the target must agree
+ * with the device's microarchitecture, but we otherwise don't impose any constraints between them:
+ *  - It's ok to use different \p Targets for the same \p Device, eg to squeeze some extra perf
+ *    out of a particular primitive.
+ *  - It's ok to use the same \p Target for multiple \p Devices, eg if we have multiple CPUs.
+ *
+ * Traditionally TVM assumes at most one \p Target per \p DLDeviceType. We are moving away from that
+ * assumption.
+ *
+ * Virtual vs Physical Devices
+ * ---------------------------
+ * The \p virtual_device_id may be left as 0 if not significant. It is up to downstream
+ * compilation passes and/or the runtime to map a \p virtual_device_id to an actual physical
+ * device id if required. For example, some runtimes may support passing in an array of actual
+ * `device` specifications, and the \p virtual_device_id is simply an index known at compile time
+ * into that array.
+ *
+ * Memory Scopes and Devices
+ * -------------------------
+ * Multi-device systems can have complex memory hierarchies. For example
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCPU, 1, "llvm", "global")
+ * \endcode
+ * could denote:
+ * - The same memory area accessible from two separate CPUs without any CPU affinity;
+ * - Distinct memory areas in a NUMA architecture for which cross-device access is handled
+ *   by the memory system;
+ * - Outright distinct memory areas, where one device cannot directly address the memory of
+ *   another.
+ *
+ * Similarly:
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCUDA, 0, "cuda", "host")
+ * \endcode
+ * could denote the same memory area, but with very different access costs.
+ *
+ * We don't currently try to build any of this system-level understanding into \p SEScope. Device
+ * planning will simply insert "device_copy" operators wherever \p SEScopes are not exactly
+ * pointwise equal, and we leave it to downstream compilation to elide unnecessary copies. We
+ * may revisit this in the future.
+ *
+ * Joining and Defaulting
+ * ----------------------
+ * It is possible to 'join' two \p SEScopes to yield the most constrained \p SEScope which agrees
+ * with both join arguments. Eg:
+ * \code
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "global")
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "local"))
+ *   => null (no join possible)
+ * \endcode
+ *
+ * Related to 'join' is 'default', which only takes constrained fields from the rhs when the
+ * lhs is unconstrained:
+ * \code
+ * Default((kDLCPU, -1, "llvm", "local"), (kDLCPU, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "local")
+ * \endcode
+ *
+ * These operations are needed during device planning.
+ *
+ */
+class SEScopeNode : public Object {
+ public:
+  /*!
+   * \brief The \p DLDeviceType of the device. If \p target is known then this will be equal to
+   * \p target->kind->device_type. If \p target is null then the target is to be determined by
+   * a later pass.
+   *
+   * This is needed to support the legacy "on_device" and "device_copy" calls which only allow
+   * a \p DLDeviceType (as an integer) to be given.
+   *
+   * kInvalidDeviceType denotes unconstrained.
+   */
+  DLDeviceType device_type() const { return device_type_; }
+
+  /*!
+   * \brief The 'virtual' device identifier for the device. This must be resolved to a physical
+   * device identifier either during compilation or at runtime.
+   *
+   * -1 denotes unconstrained. May be 0 if not significant.
+   */
+  int virtual_device_id() const { return virtual_device_id_; }
+
+  /*!
+   * \brief The \p Target describing how to compile for the device.
+   *
+   * Null denotes unconstrained (though if device_type is known then only a target of that

Review comment:
       Can we rephrase the parenthetical to "(though if `device_type` is known, then only a target runnable on that device type is allowed)"?  This would cover cases where multiple targets share the same device type (e.g. targets `"llvm"` and `"c"` both run on `kDLCPU`, and targets `"cuda"` and `"nvptx"` both run on `kDLCUDA`).
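
To illustrate the suggested wording, the check can be phrased as "the target's kind runs on the scope's device type" rather than "the target is of that type". The sketch below is a self-contained stand-in, not TVM code: `DeviceTypeForTargetKind` and its table are hypothetical placeholders for `target->kind->device_type`, the enum values are assumed to mirror DLPack, and only the four target kinds named above are listed.

```cpp
#include <string>
#include <unordered_map>

// Stand-ins for DLDeviceType values (assumed to mirror DLPack's enum).
enum DeviceType { kInvalidDeviceType = -1, kDLCPU = 1, kDLCUDA = 2 };

// Hypothetical lookup standing in for target->kind->device_type.
// Returns kInvalidDeviceType for a target kind that is not in the table.
inline DeviceType DeviceTypeForTargetKind(const std::string& kind) {
  static const std::unordered_map<std::string, DeviceType> table = {
      {"llvm", kDLCPU}, {"c", kDLCPU}, {"cuda", kDLCUDA}, {"nvptx", kDLCUDA}};
  auto it = table.find(kind);
  return it == table.end() ? kInvalidDeviceType : it->second;
}

// True if a target of the given kind may be used in a scope whose device type
// is `device_type`. An unconstrained device type accepts any known kind.
inline bool TargetAllowedOn(DeviceType device_type, const std::string& kind) {
  DeviceType required = DeviceTypeForTargetKind(kind);
  if (required == kInvalidDeviceType) return false;  // unknown target kind
  return device_type == kInvalidDeviceType || device_type == required;
}
```

Under these assumptions both `"llvm"` and `"c"` are accepted for `kDLCPU`, and both `"cuda"` and `"nvptx"` for `kDLCUDA`, while `"cuda"` is rejected for `kDLCPU`.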

##########
File path: include/tvm/target/se_scope.h
##########
@@ -0,0 +1,330 @@
+/*!
+ * \file tvm/target/se_scope.h
+ * \brief A compile time representation for a Storage or Execution Scope.
+ */
+
+#ifndef TVM_TARGET_SE_SCOPE_H_
+#define TVM_TARGET_SE_SCOPE_H_
+
+#include <tvm/ir/transform.h>
+#include <tvm/target/target.h>
+
+#include <string>
+#include <unordered_map>
+#include <utility>
+
+namespace tvm {
+
+/*!
+ * Abstract label for an area of memory.
+ *
+ * Currently uninterpreted and arbitrary. Likely to be replaced by a structured representation
+ * of a memory pool in the future. Please try to use this alias instead of String to aid future
+ * code migration.
+ */
+using MemoryScope = String;
+
+/*!
+ * \brief Describes at compile time where data is to be stored down to the device and memory
+ * scope level, or where execution is to take place, down to the device level. It is a quadruple of:
+ * - A \p device_type (\p DLDeviceType).
+ * - An uninterpreted \p virtual_device_id (\p int) distinguishing the intended device from all
+ *   other devices (either of the same \p device_type, or across all available devices in the
+ *   system). The virtual device id need not correspond to any physical device id, see
+ *   "Virtual Devices" below.
+ * - A \p target (\p Target) describing how to compile code for the intended device.
+ * - A \p memory_scope (MemoryScope, which is currently just \p String) describing which memory
+ *   area is to be used to hold data. The area should be reachable from the device but need not be
+ *   'on' the device, see "Memory Scopes and Devices" below.
+ *
+ * All of these fields may be 'unconstrained' (ie null, -1 or ""), signaling that device planning
+ * is free to choose a value consistent with the whole program. However if a \p target is given
+ * then the \p device_type must equal \p target->kind->device_type.
+ *
+ * Note that currently we assume if a function returns its result on a particular device
+ * then the function body is also executed on that device. See the overview comment in
+ * src/relay/transforms/device_planner.cc for more details.
+ *
+ * By 'data' we include both tensors and additional supporting datastructures such as shapes,
+ * Relay AST items, Relay tuples, and Relay references. Typically non-tensor data must reside
+ * on a 'CPU'-like device with good support for scalars.
+ *
+ * By 'execution' we include both (fused) primitive operators, and all the Relay expressions
+ * surrounding them which coordinates data and control flow. Again, typically non-primitive
+ * operators must be executed on a 'CPU'-like device with good support for control flow.
+ *
+ * Targets vs Devices
+ * ------------------
+ * Generally \p Targets (a compile-time only datastructure) describe compiler options for a specific
+ * microarchitecture and toolchain, while \p Devices (a runtime datastructure also available at
+ * compile time) describe a physical device on the target system. Obviously the target must agree
+ * with the device's microarchitecture, but we otherwise don't impose any constraints between them:
+ *  - It's ok to use different \p Targets for the same \p Device, eg to squeeze some extra perf
+ *    out of a particular primitive.
+ *  - It's ok to use the same \p Target for multiple \p Devices, eg if we have multiple CPUs.
+ *
+ * Traditionally TVM assumes at most one \p Target per \p DLDeviceType. We are moving away from that
+ * assumption.
+ *
+ * Virtual vs Physical Devices
+ * ---------------------------
+ * The \p virtual_device_id may be left as 0 if not significant. It is up to downstream
+ * compilation passes and/or the runtime to map a \p virtual_device_id to an actual physical
+ * device id if required. For example, some runtimes may support passing in an array of actual
+ * `device` specifications, and the \p virtual_device_id is simply an index known at compile time
+ * into that array.
+ *
+ * Memory Scopes and Devices
+ * -------------------------
+ * Multi-device systems can have complex memory hierarchies. For example
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCPU, 1, "llvm", "global")
+ * \endcode
+ * could denote:
+ * - The same memory area accessible from two separate CPUs without any CPU affinity;
+ * - Distinct memory areas in a NUMA architecture for which cross-device access is handled
+ *   by the memory system;
+ * - Outright distinct memory areas, where one device cannot directly address the memory of
+ *   another.
+ *
+ * Similarly:
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCUDA, 0, "cuda", "host")
+ * \endcode
+ * could denote the same memory area, but with very different access costs.
+ *
+ * We don't currently try to build any of this system-level understanding into \p SEScope. Device
+ * planning will simply insert "device_copy" operators wherever \p SEScopes are not exactly
+ * pointwise equal, and we leave it to downstream compilation to elide unnecessary copies. We
+ * may revisit this in the future.
+ *
+ * Joining and Defaulting
+ * ----------------------
+ * It is possible to 'join' two \p SEScopes to yield the most constrained \p SEScope which agrees
+ * with both join arguments. Eg:
+ * \code
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "global"))

Review comment:
       How does `Join` interact with the "not significant" value of `virtual_device_id`?  If an `SEScope` with `virtual_device_id==1` is joined with one that has `virtual_device_id==0`, is the result `null`, or an `SEScope` with `virtual_device_id==1`?
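
One literal reading of the header comment gives a definite answer. The sketch below is a model under stated assumptions, not the implementation in this PR: `Scope` is a simplified stand-in for `SEScope`, targets are plain strings with `""` playing the role of a null `Target`, and only -1, the invalid device type and the empty string are treated as unconstrained. Under that reading 0 is an ordinary constrained value.

```cpp
#include <string>

// Simplified stand-in for SEScope (not the TVM class).
struct Scope {
  int device_type;           // kInvalid denotes unconstrained
  int virtual_device_id;     // -1 denotes unconstrained
  std::string target;        // "" stands in for a null Target
  std::string memory_scope;  // "" denotes unconstrained
};

constexpr int kInvalid = -1;  // stand-in for kInvalidDeviceType
constexpr int kCPU = 1;       // stand-in for kDLCPU

// Joins one field. Fails only when both sides are constrained and differ.
template <typename T>
bool JoinField(const T& lhs, const T& rhs, const T& unconstrained, T* out) {
  if (lhs == unconstrained) {
    *out = rhs;
  } else if (rhs == unconstrained || lhs == rhs) {
    *out = lhs;
  } else {
    return false;
  }
  return true;
}

// Writes the join to *out and returns true, or returns false if no join
// exists (in which case *out may be partially written and should be ignored).
inline bool Join(const Scope& a, const Scope& b, Scope* out) {
  return JoinField(a.device_type, b.device_type, kInvalid, &out->device_type) &&
         JoinField(a.virtual_device_id, b.virtual_device_id, -1,
                   &out->virtual_device_id) &&
         JoinField(a.target, b.target, std::string(), &out->target) &&
         JoinField(a.memory_scope, b.memory_scope, std::string(),
                   &out->memory_scope);
}
```

With these definitions, joining `(kCPU, -1, "llvm", "")` with `(kInvalid, 3, "", "global")` succeeds and yields `(kCPU, 3, "llvm", "global")`, matching the first example in the header, while two scopes that differ only in `virtual_device_id` 1 versus 0 have no join. Whether the PR intends 0 to behave this way is exactly the open question.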

##########
File path: include/tvm/target/se_scope.h
##########
@@ -0,0 +1,330 @@
+/*!
+ * \file tvm/target/se_scope.h
+ * \brief A compile time representation for a Storage or Execution Scope.
+ */
+
+#ifndef TVM_TARGET_SE_SCOPE_H_
+#define TVM_TARGET_SE_SCOPE_H_
+
+#include <tvm/ir/transform.h>
+#include <tvm/target/target.h>
+
+#include <string>
+#include <unordered_map>
+#include <utility>
+
+namespace tvm {
+
+/*!
+ * Abstract label for an area of memory.
+ *
+ * Currently uninterpreted and arbitrary. Likely to be replaced by a structured representation
+ * of a memory pool in the future. Please try to use this alias instead of String to aid future
+ * code migration.
+ */
+using MemoryScope = String;
+
+/*!
+ * \brief Describes at compile time where data is to be stored down to the device and memory
+ * scope level, or where execution is to take place, down to the device level. It is a quadruple of:
+ * - A \p device_type (\p DLDeviceType).
+ * - An uninterpreted \p virtual_device_id (\p int) distinguishing the intended device from all
+ *   other devices (either of the same \p device_type, or across all available devices in the
+ *   system). The virtual device id need not correspond to any physical device id, see
+ *   "Virtual Devices" below.
+ * - A \p target (\p Target) describing how to compile code for the intended device.
+ * - A \p memory_scope (MemoryScope, which is currently just \p String) describing which memory
+ *   area is to be used to hold data. The area should be reachable from the device but need not be
+ *   'on' the device, see "Memory Scopes and Devices" below.
+ *
+ * All of these fields may be 'unconstrained' (ie null, -1 or ""), signaling that device planning
+ * is free to choose a value consistent with the whole program. However if a \p target is given
+ * then the \p device_type must equal \p target->kind->device_type.
+ *
+ * Note that currently we assume if a function returns its result on a particular device
+ * then the function body is also executed on that device. See the overview comment in
+ * src/relay/transforms/device_planner.cc for more details.
+ *
+ * By 'data' we include both tensors and additional supporting datastructures such as shapes,
+ * Relay AST items, Relay tuples, and Relay references. Typically non-tensor data must reside
+ * on a 'CPU'-like device with good support for scalars.
+ *
+ * By 'execution' we include both (fused) primitive operators, and all the Relay expressions
+ * surrounding them which coordinates data and control flow. Again, typically non-primitive
+ * operators must be executed on a 'CPU'-like device with good support for control flow.
+ *
+ * Targets vs Devices
+ * ------------------
+ * Generally \p Targets (a compile-time only datastructure) describe compiler options for a specific
+ * microarchitecture and toolchain, while \p Devices (a runtime datastructure also available at
+ * compile time) describe a physical device on the target system. Obviously the target must agree
+ * with the device's microarchitecture, but we otherwise don't impose any constraints between them:
+ *  - It's ok to use different \p Targets for the same \p Device, eg to squeeze some extra perf
+ *    out of a particular primitive.
+ *  - It's ok to use the same \p Target for multiple \p Devices, eg if we have multiple CPUs.
+ *
+ * Traditionally TVM assumes at most one \p Target per \p DLDeviceType. We are moving away from that
+ * assumption.
+ *
+ * Virtual vs Physical Devices
+ * ---------------------------
+ * The \p virtual_device_id may be left as 0 if not significant. It is up to downstream
+ * compilation passes and/or the runtime to map a \p virtual_device_id to an actual physical
+ * device id if required. For example, some runtimes may support passing in an array of actual
+ * `device` specifications, and the \p virtual_device_id is simply an index known at compile time
+ * into that array.
+ *
+ * Memory Scopes and Devices
+ * -------------------------
+ * Multi-device systems can have complex memory hierarchies. For example
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCPU, 1, "llvm", "global")
+ * \endcode
+ * could denote:
+ * - The same memory area accessible from two separate CPUs without any CPU affinity;
+ * - Distinct memory areas in a NUMA architecture for which cross-device access is handled
+ *   by the memory system;
+ * - Outright distinct memory areas, where one device cannot directly address the memory of
+ *   another.
+ *
+ * Similarly:
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCUDA, 0, "cuda", "host")
+ * \endcode
+ * could denote the same memory area, but with very different access costs.
+ *
+ * We don't currently try to build any of this system-level understanding into \p SEScope. Device
+ * planning will simply insert "device_copy" operators wherever \p SEScopes are not exactly
+ * pointwise equal, and we leave it to downstream compilation to elide unnecessary copies. We
+ * may revisit this in the future.
+ *
+ * Joining and Defaulting
+ * ----------------------
+ * It is possible to 'join' two \p SEScopes to yield the most constrained \p SEScope which agrees
+ * with both join arguments. Eg:
+ * \code
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "global")
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "local"))
+ *   => null (no join possible)
+ * \endcode
+ *
+ * Related to 'join' is 'default', which only takes constrained fields from the rhs when the
+ * lhs is unconstrained:
+ * \code
+ * Default((kDLCPU, -1, "llvm", "local"), (kDLCPU, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "local")
+ * \endcode
+ *
+ * These operations are needed during device planning.
+ *
+ */
+class SEScopeNode : public Object {
+ public:
+  /*!
+   * \brief The \p DLDeviceType of the device. If \p target is known then this will be equal to
+   * \p target->kind->device_type. If \p target is null then the target is to be determined by
+   * a later pass.
+   *
+   * This is needed to support the legacy "on_device" and "device_copy" calls which only allow
+   * a \p DLDeviceType (as an integer) to be given.
+   *
+   * kInvalidDeviceType denotes unconstrained.
+   */
+  DLDeviceType device_type() const { return device_type_; }
+
+  /*!
+   * \brief The 'virtual' device identifier for the device. This must be resolved to a physical
+   * device identifier either during compilation or at runtime.
+   *
+   * -1 denotes unconstrained. May be 0 if not significant.
+   */
+  int virtual_device_id() const { return virtual_device_id_; }
+
+  /*!
+   * \brief The \p Target describing how to compile for the device.
+   *
+   * Null denotes unconstrained (though if device_type is known then only a target of that
+   * type is allowed).
+   */
+  const Target& target() const { return target_; }
+
+  /*!
+   * \brief The scope of memory within the device.
+   *
+   * Empty denotes unconstrained.
+   */
+  const MemoryScope& memory_scope() const { return memory_scope_; }
+
+  /*!
+   * \brief Returns true if scope is fully unconstrained, ie no target/device type, virtual device
+   * id or memory scope is specified.
+   */
+  bool is_fully_unconstrained() const {
+    return !target_.defined() && device_type_ == kInvalidDeviceType && virtual_device_id_ == -1 &&
+           memory_scope_.empty();
+  }
+
+  /*!
+   * \brief Returns true if scope is fully constrained, ie target, virtual device id and
+   * memory scope are all specified.
+   */
+  bool is_fully_constrained() const {
+    return target_.defined() && virtual_device_id_ != -1 && !memory_scope_.empty();
+  }
+
+  Device ToDevice() const {
+    ICHECK(device_type_ != kInvalidDeviceType);
+    ICHECK(virtual_device_id_ != -1);
+    Device device;
+    device.device_type = device_type_;
+    device.device_id = virtual_device_id_;

Review comment:
       Since zero is a valid `device_id`, but has special meaning as a "not significant" `virtual_device_id`, how would one specify that the device id is significant, and should be assigned to zero?
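
The ambiguity shows up in a stripped-down sketch. `Scope` and `Device` here are simplified stand-ins, not the TVM classes, and `ToDevice` only mirrors the two checks and two assignments visible in the hunk above, with exceptions standing in for `ICHECK`.

```cpp
#include <stdexcept>

// Simplified stand-in for the runtime Device struct.
struct Device {
  int device_type;
  int device_id;
};

// Simplified stand-in for SEScope, reduced to the two fields ToDevice reads.
struct Scope {
  int device_type;        // -1 stands in for kInvalidDeviceType
  int virtual_device_id;  // -1 denotes unconstrained
};

// Copies the virtual device id straight into device_id, as in the hunk.
inline Device ToDevice(const Scope& scope) {
  if (scope.device_type == -1) {
    throw std::logic_error("device type is unconstrained");
  }
  if (scope.virtual_device_id == -1) {
    throw std::logic_error("virtual device id is unconstrained");
  }
  return Device{scope.device_type, scope.virtual_device_id};
}
```

A scope whose id was left at 0 as "not significant" and one that deliberately selects device 0 are the same value here, so both convert to a `Device` with `device_id == 0` and nothing downstream can tell them apart.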

##########
File path: include/tvm/target/se_scope.h
##########
@@ -0,0 +1,330 @@
+/*!
+ * \file tvm/target/se_scope.h
+ * \brief A compile time representation for a Storage or Execution Scope.
+ */
+
+#ifndef TVM_TARGET_SE_SCOPE_H_
+#define TVM_TARGET_SE_SCOPE_H_
+
+#include <tvm/ir/transform.h>
+#include <tvm/target/target.h>
+
+#include <string>
+#include <unordered_map>
+#include <utility>
+
+namespace tvm {
+
+/*!
+ * Abstract label for an area of memory.
+ *
+ * Currently uninterpreted and arbitrary. Likely to be replaced by a structured representation
+ * of a memory pool in the future. Please try to use this alias instead of String to aid future
+ * code migration.
+ */
+using MemoryScope = String;
+
+/*!
+ * \brief Describes at compile time where data is to be stored down to the device and memory
+ * scope level, or where execution is to take place, down to the device level. It is a quadruple of:
+ * - A \p device_type (\p DLDeviceType).
+ * - An uninterpreted \p virtual_device_id (\p int) distinguishing the intended device from all
+ *   other devices (either of the same \p device_type, or across all available devices in the
+ *   system). The virtual device id need not correspond to any physical device id, see
+ *   "Virtual Devices" below.
+ * - A \p target (\p Target) describing how to compile code for the intended device.
+ * - A \p memory_scope (MemoryScope, which is currently just \p String) describing which memory
+ *   area is to be used to hold data. The area should be reachable from the device but need not be
+ *   'on' the device, see "Memory Scopes and Devices" below.
+ *
+ * All of these fields may be 'unconstrained' (ie null, -1 or ""), signaling that device planning
+ * is free to choose a value consistent with the whole program. However if a \p target is given
+ * then the \p device_type must equal \p target->kind->device_type.
+ *
+ * Note that currently we assume if a function returns its result on a particular device
+ * then the function body is also executed on that device. See the overview comment in
+ * src/relay/transforms/device_planner.cc for more details.
+ *
+ * By 'data' we include both tensors and additional supporting datastructures such as shapes,
+ * Relay AST items, Relay tuples, and Relay references. Typically non-tensor data must reside
+ * on a 'CPU'-like device with good support for scalars.
+ *
+ * By 'execution' we include both (fused) primitive operators, and all the Relay expressions
+ * surrounding them which coordinates data and control flow. Again, typically non-primitive
+ * operators must be executed on a 'CPU'-like device with good support for control flow.
+ *
+ * Targets vs Devices
+ * ------------------
+ * Generally \p Targets (a compile-time only datastructure) describe compiler options for a specific
+ * microarchitecture and toolchain, while \p Devices (a runtime datastructure also available at
+ * compile time) describe a physical device on the target system. Obviously the target must agree
+ * with the device's microarchitecture, but we otherwise don't impose any constraints between them:
+ *  - It's ok to use different \p Targets for the same \p Device, eg to squeeze some extra perf
+ *    out of a particular primitive.
+ *  - It's ok to use the same \p Target for multiple \p Devices, eg if we have multiple CPUs.
+ *
+ * Traditionally TVM assumes at most one \p Target per \p DLDeviceType. We are moving away from that
+ * assumption.
+ *
+ * Virtual vs Physical Devices
+ * ---------------------------
+ * The \p virtual_device_id may be left as 0 if not significant. It is up to downstream
+ * compilation passes and/or the runtime to map a \p virtual_device_id to an actual physical
+ * device id if required. For example, some runtimes may support passing in an array of actual
+ * `device` specifications, and the \p virtual_device_id is simply an index known at compile time
+ * into that array.
+ *
+ * Memory Scopes and Devices
+ * -------------------------
+ * Multi-device systems can have complex memory hierarchies. For example
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCPU, 1, "llvm", "global")
+ * \endcode
+ * could denote:
+ * - The same memory area accessible from two separate CPUs without any CPU affinity;
+ * - Distinct memory areas in a NUMA architecture for which cross-device access is handled
+ *   by the memory system;
+ * - Outright distinct memory areas, where one device cannot directly address the memory of
+ *   another.
+ *
+ * Similarly:
+ * \code
+ * (kDLCPU, 0, "llvm", "global")
+ * \endcode
+ * and
+ * \code
+ * (kDLCUDA, 0, "cuda", "host")
+ * \endcode
+ * could denote the same memory area, but with very different access costs.
+ *
+ * We don't currently try to build any of this system-level understanding into \p SEScope. Device
+ * planning will simply insert "device_copy" operators wherever \p SEScopes are not exactly
+ * pointwise equal, and we leave it to downstream compilation to elide unnecessary copies. We
+ * may revisit this in the future.
+ *
+ * Joining and Defaulting
+ * ----------------------
+ * It is possible to 'join' two \p SEScopes to yield the most constrained \p SEScope which agrees
+ * with both join arguments. Eg:
+ * \code
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "global")
+ * Join((kDLCPU, -1, "llvm", ""), (kInvalidDeviceType, 3, null, "local"))
+ *   => null (no join possible)
+ * \endcode
+ *
+ * Related to 'join' is 'default', which only takes constrained fields from the rhs when the
+ * lhs is unconstrained:
+ * \code
+ * Default((kDLCPU, -1, "llvm", "local"), (kDLCPU, 3, null, "global"))
+ *   => (kDLCPU, 3, "llvm", "local")
+ * \endcode
+ *
+ * These operations are needed during device planning.
+ *
+ */
+class SEScopeNode : public Object {
+ public:
+  /*!
+   * \brief The \p DLDeviceType of the device. If \p target is known then this will be equal to
+   * \p target->kind->device_type. If \p target is null then the target is to be determined by
+   * a later pass.
+   *
+   * This is needed to support the legacy "on_device" and "device_copy" calls which only allow
+   * a \p DLDeviceType (as an integer) to be given.
+   *
+   * kInvalidDeviceType denotes unconstrained.
+   */
+  DLDeviceType device_type() const { return device_type_; }
+
+  /*!
+   * \brief The 'virtual' device identifier for the device. This must be resolved to a physical
+   * device identifier either during compilation or at runtime.
+   *
+   * -1 denotes unconstrained. May be 0 if not significant.

Review comment:
       What is the semantic difference between "unconstrained" (`virtual_device_id == -1`) and "not significant" (`virtual_device_id == 0`)?  From the description, I read "unconstrained" as a compile-time concept: the compiler is free to assign the memory/compilation to any virtual device during memory planning. I read "not significant" as a runtime concept: the executor/vm is free to load data onto, and run the corresponding code on, any device of that type.  Is this an accurate understanding?
   
   As a follow-up, is the compiler required to ensure that the code can run on all listed targets of a given device type before it may use "not significant"?  (e.g. If two targets are specified as `vulkan -supports_float16=0` and `vulkan -supports_float16=1`, should the compiler verify that the allocated memory is not of `float16` type before declaring the virtual device id "not significant"?)
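
To make the question concrete, here is a minimal sketch of how I would expect the per-field 'join' described in the header comment to behave. It is not the PR's implementation: `Scope`, `JoinField`, and `kUnconstrainedInt` are hypothetical stand-ins, the target is modelled as a plain string, and I am assuming that `0` is treated as an ordinary constrained value while only `-1` is a wildcard.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for SEScope, for illustration only (not the PR's class).
constexpr int kUnconstrainedInt = -1;  // stands in for kInvalidDeviceType and for id -1

struct Scope {
  int device_type = kUnconstrainedInt;
  int virtual_device_id = kUnconstrainedInt;
  std::string target;        // "" stands in for a null Target
  std::string memory_scope;  // "" means unconstrained
};

// Joins one field: an unconstrained side defers to the other side, and two
// constrained sides must agree exactly. Returns false if no join exists.
template <typename T>
bool JoinField(const T& lhs, const T& rhs, const T& unconstrained, T* out) {
  if (lhs == unconstrained) {
    *out = rhs;
    return true;
  }
  if (rhs == unconstrained || lhs == rhs) {
    *out = lhs;
    return true;
  }
  return false;
}

// Joins two scopes field by field; std::nullopt plays the role of "null (no join possible)".
std::optional<Scope> Join(const Scope& a, const Scope& b) {
  const std::string none;  // unconstrained value for the string-valued fields
  Scope r;
  if (!JoinField(a.device_type, b.device_type, kUnconstrainedInt, &r.device_type) ||
      !JoinField(a.virtual_device_id, b.virtual_device_id, kUnconstrainedInt,
                 &r.virtual_device_id) ||
      !JoinField(a.target, b.target, none, &r.target) ||
      !JoinField(a.memory_scope, b.memory_scope, none, &r.memory_scope)) {
    return std::nullopt;
  }
  return r;
}
```

Under these assumptions, a scope with `virtual_device_id == -1` joins with one whose id is 3, but a scope with `virtual_device_id == 0` does not. If that matches the intent, then "not significant" is just a conventional concrete value rather than a second kind of wildcard, and it may be worth saying so in the comment.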



