Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/07/16 07:07:08 UTC

[GitHub] [incubator-tvm] jcf94 opened a new pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 opened a new pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073


   



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456319743



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -143,12 +265,67 @@ void State::DoReorderStep(const ReorderStep& step) {
                      Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
 }
 
+void State::DoComputeAtStep(const ComputeAtStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound

Review comment:
       Doc updated.





[GitHub] [incubator-tvm] comaniac edited a comment on pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

comaniac edited a comment on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-659756713


   > Just noticed that #6077 is also titled as "part 1", shall we make one of them part 2 instead?
   
   Part 1 will consist of several PRs. "Phase 1" might be a better term to describe this process, but anyway.



[GitHub] [incubator-tvm] jcf94 commented on pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-659776003


   > > Just noticed that #6077 is also titled as "part 1", shall we make one of them part 2 instead?
   > 
   > Part 1 will consist of several PRs. "Phase 1" might be a better term to describe this process, but anyway.
   
   @junrushao1994 My bad, yes, I intended to express the meaning of stage/phase...
   @comaniac Let's just use phase; I'll rename these PRs.



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456320683



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")

Review comment:
       Check added, and this really helped me to find a hidden bug. :)





[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457773012



##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -82,98 +63,23 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
   inline static void Write(dmlc::JSONWriter* writer,
                            const ::tvm::Array<::tvm::auto_scheduler::Step>& data) {
     writer->BeginArray(false);
-    for (size_t i = 0; i < data.size(); ++i) {
+    for (const auto& step : data) {
       writer->WriteArraySeperator();
       writer->BeginArray(false);
-      if (auto ps = data[i].as<::tvm::auto_scheduler::ReorderStepNode>()) {
-        writer->WriteArrayItem(std::string("RE"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(IntArrayToVector(ps->after_ids));
-      } else if (auto ps = data[i].as<::tvm::auto_scheduler::SplitStepNode>()) {
-        writer->WriteArrayItem(std::string("SP"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(ps->iter_id);
-        writer->WriteArrayItem(ps->extent ? ::tvm::auto_scheduler::GetIntImm(ps->extent.value())
-                                          : 0);
-        writer->WriteArrayItem(IntArrayToVector(ps->lengths));
-        writer->WriteArrayItem(static_cast<int>(ps->inner_to_outer));
-      } else if (auto ps = data[i].as<::tvm::auto_scheduler::FuseStepNode>()) {
-        writer->WriteArrayItem(std::string("FU"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(IntArrayToVector(ps->fused_ids));
-      } else {
-        LOG(FATAL) << "Invalid step: " << data[i];
-      }
+      step->WriteToRecord(writer);
       writer->EndArray();
     }
     writer->EndArray();
   }
 
   inline static void Read(dmlc::JSONReader* reader,
                           ::tvm::Array<::tvm::auto_scheduler::Step>* data) {
-    std::vector<int> int_list;
-    bool s, inner_to_outer;
-    std::string name, scope_name, pragma_type, ti_func_name;
-    int stage_id, iter_id, extent;
-
     reader->BeginArray();
     data->clear();
     while (reader->NextArrayItem()) {
       reader->BeginArray();
-      s = reader->NextArrayItem();
-      CHECK(s);
-      reader->Read(&name);
-      if (name == "RE") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        ::tvm::Array<::tvm::Integer> after_ids;
-        for (const auto& i : int_list) {
-          after_ids.push_back(i);
-        }
-        data->push_back(::tvm::auto_scheduler::ReorderStep(stage_id, after_ids));
-      } else if (name == "SP") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&iter_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&extent);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&inner_to_outer);
-        ::tvm::Array<::tvm::Optional<::tvm::Integer>> lengths;
-        for (const auto& i : int_list) {
-          lengths.push_back(::tvm::Integer(i));
-        }
-        data->push_back(::tvm::auto_scheduler::SplitStep(
-            stage_id, iter_id, extent == 0 ? ::tvm::PrimExpr() : extent, lengths, inner_to_outer));
-      } else if (name == "FU") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        ::tvm::Array<::tvm::Integer> fused_ids;
-        for (const auto& i : int_list) {
-          fused_ids.push_back(i);
-        }
-        data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
-      } else {
-        LOG(FATAL) << "Invalid step format";
-      }
-      s = reader->NextArrayItem();
-      CHECK(!s);
+      data->push_back(::tvm::auto_scheduler::StepReadFromRecord(reader));
+      CHECK(!reader->NextArrayItem());

Review comment:
    `NextArrayItem` has side effects (i.e., it moves the reader's cursor), so I don't want to put it inside the debug-oriented `CHECK`. In my opinion, the code should still run correctly if we delete all the `CHECK`s.
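    A minimal sketch of that concern (hypothetical code, not from this PR): hoisting the side-effecting call out of the `CHECK` keeps the cursor moving even when the checks are compiled out:

        // Hypothetical sketch: advance the cursor unconditionally, and only
        // assert on the returned flag.
        bool has_next = reader->NextArrayItem();  // side effect: moves the cursor
        CHECK(!has_next) << "Expected the end of the step record array";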

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -244,285 +197,73 @@ Iterator State::unroll(int stage_id, const Iterator& it, int max_unroll) {
   AnnotationStep step =
       AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kUnroll);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
+  return step->ApplyToState(this);
 }
 
-Iterator State::bind(int stage_id, const Iterator& it, IteratorAnnotation thread_type) {
+Iterator State::vectorize(int stage_id, const Iterator& it) {
   const Stage& stage = operator->()->stages[stage_id];
-  if (thread_type < IteratorAnnotation::kVThread || thread_type > IteratorAnnotation::kThreadZ) {
-    LOG(FATAL) << "thread_type error, valid: kVThread, kBlockX, kBlockY, "
-               << "kThreadX, kThreadY, kBlockZ, kThreadZ";
-  }
-  AnnotationStep step = AnnotationStep(stage_id, GetIndex(stage->iters, it), thread_type);
+  AnnotationStep step =
+      AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kVectorize);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
-}
-
-/********** Step implementations for state **********/
-void State::DoReorderStep(const ReorderStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Array<Iterator> iters;
-  for (auto x : step->after_ids) {
-    iters.push_back(stage->iters[x]);
-  }
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id,
-                     Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeAtStep(const ComputeAtStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute at
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kIter, stage->attrs));
-  // Update attach map
-  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
-}
-
-void State::DoComputeRootStep(const ComputeRootStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute root
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kRoot, stage->attrs));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+Iterator State::fuse(int stage_id, const Array<Iterator>& iters) {
+  const Stage& stage = operator->()->stages[stage_id];
+  Array<Integer> indices;
+  GetIndices(stage->iters, iters, &indices);
+  FuseStep step = FuseStep(stage_id, indices);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeInlineStep(const ComputeInlineStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Check the validity of compute_inline
-  for (size_t i = 0; i < stage->iters.size(); ++i) {
-    CHECK_EQ(operator->()->attach_map->iter_to_attached_stages.count(
-                 std::make_pair(step->stage_id, i)),
-             0)
-        << "Invalid compute_inline: There are some other stages that are attached to the "
-        << "target stage";
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  auto new_stage = pstate->stages[step->stage_id];
-  new_stage.CopyOnWrite()->compute_at = ComputeAtKind::kInlined;
-  pstate->stages.Set(step->stage_id, std::move(new_stage));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+void State::reorder(int stage_id, const Array<Iterator>& order) {
+  const Stage& stage = operator->()->stages[stage_id];
+  CHECK_EQ(order.size(), stage->iters.size()) << "The order of all iterators "
+                                              << "should be specified";
+  Array<Integer> after_ids;
+  GetIndices(stage->iters, order, &after_ids);
+  ReorderStep step = ReorderStep(stage_id, after_ids);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-// common part for DoSplitStep, DoFollowSplitStep, and DoFollowFusedSplitStep
-Array<Iterator> State::DoSplitStepCommon(int stage_id, int iter_id,
-                                         const Array<Optional<Integer>>& lengths,
-                                         bool inner_to_outer) {
+Array<Iterator> State::split(int stage_id, const Iterator& it,
+                             const Array<Optional<Integer>>& lengths, bool inner_to_outer) {
   const Stage& stage = operator->()->stages[stage_id];
-  const Iterator& it = stage->iters[iter_id];
-  size_t old_iter_size = stage->iters.size();
-  bool concrete = true;
-
-  Optional<PrimExpr> tosplit_min, tosplit_extent;
-  if (it->range.defined()) {
-    tosplit_min = it->range->min;
-    tosplit_extent = it->range->extent;
-  } else {
-    tosplit_min = NullOpt;
-    tosplit_extent = NullOpt;
-  }
-
-  Array<Iterator> outs;
-  for (size_t i = 0; i < lengths.size(); ++i) {
-    Optional<Integer> l;
-    String name;
-    if (inner_to_outer) {
-      l = lengths[lengths.size() - i - 1];
-      name = it->name + "." + std::to_string(lengths.size() - i);
-    } else {
-      l = lengths[i];
-      name = it->name + "." + std::to_string(i);
-    }
-    Iterator res;
-    if (l && tosplit_min && tosplit_extent) {
-      res = Iterator(name, Range::FromMinExtent(tosplit_min.value(), l.value()), it->iter_kind,
-                     IteratorAnnotation::kNone);
-      tosplit_min = Integer(0);
-      tosplit_extent = indexdiv(tosplit_extent.value() + l.value() - 1, l.value());
-    } else {
-      res = Iterator(name, Range(), it->iter_kind, IteratorAnnotation::kNone);
-      tosplit_min = NullOpt;
-      tosplit_extent = NullOpt;
-      concrete = false;
-    }
-    outs.push_back(std::move(res));
-  }
-
-  Range range;
-  if (tosplit_min && tosplit_extent) {
-    range = Range::FromMinExtent(tosplit_min.value(), tosplit_extent.value());
-  }
-  if (inner_to_outer) {
-    outs.push_back(Iterator(it->name + ".0", range, it->iter_kind, IteratorAnnotation::kNone));
-    // Reverse the Iterator array
-    Array<Iterator> temp(outs.rbegin(), outs.rend());
-    outs = std::move(temp);
-  } else {
-    outs.push_back(Iterator(it->name + "." + std::to_string(lengths.size()), range, it->iter_kind,
-                            IteratorAnnotation::kNone));
-  }
-
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(), stage->iters.begin() + iter_id);
-  new_iters.insert(new_iters.end(), outs.begin(), outs.end());
-  new_iters.insert(new_iters.end(), stage->iters.begin() + iter_id + 1, stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-  pstate->concrete &= concrete;
-
-  // Two vectors are used to represent the iterator relation before and after split
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  for (size_t i = iter_id; i < old_iter_size; ++i) {
-    from_iters.emplace_back(stage_id, i);
-    to_iters.emplace_back(stage_id, i + lengths.size());
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return outs;
+  SplitStep step =
+      SplitStep(stage_id, GetIndex(stage->iters, it),
+                it->range.defined() ? it->range->extent : PrimExpr(), lengths, inner_to_outer);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-Array<Iterator> State::DoSplitStep(const SplitStep& step) {
-  return DoSplitStepCommon(step->stage_id, step->iter_id, step->lengths, step->inner_to_outer);
+void State::compute_at(int stage_id, int target_stage_id, const Iterator& target_iter) {
+  const Stage& target_stage = operator->()->stages[target_stage_id];
+  ComputeAtStep step =
+      ComputeAtStep(stage_id, target_stage_id, GetIndex(target_stage->iters, target_iter));
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoFuseStep(const FuseStep& step) {
-  int stage_id = step->stage_id;
-  const Stage& stage = operator->()->stages[stage_id];
-  size_t old_iter_size = static_cast<int>(stage->iters.size());
-
-  String new_name;
-  PrimExpr new_extent = 1;
-  IteratorKind new_iter_kind = IteratorKind::kSpecial;
-
-  for (size_t i = 0; i < step->fused_ids.size(); ++i) {
-    if (i > 0) {
-      CHECK_EQ(step->fused_ids[i]->value, step->fused_ids[i - 1]->value + 1);
-    }
-
-    if (i != step->fused_ids.size() - 1) {
-      const auto& iter_to_attached_stage = operator->()->attach_map->iter_to_attached_stages;
-      if (iter_to_attached_stage.find(std::make_pair(stage_id, step->fused_ids[i])) !=
-          iter_to_attached_stage.end()) {
-        LOG(FATAL) << "Invalid Fuse. Trying to fuse iterators that have been attached by some "
-                   << "stages. State before fusion:\n"
-                   << *this;
-      }
-    }
-
-    const Iterator& it = stage->iters[step->fused_ids[i]];
-    new_name = new_name + it->name + "@";
-
-    if (it->range.defined() && new_extent.defined()) {
-      new_extent = new_extent * it->range->extent;
-    } else {
-      new_extent = PrimExpr();
-    }
-
-    if (i == 0) {
-      new_iter_kind = it->iter_kind;
-    } else {
-      if (new_iter_kind != it->iter_kind) {
-        new_iter_kind = IteratorKind::kMixed;
-      }
-    }
-  }
-
-  Range range;
-  if (new_extent.defined()) {
-    range = Range::FromMinExtent(0, new_extent);
-  }
-  Iterator new_it = Iterator(new_name, range, new_iter_kind, IteratorAnnotation::kNone);
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(),
-                   stage->iters.begin() + step->fused_ids.front());
-  new_iters.push_back(new_it);
-  new_iters.insert(new_iters.end(), stage->iters.begin() + step->fused_ids.back() + 1,
-                   stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-
-  // Two vectors are used to represent the iterator relation before and after fuse
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  const size_t begin_id = step->fused_ids.front(), end_id = step->fused_ids.back();
-  for (size_t i = 0; i < old_iter_size; ++i) {
-    if (i <= begin_id) {
-      continue;
-    } else if (i > end_id) {
-      // move forward
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, i - end_id + begin_id);
-    } else {
-      // move to the fused id
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, begin_id);
-    }
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return new_it;
+void State::compute_inline(int stage_id) {
+  ComputeInlineStep step = ComputeInlineStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoAnnotationStep(const AnnotationStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Iterator it = stage->iters[step->iter_id];
-
-  CHECK(it->annotation == IteratorAnnotation::kNone);
-  Iterator new_it = Iterator(it->name, it->range, it->iter_kind, step->annotation);
-  Stage new_stage = stage;
-  new_stage.CopyOnWrite()->iters.Set(step->iter_id, new_it);
-  CopyOnWrite()->stages.Set(step->stage_id, std::move(new_stage));
-  return new_it;
+void State::compute_root(int stage_id) {
+  ComputeRootStep step = ComputeRootStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-void State::DoSteps(const ComputeDAG& dag) {
+void State::ApplySteps(const ComputeDAG& dag) {
   CHECK(operator->()->stages.size()) << "Invalid State with empty operation stages.";
 
+  // Call each step's ApplyToState method
   for (const auto& step : operator->()->transform_steps) {
-    if (auto ps = step.as<ReorderStepNode>()) {
-      DoReorderStep(GetRef<ReorderStep>(ps));
-    } else if (auto ps = step.as<ComputeAtStepNode>()) {
-      DoComputeAtStep(GetRef<ComputeAtStep>(ps));
-    } else if (auto ps = step.as<ComputeRootStepNode>()) {
-      DoComputeRootStep(GetRef<ComputeRootStep>(ps));
-    } else if (auto ps = step.as<ComputeInlineStepNode>()) {
-      DoComputeInlineStep(GetRef<ComputeInlineStep>(ps));
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      DoSplitStep(GetRef<SplitStep>(ps));
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      DoFuseStep(GetRef<FuseStep>(ps));
-    } else if (auto ps = step.as<AnnotationStepNode>()) {
-      DoAnnotationStep(GetRef<AnnotationStep>(ps));
-    } else {
-      LOG(FATAL) << "Invalid step: " << step;
-    }
+    StepApplyToState(step, this, dag);

Review comment:
    1. We can remove `StepApplyToState`.
    2. `dag` will be used in other steps such as cache_write and rfactor.





[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

FrozenGene commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457823642



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -270,19 +270,9 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
   }
 
   // Apply the history steps to TVM schedule
+  // Call each step's ApplyToSchedule method
   for (const auto& step : transform_steps) {
-    // Call each step's ApplyToSchedule method
-    // Note: some steps have extra parameters that must be passed and they may need different
-    // return value, so the ApplyToSchedule is not able to be merged to single interface
-    if (auto ps = step.as<ReorderStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else {
-      LOG(FATAL) << "Invalid Step";
-    }
+    StepApplyToSchedule(step, stages, stage_to_axes);

Review comment:
    Why don't we name it `ApplyStepToSchedule`?

##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -326,15 +316,7 @@ String ComputeDAG::PrintStepsAsPython(const Array<Step>& transform_steps) const
   }
   // Call each step's PrintAsPythonAPI method
   for (const auto& step : transform_steps) {
-    if (auto ps = step.as<ReorderStepNode>()) {
-      ss << ps->PrintAsPythonAPI(&stages, &stage_to_axes);
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      ss << ps->PrintAsPythonAPI(&stages, &stage_to_axes);
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      ss << ps->PrintAsPythonAPI(&stages, &stage_to_axes);
-    } else {
-      LOG(FATAL) << "Invalid Step";
-    }
+    ss << StepPrintAsPythonAPI(step, &stages, &stage_to_axes);

Review comment:
    Ditto: `PrintStepAsPythonAPI`.





[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456181569



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \

Review comment:
    I agree, but I failed to figure out a better way here. At least the output string of the state will not be referenced by other TVM parts outside the `auto_scheduler` dir.
   @merrymercy Any comment?





[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457796865



##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -51,129 +53,539 @@ const char* IteratorAnnotationString[] = {
     "tensorize"     // kTensorized = 11
 };
 
-/********** Reorder **********/
-ReorderStep::ReorderStep(int stage_id, const Array<Integer>& after_ids) {
-  auto node = make_object<ReorderStepNode>();
-  node->stage_id = stage_id;
-  for (const auto& x : after_ids) {
-    CHECK(x->IsInstance<IntImmNode>());
+Step StepReadFromRecord(dmlc::JSONReader* reader) {
+  std::string name;
+  CHECK(reader->NextArrayItem());
+  reader->Read(&name);
+  if (name == AnnotationStepNode::record_prefix_str) {
+    return AnnotationStep(reader);
+  } else if (name == FuseStepNode::record_prefix_str) {
+    return FuseStep(reader);
+  } else if (name == ReorderStepNode::record_prefix_str) {
+    return ReorderStep(reader);
+  } else if (name == SplitStepNode::record_prefix_str) {
+    return SplitStep(reader);
+  } else if (name == ComputeAtStepNode::record_prefix_str) {
+    return ComputeAtStep(reader);
+  } else if (name == ComputeInlineStepNode::record_prefix_str) {
+    return ComputeInlineStep(reader);
+  } else if (name == ComputeRootStepNode::record_prefix_str) {
+    return ComputeRootStep(reader);
+  } else {
+    LOG(FATAL) << "Invalid step format: " << name;
   }
-  node->after_ids = after_ids;
-  data_ = std::move(node);
+  return Step();
 }
 
-void ReorderStepNode::ApplyToSchedule(Array<te::Stage>* stages,
-                                      StageToAxesMap* stage_to_axes) const {
-  auto stage = (*stages)[stage_id];
-  const Array<IterVar>& axes = stage_to_axes->at(stage);
-  CHECK_EQ(after_ids.size(), axes.size());
-
-  Array<IterVar> new_axes;
-  new_axes.reserve(axes.size());
-  for (auto i : after_ids) {
-    new_axes.push_back(axes[i]);
+void StepApplyToState(const Step& step, State* state, const ComputeDAG& dag) {
+  if (auto ps = step.as<AnnotationStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<FuseStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ReorderStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<SplitStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeAtStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeInlineStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeRootStepNode>()) {
+    ps->ApplyToState(state);
+  } else {
+    LOG(FATAL) << "Invalid step: " << step;
   }
-  stage.reorder(new_axes);
-
-  stage_to_axes->Set(stage, std::move(new_axes));
-  stages->Set(stage_id, std::move(stage));
 }
 
-String ReorderStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
-                                         StageToAxesMap* stage_to_axes) const {
-  const auto& stage = (*stages)[stage_id];
-  std::stringstream ss;
-
-  ss << "s[" << CleanName(stage->op->name) << "].reorder(";
-  for (size_t i = 0; i < after_ids.size(); ++i) {
-    ss << CleanName((*stage_to_axes)[stage][after_ids[i]]->var->name_hint);
-    if (i != after_ids.size() - 1) {
-      ss << ", ";
-    }
+void StepApplyToSchedule(const Step& step, Array<te::Stage>* stages,

Review comment:
    Emm, `step->ApplyToState`, `step->ApplyToSchedule` & `step->PrintAsPythonAPI` all have problems being written as virtual functions; we can try later to see whether there's a better way to deal with this.
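    For illustration, the signatures below are inferred from the diffs in this thread; the mismatched return types are what block a single pure-virtual method on the `Step` base class:

        // Illustrative sketch (inferred from the diffs above, not a proposal):
        Iterator        AnnotationStepNode::ApplyToState(State* state) const;
        Array<Iterator> SplitStepNode::ApplyToState(State* state) const;
        Iterator        FuseStepNode::ApplyToState(State* state) const;
        void            ReorderStepNode::ApplyToState(State* state) const;
        // A common `virtual void ApplyToState(State*) const = 0;` would drop
        // the returned iterators that callers such as State::split rely on.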





[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457785421



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -324,17 +314,9 @@ String ComputeDAG::PrintStepsAsPython(const Array<Step>& transform_steps) const
          << "tuple(" << stage->op->name << ".op.reduce_axis)\n";
     }
   }
-  // Call each step's PrintAsPythonAPI method
+  // Call each step's ApplyToPythonAPI method

Review comment:
       `PrintAsPythonAPI` is a better name in my opinion





[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456228059



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
    We should add them, because other policies can use them.





[GitHub] [incubator-tvm] jcf94 commented on pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

jcf94 commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-662195808


   > Given that the issue of 3 dispatching functions will be addressed later, this PR LGTM. Thanks for the efforts!
   
   Thanks, updated our merge plan to track this issue: https://github.com/merrymercy/Ansor/issues/65



[GitHub] [incubator-tvm] comaniac commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

comaniac commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456141719



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  CHECK_EQ(old_iters.size(), new_iters.size());
+  for (size_t i = 0; i < old_iters.size(); ++i) {
+    auto entry = pnode->iter_to_attached_stages.find(old_iters[i]);
+    if (entry == pnode->iter_to_attached_stages.end()) {
+      continue;
+    }
+
+    // Replace iter in the value of `stage_to_attach_iter`
+    for (const auto& s : entry->second) {
+      pnode->stage_to_attach_iter[s] = new_iters[i];
+    }
+
+    // Replace iter in the key of `iter_to_attached_stages`

Review comment:
       Maybe "Remove the old iter from iter_to_attached_stages and add the new iter" might be better.

##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -169,6 +206,18 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
           fused_ids.push_back(i);
         }
         data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
+      } else if (name == "AN") {

Review comment:
    I feel the current approach of (de)serializing records is not scalable and is hard to maintain. Specifically, we put all the deserialization rules in `Read` and all the serialization rules in `Write`. The key that connects the serialization logic to its corresponding deserialization logic is a two-character short name (e.g., AN). It seems to me that it would be better to define a short name and the (de)serialization logic in each step. In that case, we can use `ps->serialize(writer)` for serialization and build a step factory to deserialize them.
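    A minimal sketch of that factory idea (consistent with the `StepReadFromRecord` / `WriteToRecord` pattern that appears later in this thread; member layout assumed):

        // Each step owns its short name and (de)serialization logic.
        class FuseStepNode : public StepNode {
         public:
          static constexpr const char* record_prefix_str = "FU";
          void WriteToRecord(dmlc::JSONWriter* writer) const;  // serialization
        };

        // A factory reads the short name first, then dispatches to the
        // matching step's deserializing constructor.
        Step StepReadFromRecord(dmlc::JSONReader* reader) {
          std::string name;
          CHECK(reader->NextArrayItem());
          reader->Read(&name);
          if (name == FuseStepNode::record_prefix_str) {
            return FuseStep(reader);
          }
          LOG(FATAL) << "Invalid step format: " << name;
          return Step();
        }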
   

##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y
+
+        Returns
+        -------
+        res_it : Iterator
+            The binded Iterator.
+        """
+        trans_table = {
+            "vthread": 4,
+            "blockIdx.x": 5,
+            "threadIdx.x": 6,
+            "blockIdx.y": 7,
+            "threadIdx.y": 8,
+        }

Review comment:
       We should make this dict static.

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -143,12 +265,67 @@ void State::DoReorderStep(const ReorderStep& step) {
                      Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
 }
 
+void State::DoComputeAtStep(const ComputeAtStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound

Review comment:
       This comment should be in the `compute_at()` function definition in `loop_state.h` because this is more like a side effect of this function.
   
   In addition, I think it's fine to say we intentionally removed the loop sizes of that state when doing this step, since they are not accurate anymore after compute_at. Run ComputeDAG::InferBound again to recover the loop sizes.
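   A hypothetical usage sketch of that recovery (assuming a `dag` and a `state` as in the surrounding code; the exact `InferBound` signature is assumed):

        // Sketch: compute_at cleared the iterator ranges; bound inference
        // recomputes the concrete loop sizes.
        State bounded_state = dag.InferBound(state);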

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -264,6 +462,38 @@ Iterator State::DoFuseStep(const FuseStep& step) {
   pstate->stages.Set(stage_id,
                      Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
 
+  // We have to update the iterator relations in attach map, these two vectors keep the replacement
+  // mapping

Review comment:
       ditto.

##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)

Review comment:
       Remove this line.

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  CHECK_EQ(old_iters.size(), new_iters.size());
+  for (size_t i = 0; i < old_iters.size(); ++i) {
+    auto entry = pnode->iter_to_attached_stages.find(old_iters[i]);
+    if (entry == pnode->iter_to_attached_stages.end()) {
+      continue;
+    }
+
+    // Replace iter in the value of `stage_to_attach_iter`

Review comment:
       Maybe "Update the attaching target of an old iter to the new iter" might be clearer.

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {

Review comment:
    I guess you need to guarantee that the lengths of both vectors are the same?

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -143,12 +265,67 @@ void State::DoReorderStep(const ReorderStep& step) {
                      Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
 }
 
+void State::DoComputeAtStep(const ComputeAtStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound
+  Array<Iterator> new_iters;
+  for (const Iterator& it : stage->iters) {
+    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
+  }
+
+  StateNode* pstate = CopyOnWrite();
+  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
+                                           ComputeAtKind::kIter, stage->attrs));
+  // Update attach map
+  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
+}
+
+void State::DoComputeRootStep(const ComputeRootStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound

Review comment:
       ditto.

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  CHECK_EQ(old_iters.size(), new_iters.size());
+  for (size_t i = 0; i < old_iters.size(); ++i) {
+    auto entry = pnode->iter_to_attached_stages.find(old_iters[i]);
+    if (entry == pnode->iter_to_attached_stages.end()) {
+      continue;
+    }

Review comment:
    Add a comment saying that we are skipping the old iterators that have no stage attached.

##########
File path: src/auto_scheduler/loop_state.h
##########
@@ -217,6 +196,68 @@ class Stage : public ObjectRef {
   TVM_DEFINE_OBJECT_REF_COW_METHOD(StageNode);
 };
 
+/*! \brief Use stage_id to represent a stage. */
+using StageKey = int;
+/*! \brief Use stage_id and iter_id to represent a iterator. */
+using IterKey = std::pair<int, int>;
+
+/*!
+ * \brief stores the compute_at relation between stages
+ * This stores a bi-directional mapping from stages and iter:
+ * 1. Stage to its attached iterator
+ * 2. Iterator to the stage attached to it
+ * You can use AttachMapNode::stage_to_attach_iter and AttachMapNode::iter_to_attached_stages
+ * to query the relations
+ */
+class AttachMapNode : public Object {
+ public:
+  /*! \brief A Map to store the mapping of stage to its attached iterator. */
+  std::unordered_map<StageKey, IterKey> stage_to_attach_iter;
+  /*! \brief A Map to store the mapping of iterator to the stage attached to it. */
+  std::unordered_map<IterKey, std::vector<StageKey>> iter_to_attached_stages;
+
+  static constexpr const char* _type_key = "auto_scheduler.AttachMap";
+  TVM_DECLARE_FINAL_OBJECT_INFO(AttachMapNode, Object);
+};
+
+/*!
+ * \brief Managed reference to AttachMapNode.
+ * \sa AttachMapNode
+ */
+class AttachMap : public ObjectRef {
+ public:
+  /*!
+   * \brief Process the stage/iterator mapping after compute at.
+   * \param stage_id The index of the stage to be compute at.
+   * \param target_stage_id The index of stage that this step will compute at to.
+   * \param target_iter_id The index of iterator in target stage that this step will compute at to.
+   */
+  void SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id);
+  /*!
+   * \brief This is a public wrapper of `DeleteStageEntry`. To delete the entry of a specific stage.
+   * \param stage_id The index of the stage to be compute at.
+   */
+  void DeleteStage(int stage_id);
+  /*!
+   * \brief Update the iterator relations in AttachMap.

Review comment:
       To my understanding, this function attempts to update from (stage -> old_iter) to (stage -> new_iter). It needs a clearer description.
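    For illustration, a hypothetical call (indices invented for the example): if a split adds two new iterators before position 3 of stage 2, an attachment recorded at iterator `(2, 3)` must be renumbered to `(2, 5)`:

        std::vector<IterKey> old_iters = {{2, 3}};
        std::vector<IterKey> new_iters = {{2, 5}};
        // Rewrites both directions of the attach map: (2, 3) -> (2, 5).
        attach_map.UpdateIters(old_iters, new_iters);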

##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")

Review comment:
    Should we check their annotations?

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -143,12 +265,67 @@ void State::DoReorderStep(const ReorderStep& step) {
                      Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
 }
 
+void State::DoComputeAtStep(const ComputeAtStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound
+  Array<Iterator> new_iters;
+  for (const Iterator& it : stage->iters) {
+    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
+  }
+
+  StateNode* pstate = CopyOnWrite();
+  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
+                                           ComputeAtKind::kIter, stage->attrs));
+  // Update attach map
+  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
+}
+
+void State::DoComputeRootStep(const ComputeRootStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_root, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound
+  Array<Iterator> new_iters;
+  for (const Iterator& it : stage->iters) {
+    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
+  }
+
+  StateNode* pstate = CopyOnWrite();
+  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
+                                           ComputeAtKind::kRoot, stage->attrs));
+  // Update attach map
+  pstate->attach_map.DeleteStage(step->stage_id);
+}
+
+void State::DoComputeInlineStep(const ComputeInlineStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // CHECK the validity of compute_inline
+  for (size_t i = 0; i < stage->iters.size(); ++i) {
+    CHECK_EQ(operator->()->attach_map->iter_to_attached_stages.count(
+                 std::make_pair(step->stage_id, i)),
+             0)
+        << "Invalid compute_inline: Because there are some other stages "

Review comment:
       Remove "Because"

##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \

Review comment:
       I feel that it's not a good idea to check the correctness using printed strings...

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -143,12 +265,67 @@ void State::DoReorderStep(const ReorderStep& step) {
                      Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
 }
 
+void State::DoComputeAtStep(const ComputeAtStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_at, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound
+  Array<Iterator> new_iters;
+  for (const Iterator& it : stage->iters) {
+    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
+  }
+
+  StateNode* pstate = CopyOnWrite();
+  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
+                                           ComputeAtKind::kIter, stage->attrs));
+  // Update attach map
+  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
+}
+
+void State::DoComputeRootStep(const ComputeRootStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // After compute_root, we don't know the accurate length information any more
+  // If we do want to know the accurate lengths, we can call ComputeDAG::InferBound
+  Array<Iterator> new_iters;
+  for (const Iterator& it : stage->iters) {
+    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
+  }
+
+  StateNode* pstate = CopyOnWrite();
+  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
+                                           ComputeAtKind::kRoot, stage->attrs));
+  // Update attach map
+  pstate->attach_map.DeleteStage(step->stage_id);
+}
+
+void State::DoComputeInlineStep(const ComputeInlineStep& step) {
+  const Stage& stage = operator->()->stages[step->stage_id];
+
+  // CHECK the validity of compute_inline

Review comment:
       s/CHECK/Check
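
   For reference, a small sketch of the situation this check rejects, reusing the `conv`/`relu` stages from the conv2d test in this PR:

   ```python
   # Attach conv to an iterator of relu, then try to inline relu itself.
   # relu now has a stage attached to one of its iterators, so the CHECK
   # above fires and surfaces in Python as a TVMError.
   s0.compute_at(conv, relu, s0[relu].iters[2])
   try:
       s0.compute_inline(relu)
   except tvm.TVMError:
       pass  # expected: relu cannot be inlined while conv is attached to it
   ```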

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -210,6 +387,16 @@ Array<Iterator> State::DoSplitStepCommon(int stage_id, int iter_id,
                      Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
   pstate->concrete &= concrete;
 
+  // We have to update the iterator relations in attach map, these two vectors keep the replacement
+  // mapping

Review comment:
       ```suggestion
     // Use two vectors to represent the iterator relation before and after the split
     // in order to update the attach_map
   ```
   
   We may also need to mention in the split function definition in `loop_state.h` that if the attached target iterator is split, then the innermost split iterator would become the new attached target.
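
   Concretely, the behavior being described looks like this at the Python level (a sketch using the `E`/`F` stages from the record test in this PR):

   ```python
   # E is computed at the first iterator of F; splitting that iterator
   # remaps the attach point to the innermost of the new iterators.
   s.compute_at(E, F, s[F].iters[0])
   its = s.split(F, s[F].iters[0], [2])  # -> [F.i.0, F.i.1]
   # E is now attached at its[-1], i.e. the innermost split result
   ```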

##########
File path: tests/python/unittest/test_auto_scheduler_measure.py
##########
@@ -18,14 +18,51 @@
 """ Test measurement and log serialization. """
 
 import tvm
-from tvm import auto_scheduler
+import topi
+from tvm import te, auto_scheduler
 import tempfile
 
 from test_auto_scheduler_common import get_tiled_matmul
 
 
 def test_record():
-    dag, s = get_tiled_matmul()
+    A = te.placeholder((512, 512), name='A')

Review comment:
       Move the checker in L67 to here. If LLVM is disabled, we don't even have to build this DAG.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 edited a comment on pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 edited a comment on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-660786809


   Moved all the implementation to transform_step.h/cc, and updated the order of steps.
   With the current implementation, future PRs that add new steps will only need to change the two files transform_step.h/cc.
   cc @merrymercy @junrushao1994 @comaniac @FrozenGene 
   ... I'm not sure why I failed to re-request review of @junrushao1994 and @comaniac


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456321060



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \
+        "Placeholder: Data, Kernel, Bias, Bn_scale, Bn_offset\n" + \
+        "for i1 (0,3)\n" + \
+        "  for i2 (0,230)\n" + \
+        "    for i3 (0,230)\n" + \
+        "      pad_temp = ...\n" + \
+        "for i1 (0,64)\n" + \
+        "  for i2 (0,112)\n" + \
+        "    for nn (None)\n" + \
+        "      for ff (None)\n" + \
+        "        for yy (None)\n" + \
+        "          for xx (None)\n" + \
+        "            for rc (None)\n" + \
+        "              for ry (None)\n" + \
+        "                for rx (None)\n" + \
+        "                  compute = ...\n" + \
+        "    for i3 (0,112)\n" + \
+        "      compute = ...\n"

Review comment:
       Updated.
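
   For instance, the string comparison above can be replaced by structural checks on the state itself; a sketch derived from the printed loop nest:

   ```python
   # conv keeps exactly the seven loops shown above, with bounds cleared
   # by compute_at (extents print as "None" until InferBound is run)
   assert [it.name for it in s0[conv].iters] == \
       ["nn", "ff", "yy", "xx", "rc", "ry", "rx"]
   assert all(it.range is None for it in s0[conv].iters)
   ```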




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456318718



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be bound, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be bound.
+        thread_name : str
+            The thread type to be bound. Currently supported:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
       blockIdx.z & threadIdx.z added.
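
   On the `max_unroll` parameter documented above, note that it acts as a guard rather than a hard error; a sketch of the documented semantics, using a stage `C` as in the tests in this PR:

   ```python
   it = s[C].iters[0]              # suppose this iterator has extent 512
   s.unroll(C, it, max_unroll=16)  # 512 > 16, so the annotation is skipped
   s.unroll(C, it)                 # no limit given: the iterator is unrolled
   ```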




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456772818



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \

Review comment:
       Updated the UT to check each iterator




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456844187



##########
File path: tests/python/unittest/test_auto_scheduler_measure.py
##########
@@ -18,17 +18,55 @@
 """ Test measurement and log serialization. """
 
 import tvm
-from tvm import auto_scheduler
+import topi
+from tvm import te, auto_scheduler
 import tempfile
 
 from test_auto_scheduler_common import get_tiled_matmul
 
 
 def test_record():
-    dag, s = get_tiled_matmul()
-
     if not tvm.runtime.enabled("llvm"):
         return
+
+    A = te.placeholder((512, 512), name='A')
+    B = te.placeholder((512, 512), name='B')
+    k = te.reduce_axis((0, 512), name='k')
+    C = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * B[k][j], axis=[k]), name='C')
+    D = topi.nn.relu(C)
+    k = te.reduce_axis((0, 512), name='k')
+    E = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * D[k][j], axis=[k]), name='E')
+    F = topi.nn.relu(E)
+
+    dag = auto_scheduler.ComputeDAG([A, B, F])
+    s = dag.get_init_state()
+
+    # Split
+    its0 = s.split(C, s[C].iters[0], [4, 8, 8])
+    its1 = s.split(C, s[C].iters[4], [8, 4, 4])
+    # Reorder
+    s.reorder(C, [its0[0], its1[0], its0[1], its1[1], its0[2], its1[2], its0[3], s[C].iters[8],
+                  its1[3]])
+    # Fuse
+    s.fuse(C, [s[C].iters[0], s[C].iters[1], s[C].iters[2]])
+    # Compute at
+    s.split(F, s[F].iters[0], [2])
+    s.compute_at(E, F, s[F].iters[0])
+    # Compute inline
+    s.compute_inline(D)
+    # Compute root
+    s.compute_root(D)
+    # Parallel
+    s.parallel(C, s[C].iters[0])
+    # Thread bind
+    s.bind(C, s[C].iters[1], "blockIdx.x")

Review comment:
       This test is for measure records; I added the different kinds of steps here to check whether they can be serialized to a record and reloaded from it successfully.
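
   In outline, the round trip being exercised looks like this, continuing from the test above (a sketch; the record I/O helper names below follow the measure_record module but may differ slightly in this revision):

   ```python
   import tempfile
   import time

   task = auto_scheduler.SearchTask(dag, "test", tvm.target.create("llvm"))
   inp = auto_scheduler.measure.MeasureInput(task, s)
   res = auto_scheduler.measure.MeasureResult([0.1], 0, "", 0.2, time.time())

   with tempfile.NamedTemporaryFile() as fp:
       auto_scheduler.save_records(fp.name, [inp], [res])
       inputs, _ = auto_scheduler.RecordReader(fp.name).read_lines()
       # Replaying the deserialized steps must rebuild the same schedule
       s_loaded = dag.infer_bound_from_state(inputs[0].state)
       s_orig = dag.infer_bound_from_state(s)
       assert str(s_loaded) == str(s_orig)
   ```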




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457795516



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -324,17 +314,9 @@ String ComputeDAG::PrintStepsAsPython(const Array<Step>& transform_steps) const
          << "tuple(" << stage->op->name << ".op.reduce_axis)\n";
     }
   }
-  // Call each step's PrintAsPythonAPI method
+  // Call each step's ApplyToPythonAPI method

Review comment:
       Reverted to PrintAsPythonAPI.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456769902



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -276,10 +276,18 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
     // return value, so the ApplyToSchedule is not able to be merged to single interface
     if (auto ps = step.as<ReorderStepNode>()) {
       ps->ApplyToSchedule(stages, stage_to_axes);
+    } else if (auto ps = step.as<ComputeAtStepNode>()) {

Review comment:
       Ok, in the .h & .py I was trying to place the functions with similar return types together:
   
   void reorder
   void compute_at
   void compute_root
   void compute_inline
   Array<Iterator> split
   Iterator fuse
   Iterator vectorize
   Iterator parallel
   Iterator unroll
   Iterator bind
   
   It's fine to change them to the order you suggested.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457782201



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -244,285 +197,73 @@ Iterator State::unroll(int stage_id, const Iterator& it, int max_unroll) {
   AnnotationStep step =
       AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kUnroll);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
+  return step->ApplyToState(this);
 }
 
-Iterator State::bind(int stage_id, const Iterator& it, IteratorAnnotation thread_type) {
+Iterator State::vectorize(int stage_id, const Iterator& it) {
   const Stage& stage = operator->()->stages[stage_id];
-  if (thread_type < IteratorAnnotation::kVThread || thread_type > IteratorAnnotation::kThreadZ) {
-    LOG(FATAL) << "thread_type error, valid: kVThread, kBlockX, kBlockY, "
-               << "kThreadX, kThreadY, kBlockZ, kThreadZ";
-  }
-  AnnotationStep step = AnnotationStep(stage_id, GetIndex(stage->iters, it), thread_type);
+  AnnotationStep step =
+      AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kVectorize);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
-}
-
-/********** Step implementations for state **********/
-void State::DoReorderStep(const ReorderStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Array<Iterator> iters;
-  for (auto x : step->after_ids) {
-    iters.push_back(stage->iters[x]);
-  }
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id,
-                     Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeAtStep(const ComputeAtStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute at
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kIter, stage->attrs));
-  // Update attach map
-  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
-}
-
-void State::DoComputeRootStep(const ComputeRootStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute root
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kRoot, stage->attrs));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+Iterator State::fuse(int stage_id, const Array<Iterator>& iters) {
+  const Stage& stage = operator->()->stages[stage_id];
+  Array<Integer> indices;
+  GetIndices(stage->iters, iters, &indices);
+  FuseStep step = FuseStep(stage_id, indices);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeInlineStep(const ComputeInlineStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Check the validity of compute_inline
-  for (size_t i = 0; i < stage->iters.size(); ++i) {
-    CHECK_EQ(operator->()->attach_map->iter_to_attached_stages.count(
-                 std::make_pair(step->stage_id, i)),
-             0)
-        << "Invalid compute_inline: There are some other stages that are attached to the "
-        << "target stage";
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  auto new_stage = pstate->stages[step->stage_id];
-  new_stage.CopyOnWrite()->compute_at = ComputeAtKind::kInlined;
-  pstate->stages.Set(step->stage_id, std::move(new_stage));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+void State::reorder(int stage_id, const Array<Iterator>& order) {
+  const Stage& stage = operator->()->stages[stage_id];
+  CHECK_EQ(order.size(), stage->iters.size()) << "The order of all iterators "
+                                              << "should be specified";
+  Array<Integer> after_ids;
+  GetIndices(stage->iters, order, &after_ids);
+  ReorderStep step = ReorderStep(stage_id, after_ids);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-// common part for DoSplitStep, DoFollowSplitStep, and DoFollowFusedSplitStep
-Array<Iterator> State::DoSplitStepCommon(int stage_id, int iter_id,
-                                         const Array<Optional<Integer>>& lengths,
-                                         bool inner_to_outer) {
+Array<Iterator> State::split(int stage_id, const Iterator& it,
+                             const Array<Optional<Integer>>& lengths, bool inner_to_outer) {
   const Stage& stage = operator->()->stages[stage_id];
-  const Iterator& it = stage->iters[iter_id];
-  size_t old_iter_size = stage->iters.size();
-  bool concrete = true;
-
-  Optional<PrimExpr> tosplit_min, tosplit_extent;
-  if (it->range.defined()) {
-    tosplit_min = it->range->min;
-    tosplit_extent = it->range->extent;
-  } else {
-    tosplit_min = NullOpt;
-    tosplit_extent = NullOpt;
-  }
-
-  Array<Iterator> outs;
-  for (size_t i = 0; i < lengths.size(); ++i) {
-    Optional<Integer> l;
-    String name;
-    if (inner_to_outer) {
-      l = lengths[lengths.size() - i - 1];
-      name = it->name + "." + std::to_string(lengths.size() - i);
-    } else {
-      l = lengths[i];
-      name = it->name + "." + std::to_string(i);
-    }
-    Iterator res;
-    if (l && tosplit_min && tosplit_extent) {
-      res = Iterator(name, Range::FromMinExtent(tosplit_min.value(), l.value()), it->iter_kind,
-                     IteratorAnnotation::kNone);
-      tosplit_min = Integer(0);
-      tosplit_extent = indexdiv(tosplit_extent.value() + l.value() - 1, l.value());
-    } else {
-      res = Iterator(name, Range(), it->iter_kind, IteratorAnnotation::kNone);
-      tosplit_min = NullOpt;
-      tosplit_extent = NullOpt;
-      concrete = false;
-    }
-    outs.push_back(std::move(res));
-  }
-
-  Range range;
-  if (tosplit_min && tosplit_extent) {
-    range = Range::FromMinExtent(tosplit_min.value(), tosplit_extent.value());
-  }
-  if (inner_to_outer) {
-    outs.push_back(Iterator(it->name + ".0", range, it->iter_kind, IteratorAnnotation::kNone));
-    // Reverse the Iterator array
-    Array<Iterator> temp(outs.rbegin(), outs.rend());
-    outs = std::move(temp);
-  } else {
-    outs.push_back(Iterator(it->name + "." + std::to_string(lengths.size()), range, it->iter_kind,
-                            IteratorAnnotation::kNone));
-  }
-
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(), stage->iters.begin() + iter_id);
-  new_iters.insert(new_iters.end(), outs.begin(), outs.end());
-  new_iters.insert(new_iters.end(), stage->iters.begin() + iter_id + 1, stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-  pstate->concrete &= concrete;
-
-  // Two vectors are used to represent the iterator relation before and after split
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  for (size_t i = iter_id; i < old_iter_size; ++i) {
-    from_iters.emplace_back(stage_id, i);
-    to_iters.emplace_back(stage_id, i + lengths.size());
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return outs;
+  SplitStep step =
+      SplitStep(stage_id, GetIndex(stage->iters, it),
+                it->range.defined() ? it->range->extent : PrimExpr(), lengths, inner_to_outer);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-Array<Iterator> State::DoSplitStep(const SplitStep& step) {
-  return DoSplitStepCommon(step->stage_id, step->iter_id, step->lengths, step->inner_to_outer);
+void State::compute_at(int stage_id, int target_stage_id, const Iterator& target_iter) {
+  const Stage& target_stage = operator->()->stages[target_stage_id];
+  ComputeAtStep step =
+      ComputeAtStep(stage_id, target_stage_id, GetIndex(target_stage->iters, target_iter));
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoFuseStep(const FuseStep& step) {
-  int stage_id = step->stage_id;
-  const Stage& stage = operator->()->stages[stage_id];
-  size_t old_iter_size = static_cast<int>(stage->iters.size());
-
-  String new_name;
-  PrimExpr new_extent = 1;
-  IteratorKind new_iter_kind = IteratorKind::kSpecial;
-
-  for (size_t i = 0; i < step->fused_ids.size(); ++i) {
-    if (i > 0) {
-      CHECK_EQ(step->fused_ids[i]->value, step->fused_ids[i - 1]->value + 1);
-    }
-
-    if (i != step->fused_ids.size() - 1) {
-      const auto& iter_to_attached_stage = operator->()->attach_map->iter_to_attached_stages;
-      if (iter_to_attached_stage.find(std::make_pair(stage_id, step->fused_ids[i])) !=
-          iter_to_attached_stage.end()) {
-        LOG(FATAL) << "Invalid Fuse. Trying to fuse iterators that have been attached by some "
-                   << "stages. State before fusion:\n"
-                   << *this;
-      }
-    }
-
-    const Iterator& it = stage->iters[step->fused_ids[i]];
-    new_name = new_name + it->name + "@";
-
-    if (it->range.defined() && new_extent.defined()) {
-      new_extent = new_extent * it->range->extent;
-    } else {
-      new_extent = PrimExpr();
-    }
-
-    if (i == 0) {
-      new_iter_kind = it->iter_kind;
-    } else {
-      if (new_iter_kind != it->iter_kind) {
-        new_iter_kind = IteratorKind::kMixed;
-      }
-    }
-  }
-
-  Range range;
-  if (new_extent.defined()) {
-    range = Range::FromMinExtent(0, new_extent);
-  }
-  Iterator new_it = Iterator(new_name, range, new_iter_kind, IteratorAnnotation::kNone);
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(),
-                   stage->iters.begin() + step->fused_ids.front());
-  new_iters.push_back(new_it);
-  new_iters.insert(new_iters.end(), stage->iters.begin() + step->fused_ids.back() + 1,
-                   stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-
-  // Two vectors are used to represent the iterator relation before and after fuse
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  const size_t begin_id = step->fused_ids.front(), end_id = step->fused_ids.back();
-  for (size_t i = 0; i < old_iter_size; ++i) {
-    if (i <= begin_id) {
-      continue;
-    } else if (i > end_id) {
-      // move forward
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, i - end_id + begin_id);
-    } else {
-      // move to the fused id
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, begin_id);
-    }
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return new_it;
+void State::compute_inline(int stage_id) {
+  ComputeInlineStep step = ComputeInlineStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoAnnotationStep(const AnnotationStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Iterator it = stage->iters[step->iter_id];
-
-  CHECK(it->annotation == IteratorAnnotation::kNone);
-  Iterator new_it = Iterator(it->name, it->range, it->iter_kind, step->annotation);
-  Stage new_stage = stage;
-  new_stage.CopyOnWrite()->iters.Set(step->iter_id, new_it);
-  CopyOnWrite()->stages.Set(step->stage_id, std::move(new_stage));
-  return new_it;
+void State::compute_root(int stage_id) {
+  ComputeRootStep step = ComputeRootStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-void State::DoSteps(const ComputeDAG& dag) {
+void State::ApplySteps(const ComputeDAG& dag) {
   CHECK(operator->()->stages.size()) << "Invalid State with empty operation stages.";
 
+  // Call each step's ApplyToState method
   for (const auto& step : operator->()->transform_steps) {
-    if (auto ps = step.as<ReorderStepNode>()) {
-      DoReorderStep(GetRef<ReorderStep>(ps));
-    } else if (auto ps = step.as<ComputeAtStepNode>()) {
-      DoComputeAtStep(GetRef<ComputeAtStep>(ps));
-    } else if (auto ps = step.as<ComputeRootStepNode>()) {
-      DoComputeRootStep(GetRef<ComputeRootStep>(ps));
-    } else if (auto ps = step.as<ComputeInlineStepNode>()) {
-      DoComputeInlineStep(GetRef<ComputeInlineStep>(ps));
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      DoSplitStep(GetRef<SplitStep>(ps));
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      DoFuseStep(GetRef<FuseStep>(ps));
-    } else if (auto ps = step.as<AnnotationStepNode>()) {
-      DoAnnotationStep(GetRef<AnnotationStep>(ps));
-    } else {
-      LOG(FATAL) << "Invalid step: " << step;
-    }
+    StepApplyToState(step, this, dag);

Review comment:
       The `step->ApplyToState()` is actually not a virtual function (the return values of different steps may differ), so I added `StepApplyToState` to pack all those step type checks into `transform_step.cc`.
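
   A conceptual model of that dispatcher, written in Python purely for illustration (the real code is the C++ free function described above):

   ```python
   def step_apply_to_state(step, state, dag):
       # ApplyToState cannot be a single virtual method because the concrete
       # steps return different things, so a free function dispatches on type.
       if isinstance(step, ReorderStepNode):
           step.apply_to_state(state)              # returns nothing
       elif isinstance(step, SplitStepNode):
           return step.apply_to_state(state)       # returns a list of Iterators
       elif isinstance(step, (FuseStepNode, AnnotationStepNode)):
           return step.apply_to_state(state)       # returns a single Iterator
       elif isinstance(step, (ComputeAtStepNode, ComputeRootStepNode,
                              ComputeInlineStepNode)):
           step.apply_to_state(state)
       else:
           raise ValueError("Invalid step: %s" % step)
   ```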




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] junrushao1994 commented on pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-659627481


   Just noticed that #6077 is also titled as "part 1", shall we make one of them part 2 instead?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457777810



##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -51,129 +53,539 @@ const char* IteratorAnnotationString[] = {
     "tensorize"     // kTensorized = 11
 };
 
-/********** Reorder **********/
-ReorderStep::ReorderStep(int stage_id, const Array<Integer>& after_ids) {
-  auto node = make_object<ReorderStepNode>();
-  node->stage_id = stage_id;
-  for (const auto& x : after_ids) {
-    CHECK(x->IsInstance<IntImmNode>());
+Step StepReadFromRecord(dmlc::JSONReader* reader) {

Review comment:
       1. Yes, we can add a registration mechanism. But I think it can be done in future PRs.
   2. The current records do have a version field 
   https://github.com/apache/incubator-tvm/blob/7ac78d96611ab3a069fae9eb5210e433eb240cc5/src/auto_scheduler/measure_record.cc#L314




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] comaniac commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456834812



##########
File path: tests/python/unittest/test_auto_scheduler_measure.py
##########
@@ -18,17 +18,55 @@
 """ Test measurement and log serialization. """
 
 import tvm
-from tvm import auto_scheduler
+import topi
+from tvm import te, auto_scheduler
 import tempfile
 
 from test_auto_scheduler_common import get_tiled_matmul
 
 
 def test_record():
-    dag, s = get_tiled_matmul()
-
     if not tvm.runtime.enabled("llvm"):
         return
+
+    A = te.placeholder((512, 512), name='A')
+    B = te.placeholder((512, 512), name='B')
+    k = te.reduce_axis((0, 512), name='k')
+    C = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * B[k][j], axis=[k]), name='C')
+    D = topi.nn.relu(C)
+    k = te.reduce_axis((0, 512), name='k')
+    E = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * D[k][j], axis=[k]), name='C')
+    F = topi.nn.relu(E)
+
+    dag = auto_scheduler.ComputeDAG([A, B, F])
+    s = dag.get_init_state()
+
+    # Split
+    its0 = s.split(C, s[C].iters[0], [4, 8, 8])
+    its1 = s.split(C, s[C].iters[4], [8, 4, 4])
+    # Reorder
+    s.reorder(C, [its0[0], its1[0], its0[1], its1[1], its0[2], its1[2], its0[3], s[C].iters[8],
+                  its1[3]])
+    # Fuse
+    s.fuse(C, [s[C].iters[0], s[C].iters[1], s[C].iters[2]])
+    # Compute at
+    s.split(F, s[F].iters[0], [2])
+    s.compute_at(E, F, s[F].iters[0])
+    # Compute inline
+    s.compute_inline(D)
+    # Compute root
+    s.compute_root(D)
+    # Parallel
+    s.parallel(C, s[C].iters[0])
+    # Thread bind
+    s.bind(C, s[C].iters[1], "blockIdx.x")

Review comment:
       What are we trying to test here? I didn't see a corresponding assertion.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457831406



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -270,19 +270,9 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
   }
 
   // Apply the history steps to TVM schedule
+  // Call each step's ApplyToSchedule method
   for (const auto& step : transform_steps) {
-    // Call each step's ApplyToSchedule method
-    // Note: some steps have extra parameters that must be passed and they may need different
-    // return value, so the ApplyToSchedule is not able to be merged to single interface
-    if (auto ps = step.as<ReorderStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      ps->ApplyToSchedule(stages, stage_to_axes);
-    } else {
-      LOG(FATAL) << "Invalid Step";
-    }
+    StepApplyToSchedule(step, stages, stage_to_axes);

Review comment:
       This corresponds to each step's ApplyToSchedule, so I just named it like this...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456321361



##########
File path: src/auto_scheduler/utils.h
##########
@@ -89,6 +89,15 @@ inline int GetIndex(const Array<T>& array, const T& to_locate) {
   return -1;
 }
 
+/*! \brief Delete the item in a std::vector. */
+template <typename T>
+inline void DeleteItem(std::vector<T>* array, const T& to_delete) {

Review comment:
       Updated.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456552575



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \

Review comment:
       I have no idea either

##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -119,9 +137,61 @@ def reorder(self, stage, order):
         order : List[Iterator]
             Iterators in the expected order.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object = _ffi_api.StateReorder(self.state_object, self._resolve_stage_id(stage),
+                                                  order)
+
+    def compute_at(self, stage, target_stage, target_iter):
+        """ Schedule primitive corresponds to te.compute_at.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be compute at, can be a Stage order index, Stage operation or stage
+            output tensor.
+        target_stage : Union[int, Operation, Tensor]
+            The target stage of compute_at, can be a Stage order index, Stage operation or stage
+            output tensor.
+        target_iter : Iterator
+            The target Iterator of compute_at.
+
+        Notes
+        -----
+        After compute_at, the extent of each iterator may not be accurate any more, so the bound
+        information will be removed from this state. Run ComputeDAG::InferBound to recover.

Review comment:
       ```suggestion
           After compute_at, we need careful dependency analysis to compute the accurate bound information.
           However, it is relatively expensive and complicated. So in LoopState, we just fill "None" as bound 
           for the newly created iterators. If you do need the bound, you can call ComputeDAG::InferBound on
           the returned state to get all bound information.
   ```
   
   Please propagate this change.

##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +235,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+
+        Notes
+        -----
+        If the iterators to be fused have stages attached at them (by compute_at), the fused
+        result will become the new attach point.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage

Review comment:
       ```suggestion
               The Stage to be vectorized, which can be specified by 
               the integer index, Operation, or output tensor of the stage.
   ```
   Propagate this modification to other comments as well.
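
   A quick sketch of the three accepted forms for the `stage` argument:

   ```python
   from tvm import te, auto_scheduler

   A = te.placeholder((16,), name="A")
   B = te.compute((16,), lambda i: A[i] + 1, name="B")
   dag = auto_scheduler.ComputeDAG([A, B])
   s = dag.get_init_state()
   it = s[B].iters[0]
   s.vectorize(1, it)  # 1 = the stage order index of B (A is stage 0)
   # equivalently: s.vectorize(B.op, it)  or  s.vectorize(B, it)
   ```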

##########
File path: tests/python/unittest/test_auto_scheduler_measure.py
##########
@@ -18,17 +18,55 @@
 """ Test measurement and log serialization. """
 
 import tvm
-from tvm import auto_scheduler
+import topi
+from tvm import te, auto_scheduler
 import tempfile
 
 from test_auto_scheduler_common import get_tiled_matmul
 
 
 def test_record():
-    dag, s = get_tiled_matmul()
-
     if not tvm.runtime.enabled("llvm"):
         return
+
+    A = te.placeholder((512, 512), name='A')
+    B = te.placeholder((512, 512), name='B')
+    k = te.reduce_axis((0, 512), name='k')
+    C = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * B[k][j], axis=[k]), name='C')
+    D = topi.nn.relu(C)
+    k = te.reduce_axis((0, 512), name='k')
+    E = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * D[k][j], axis=[k]), name='E')
+    F = topi.nn.relu(E)
+
+    dag = auto_scheduler.ComputeDAG([A, B, F])
+    s = dag.get_init_state()
+
+    # Split
+    its0 = s.split(C, s[C].iters[0], [4, 8, 8])
+    its1 = s.split(C, s[C].iters[4], [8, 4, 4])
+    # Reorder
+    s.reorder(C, [its0[0], its1[0], its0[1], its1[1], its0[2], its1[2], its0[3], s[C].iters[8],
+                  its1[3]])
+    # Fuse
+    s.fuse(C, [s[C].iters[0], s[C].iters[1], s[C].iters[2]])
+    # Compute at
+    s.split(F, s[F].iters[0], [2])
+    s.compute_at(E, F, s[F].iters[0])
+    # Compute inline
+    s.compute_inline(D)
+    # Compute root
+    s.compute_root(D)
+    # Parallel
+    s.parallel(C, s[C].iters[0])
+    # Thread bind
+    s.bind(C, s[C].iters[1], "blockIdx.x")

Review comment:
       Using thread binding on an llvm target is confusing to me, although the check is correct.

##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -276,10 +276,18 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
     // return value, so the ApplyToSchedule is not able to be merged to single interface
     if (auto ps = step.as<ReorderStepNode>()) {
       ps->ApplyToSchedule(stages, stage_to_axes);
+    } else if (auto ps = step.as<ComputeAtStepNode>()) {

Review comment:
       Why do you use a different order from the order in our internal repo?
   I propose we use one of the following two orders:
   1. Alphabetical order
   2. Logical order from easy to hard:
      - primitives working on one stage: reorder, split, fuse, annotation
      - primitives working on multiple stages: compute_at, compute_root, compute_inline
      - primitives adding new stages: cache_read, cache_write, ...
   
   And we should keep the same order in all places.

##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -169,6 +206,18 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
           fused_ids.push_back(i);
         }
         data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
+      } else if (name == "AN") {

Review comment:
       I agree with your points.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456549272



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -276,10 +276,18 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
     // return value, so the ApplyToSchedule is not able to be merged to single interface
     if (auto ps = step.as<ReorderStepNode>()) {
       ps->ApplyToSchedule(stages, stage_to_axes);
+    } else if (auto ps = step.as<ComputeAtStepNode>()) {

Review comment:
       Why do you use a different order from the order in our internal repo?
   I propose we use one of the following two orders:
   1. Alphabetical order
   2. Logical order from easy to hard:
      - primitives working on one stage: reorder, split, fuse, annotation
      - primitives working on multiple stages: compute_at, compute_root, compute_inline
      - primitives adding new stages: cache_read, cache_write, ...
   
   And we should keep the same order in all places.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456811108



##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -169,6 +206,18 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
           fused_ids.push_back(i);
         }
         data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
+      } else if (name == "AN") {

Review comment:
       I'm working on this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] junrushao1994 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456194801



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
       Sure it makes sense :-)







[GitHub] [incubator-tvm] jcf94 commented on pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-660786809


   Moved all the implementations to transform_step.h & transform_step.cc, and updated the order of steps.
   @merrymercy @junrushao1994 @comaniac @FrozenGene 





[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457784703



##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -82,98 +63,23 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
   inline static void Write(dmlc::JSONWriter* writer,
                            const ::tvm::Array<::tvm::auto_scheduler::Step>& data) {
     writer->BeginArray(false);
-    for (size_t i = 0; i < data.size(); ++i) {
+    for (const auto& step : data) {
       writer->WriteArraySeperator();
       writer->BeginArray(false);
-      if (auto ps = data[i].as<::tvm::auto_scheduler::ReorderStepNode>()) {
-        writer->WriteArrayItem(std::string("RE"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(IntArrayToVector(ps->after_ids));
-      } else if (auto ps = data[i].as<::tvm::auto_scheduler::SplitStepNode>()) {
-        writer->WriteArrayItem(std::string("SP"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(ps->iter_id);
-        writer->WriteArrayItem(ps->extent ? ::tvm::auto_scheduler::GetIntImm(ps->extent.value())
-                                          : 0);
-        writer->WriteArrayItem(IntArrayToVector(ps->lengths));
-        writer->WriteArrayItem(static_cast<int>(ps->inner_to_outer));
-      } else if (auto ps = data[i].as<::tvm::auto_scheduler::FuseStepNode>()) {
-        writer->WriteArrayItem(std::string("FU"));
-        writer->WriteArrayItem(ps->stage_id);
-        writer->WriteArrayItem(IntArrayToVector(ps->fused_ids));
-      } else {
-        LOG(FATAL) << "Invalid step: " << data[i];
-      }
+      step->WriteToRecord(writer);
       writer->EndArray();
     }
     writer->EndArray();
   }
 
   inline static void Read(dmlc::JSONReader* reader,
                           ::tvm::Array<::tvm::auto_scheduler::Step>* data) {
-    std::vector<int> int_list;
-    bool s, inner_to_outer;
-    std::string name, scope_name, pragma_type, ti_func_name;
-    int stage_id, iter_id, extent;
-
     reader->BeginArray();
     data->clear();
     while (reader->NextArrayItem()) {
       reader->BeginArray();
-      s = reader->NextArrayItem();
-      CHECK(s);
-      reader->Read(&name);
-      if (name == "RE") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        ::tvm::Array<::tvm::Integer> after_ids;
-        for (const auto& i : int_list) {
-          after_ids.push_back(i);
-        }
-        data->push_back(::tvm::auto_scheduler::ReorderStep(stage_id, after_ids));
-      } else if (name == "SP") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&iter_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&extent);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&inner_to_outer);
-        ::tvm::Array<::tvm::Optional<::tvm::Integer>> lengths;
-        for (const auto& i : int_list) {
-          lengths.push_back(::tvm::Integer(i));
-        }
-        data->push_back(::tvm::auto_scheduler::SplitStep(
-            stage_id, iter_id, extent == 0 ? ::tvm::PrimExpr() : extent, lengths, inner_to_outer));
-      } else if (name == "FU") {
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&stage_id);
-        s = reader->NextArrayItem();
-        CHECK(s);
-        reader->Read(&int_list);
-        ::tvm::Array<::tvm::Integer> fused_ids;
-        for (const auto& i : int_list) {
-          fused_ids.push_back(i);
-        }
-        data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
-      } else {
-        LOG(FATAL) << "Invalid step format";
-      }
-      s = reader->NextArrayItem();
-      CHECK(!s);
+      data->push_back(::tvm::auto_scheduler::StepReadFromRecord(reader));
+      CHECK(!reader->NextArrayItem());

Review comment:
       Ok, I think I'd better add the bool value s back here...
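
   For reference, a minimal sketch of the restored pattern with a stand-in reader type (dmlc::JSONReader is the real one); the point is that the side-effecting read lives in a named variable, consistent with the other NextArrayItem() calls in this file, and the assertion only inspects it:

   ```cpp
   #include <cassert>

   // Stand-in for dmlc::JSONReader, only to keep the sketch self-contained.
   struct Reader {
     int remaining = 0;
     bool NextArrayItem() { return remaining-- > 0; }  // consumes one item
   };

   void FinishRecord(Reader* reader) {
     bool s = reader->NextArrayItem();  // the read happens unconditionally
     assert(!s && "expected the end of the step record array");
     (void)s;  // avoid an unused-variable warning if assertions are disabled
   }
   ```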







[GitHub] [incubator-tvm] junrushao1994 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456257744



##########
File path: src/auto_scheduler/loop_state.h
##########
@@ -217,6 +196,68 @@ class Stage : public ObjectRef {
   TVM_DEFINE_OBJECT_REF_COW_METHOD(StageNode);
 };
 
+/*! \brief Use stage_id to represent a stage. */
+using StageKey = int;
+/*! \brief Use stage_id and iter_id to represent a iterator. */
+using IterKey = std::pair<int, int>;
+
+/*!
+ * \brief stores the compute_at relation between stages
+ * This stores a bi-directional mapping from stages and iter:
+ * 1. Stage to its attached iterator
+ * 2. Iterator to the stage attached to it
+ * You can use AttachMapNode::stage_to_attach_iter and AttachMapNode::iter_to_attached_stages
+ * to query the relations
+ */
+class AttachMapNode : public Object {
+ public:
+  /*! \brief A Map to store the mapping of stage to its attached iterator. */
+  std::unordered_map<StageKey, IterKey> stage_to_attach_iter;
+  /*! \brief A Map to store the mapping of iterator to the stage attached to it. */
+  std::unordered_map<IterKey, std::vector<StageKey>> iter_to_attached_stages;
+
+  static constexpr const char* _type_key = "auto_scheduler.AttachMap";
+  TVM_DECLARE_FINAL_OBJECT_INFO(AttachMapNode, Object);
+};
+
+/*!
+ * \brief Managed reference to AttachMapNode.
+ * \sa AttachMapNode
+ */
+class AttachMap : public ObjectRef {
+ public:
+  /*!
+   * \brief Process the stage/iterator mapping after compute at.
+   * \param stage_id The index of the stage to be compute at.
+   * \param target_stage_id The index of stage that this step will compute at to.
+   * \param target_iter_id The index of iterator in target stage that this step will compute at to.
+   */
+  void SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id);
+  /*!
+   * \brief This is a public wrapper of `DeleteStageEntry`. To delete the entry of a specific stage.
+   * \param stage_id The index of the stage to be compute at.
+   */
+  void DeleteStage(int stage_id);
+  /*!
+   * \brief Update the iterator relations in AttachMap.

Review comment:
       I agree with Cody... I tried to understand the doc for a while but didn't succeed without looking at the source code. It would be great to have clearer docs and method names :-)
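
   To make the bi-directional invariant concrete, a self-contained toy of the two maps and how a compute_at update keeps them in sync (std::map stands in for the real hash map only to avoid a custom pair hash; method names follow the PR):

   ```cpp
   #include <algorithm>
   #include <map>
   #include <unordered_map>
   #include <utility>
   #include <vector>

   using StageKey = int;
   using IterKey = std::pair<int, int>;  // (stage_id, iter_id)

   struct AttachMapToy {
     std::unordered_map<StageKey, IterKey> stage_to_attach_iter;
     std::map<IterKey, std::vector<StageKey>> iter_to_attached_stages;

     // compute_at: stage `stage_id` attaches to iterator (target_stage_id,
     // target_iter_id); both directions must be updated together.
     void SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
       DeleteStageEntry(stage_id);  // drop any previous attachment first
       IterKey key(target_stage_id, target_iter_id);
       stage_to_attach_iter[stage_id] = key;
       iter_to_attached_stages[key].push_back(stage_id);
     }

     // compute_root / stage deletion: erase the stage from both directions.
     void DeleteStageEntry(int stage_id) {
       auto it = stage_to_attach_iter.find(stage_id);
       if (it == stage_to_attach_iter.end()) return;
       auto& attached = iter_to_attached_stages[it->second];
       attached.erase(std::remove(attached.begin(), attached.end(), stage_id),
                      attached.end());
       stage_to_attach_iter.erase(it);
     }
   };
   ```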

##########
File path: src/auto_scheduler/utils.h
##########
@@ -89,6 +89,15 @@ inline int GetIndex(const Array<T>& array, const T& to_locate) {
   return -1;
 }
 
+/*! \brief Delete the item in a std::vector. */
+template <typename T>
+inline void DeleteItem(std::vector<T>* array, const T& to_delete) {

Review comment:
       Maybe `FindAndDeleteItem` sounds more informative?
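
   A sketch of the helper under the proposed name, using standard algorithms; treating a missing item as an error is an assumption here, not something the PR states:

   ```cpp
   #include <algorithm>
   #include <cassert>
   #include <vector>

   template <typename T>
   inline void FindAndDeleteItem(std::vector<T>* array, const T& to_delete) {
     auto it = std::find(array->begin(), array->end(), to_delete);
     assert(it != array->end() && "item to delete was not found");
     array->erase(it);
   }
   ```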

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {

Review comment:
       Shall we move the check to the beginning of this function before CoW?
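
   A minimal sketch of the suggestion, assuming the check in question compares the two vectors' sizes; validating before CopyOnWrite() means a failing precondition never triggers (or wastes) a copy:

   ```cpp
   #include <cassert>
   #include <utility>
   #include <vector>

   using IterKey = std::pair<int, int>;

   struct AttachMapNodeToy {};  // stand-in for the real copy-on-write node

   struct AttachMapToy {
     AttachMapNodeToy node;
     AttachMapNodeToy* CopyOnWrite() { return &node; }  // stand-in for CoW

     void UpdateIters(const std::vector<IterKey>& old_iters,
                      const std::vector<IterKey>& new_iters) {
       // Validate first; the copy is only made once the inputs are sane.
       assert(old_iters.size() == new_iters.size());
       AttachMapNodeToy* pnode = CopyOnWrite();
       // ... rewrite the iter_to_attached_stages entries through pnode ...
       (void)pnode;
     }
   };
   ```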







[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456174561



##########
File path: src/auto_scheduler/measure_record.cc
##########
@@ -169,6 +206,18 @@ struct Handler<::tvm::Array<::tvm::auto_scheduler::Step>> {
           fused_ids.push_back(i);
         }
         data->push_back(::tvm::auto_scheduler::FuseStep(stage_id, fused_ids));
+      } else if (name == "AN") {

Review comment:
       Good point.
   @merrymercy Currently, to add a new transform step, we have to modify 7 files ... Maybe it would be better to move all the specific implementations to transform_step.h/cc, and leave the other files to just expose some interfaces.
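
   A rough sketch of that direction, matching the interface that later landed in this PR (a WriteToRecord method on each step node plus a single StepReadFromRecord factory); the JSON types are forward-declared stand-ins for dmlc's:

   ```cpp
   #include <memory>

   struct JSONWriter;  // stand-ins for dmlc::JSONWriter / dmlc::JSONReader
   struct JSONReader;

   // Every concrete step serializes itself, so measure_record.cc no longer
   // switches on step types.
   struct StepNode {
     int stage_id;
     virtual ~StepNode() = default;
     virtual void WriteToRecord(JSONWriter* writer) const = 0;
   };

   // One factory in transform_step.cc maps record prefixes ("RE", "SP", "FU",
   // "AN", ...) back to concrete steps. The smart pointer is a stand-in for
   // TVM's reference types.
   std::unique_ptr<StepNode> StepReadFromRecord(JSONReader* reader);
   ```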







[GitHub] [incubator-tvm] merrymercy commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
merrymercy commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456228059



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
       We should add them, because other policies might use them.
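
   Going by the trans_table mapping quoted later in this thread (vthread=4 ... threadIdx.y=8), a hypothetical sketch of what the z additions could look like on the enum side (the actual names and values in transform_step.h may differ):

   ```cpp
   // Values 4-8 mirror the Python trans_table from this PR; 9 and 10 are the
   // proposed blockIdx.z / threadIdx.z additions (hypothetical).
   enum class IteratorAnnotationToy : int {
     kVThread = 4,
     kBlockX = 5,
     kThreadX = 6,
     kBlockY = 7,
     kThreadY = 8,
     kBlockZ = 9,
     kThreadZ = 10,
   };
   ```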







[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456770051



##########
File path: tests/python/unittest/test_auto_scheduler_measure.py
##########
@@ -18,17 +18,55 @@
 """ Test measurement and log serialization. """
 
 import tvm
-from tvm import auto_scheduler
+import topi
+from tvm import te, auto_scheduler
 import tempfile
 
 from test_auto_scheduler_common import get_tiled_matmul
 
 
 def test_record():
-    dag, s = get_tiled_matmul()
-
     if not tvm.runtime.enabled("llvm"):
         return
+
+    A = te.placeholder((512, 512), name='A')
+    B = te.placeholder((512, 512), name='B')
+    k = te.reduce_axis((0, 512), name='k')
+    C = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * B[k][j], axis=[k]), name='C')
+    D = topi.nn.relu(C)
+    k = te.reduce_axis((0, 512), name='k')
+    E = te.compute((512, 512), lambda i, j: te.sum(A[i][k] * D[k][j], axis=[k]), name='C')
+    F = topi.nn.relu(E)
+
+    dag = auto_scheduler.ComputeDAG([A, B, F])
+    s = dag.get_init_state()
+
+    # Split
+    its0 = s.split(C, s[C].iters[0], [4, 8, 8])
+    its1 = s.split(C, s[C].iters[4], [8, 4, 4])
+    # Reorder
+    s.reorder(C, [its0[0], its1[0], its0[1], its1[1], its0[2], its1[2], its0[3], s[C].iters[8],
+                  its1[3]])
+    # Fuse
+    s.fuse(C, [s[C].iters[0], s[C].iters[1], s[C].iters[2]])
+    # Compute at
+    s.split(F, s[F].iters[0], [2])
+    s.compute_at(E, F, s[F].iters[0])
+    # Compute inline
+    s.compute_inline(D)
+    # Compute root
+    s.compute_root(D)
+    # Parallel
+    s.parallel(C, s[C].iters[0])
+    # Thread bind
+    s.bind(C, s[C].iters[1], "blockIdx.x")

Review comment:
       Maybe add a comment here saying this is just for testing? ...







[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456769902



##########
File path: src/auto_scheduler/compute_dag.cc
##########
@@ -276,10 +276,18 @@ std::pair<te::Schedule, Array<te::Tensor>> ComputeDAG::ApplySteps(
     // return value, so the ApplyToSchedule is not able to be merged to single interface
     if (auto ps = step.as<ReorderStepNode>()) {
       ps->ApplyToSchedule(stages, stage_to_axes);
+    } else if (auto ps = step.as<ComputeAtStepNode>()) {

Review comment:
       Ok, in the .h & .py I was trying to place the functions with a similar return format together:
   
   void reorder
   void compute_at
   void compute_root
   void compute_inline
   Array<Iterator> split
   Iterator fuse
   Iterator vectorize
   Iterator parallel
   Iterator unroll
   Iterator bind
   
   It's fine to change them to the order you suggested; a header-style sketch of that ordering follows below.
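
   For comparison, a header-style sketch of the same methods in the logical order suggested above; argument lists are elided and `Iterator` is a stand-in compared to the real loop_state.h declarations:

   ```cpp
   #include <vector>

   struct Iterator {};  // stand-in for auto_scheduler::Iterator

   class StateSketch {
    public:
     // Primitives working on one stage.
     void reorder();
     std::vector<Iterator> split();
     Iterator fuse();
     Iterator vectorize();
     Iterator parallel();
     Iterator unroll();
     Iterator bind();
     // Primitives working on multiple stages.
     void compute_at();
     void compute_root();
     void compute_inline();
   };
   ```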







[GitHub] [incubator-tvm] comaniac commented on pull request #6073: [Ansor][AutoTVM v2.0] Part 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
comaniac commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-659756713


   > Just noticed that #6077 is also titled as "part 1", shall we make one of them part 2 instead?
   
   Part 1 will have several PRs. "Phase 1" might be a better name to describe this process, but anyways.





[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456319452



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {

Review comment:
       Moved this check to the front.







[GitHub] [incubator-tvm] junrushao1994 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456230063



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
       I agree with @merrymercy that it is better to add them back







[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456249617



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {

Review comment:
       We already have a vector size check in this function. :) 







[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456188551



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. Iterator with extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be binded, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be binded.
+        thread_name : str
+            The thread type to be binded. Currently support:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y

Review comment:
       We could add one brief comment on why we don't need `blockIdx.z` / `threadIdx.z`, which was mentioned in the last PR by @junrushao1994. Does it make sense to you?







[GitHub] [incubator-tvm] comaniac commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456255907



##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -90,12 +90,69 @@ Stage::Stage(te::Operation op, StageKind op_type, const Array<Iterator>& iters,
   data_ = std::move(node);
 }
 
+/********** AttachMap **********/
+void AttachMap::SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+
+  // Delete the current entry of this stage
+  DeleteStageEntry(pnode, stage_id);
+
+  // Store the new relations to map
+  IterKey iter_key(target_stage_id, target_iter_id);
+  pnode->stage_to_attach_iter[stage_id] = iter_key;
+  pnode->iter_to_attached_stages[iter_key].push_back(stage_id);
+}
+
+void AttachMap::DeleteStage(int stage_id) {
+  AttachMapNode* pnode = CopyOnWrite();
+  // Delete the original stage entry
+  DeleteStageEntry(pnode, stage_id);
+}
+
+void AttachMap::UpdateIters(const std::vector<IterKey>& old_iters,
+                            const std::vector<IterKey>& new_iters) {

Review comment:
       Ah I found it. Sorry for missing that line.







[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456319021



##########
File path: src/auto_scheduler/loop_state.h
##########
@@ -217,6 +196,68 @@ class Stage : public ObjectRef {
   TVM_DEFINE_OBJECT_REF_COW_METHOD(StageNode);
 };
 
+/*! \brief Use stage_id to represent a stage. */
+using StageKey = int;
+/*! \brief Use stage_id and iter_id to represent a iterator. */
+using IterKey = std::pair<int, int>;
+
+/*!
+ * \brief stores the compute_at relation between stages
+ * This stores a bi-directional mapping from stages and iter:
+ * 1. Stage to its attached iterator
+ * 2. Iterator to the stage attached to it
+ * You can use AttachMapNode::stage_to_attach_iter and AttachMapNode::iter_to_attached_stages
+ * to query the relations
+ */
+class AttachMapNode : public Object {
+ public:
+  /*! \brief A Map to store the mapping of stage to its attached iterator. */
+  std::unordered_map<StageKey, IterKey> stage_to_attach_iter;
+  /*! \brief A Map to store the mapping of iterator to the stage attached to it. */
+  std::unordered_map<IterKey, std::vector<StageKey>> iter_to_attached_stages;
+
+  static constexpr const char* _type_key = "auto_scheduler.AttachMap";
+  TVM_DECLARE_FINAL_OBJECT_INFO(AttachMapNode, Object);
+};
+
+/*!
+ * \brief Managed reference to AttachMapNode.
+ * \sa AttachMapNode
+ */
+class AttachMap : public ObjectRef {
+ public:
+  /*!
+   * \brief Process the stage/iterator mapping after compute at.
+   * \param stage_id The index of the stage to be compute at.
+   * \param target_stage_id The index of stage that this step will compute at to.
+   * \param target_iter_id The index of iterator in target stage that this step will compute at to.
+   */
+  void SetComputeAtIter(int stage_id, int target_stage_id, int target_iter_id);
+  /*!
+   * \brief This is a public wrapper of `DeleteStageEntry`. To delete the entry of a specific stage.
+   * \param stage_id The index of the stage to be compute at.
+   */
+  void DeleteStage(int stage_id);
+  /*!
+   * \brief Update the iterator relations in AttachMap.

Review comment:
       Doc updated.







[GitHub] [incubator-tvm] merrymercy merged pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
merrymercy merged pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073


   





[GitHub] [incubator-tvm] junrushao1994 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456206595



##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \
+        "Placeholder: Data, Kernel, Bias, Bn_scale, Bn_offset\n" + \
+        "for i1 (0,3)\n" + \
+        "  for i2 (0,230)\n" + \
+        "    for i3 (0,230)\n" + \
+        "      pad_temp = ...\n" + \
+        "for i1 (0,64)\n" + \
+        "  for i2 (0,112)\n" + \
+        "    for nn (None)\n" + \
+        "      for ff (None)\n" + \
+        "        for yy (None)\n" + \
+        "          for xx (None)\n" + \
+        "            for rc (None)\n" + \
+        "              for ry (None)\n" + \
+        "                for rx (None)\n" + \
+        "                  compute = ...\n" + \
+        "    for i3 (0,112)\n" + \
+        "      compute = ...\n"

Review comment:
       Use """ instead?

##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))
+    s0 = dag.get_init_state()
+
+    # data, padding, kernel = 0, 1, 2
+    conv = s0.stage_ops[3]
+    # bias = 4
+    bias_add = s0.stage_ops[5]
+    # bn_scale = 6
+    bn_mul = s0.stage_ops[7]
+    # bn_offset = 8
+    bn_add = s0.stage_ops[9]
+    relu = s0.stage_ops[10]
+
+    s0.compute_inline(bn_add)
+    s0.compute_inline(bn_mul)
+    s0.compute_inline(bias_add)
+    s0.compute_at(conv, relu, s0[relu].iters[2])
+    print(s0)
+    assert str(s0) == \
+        "Placeholder: Data, Kernel, Bias, Bn_scale, Bn_offset\n" + \
+        "for i1 (0,3)\n" + \
+        "  for i2 (0,230)\n" + \
+        "    for i3 (0,230)\n" + \
+        "      pad_temp = ...\n" + \
+        "for i1 (0,64)\n" + \
+        "  for i2 (0,112)\n" + \
+        "    for nn (None)\n" + \
+        "      for ff (None)\n" + \
+        "        for yy (None)\n" + \
+        "          for xx (None)\n" + \
+        "            for rc (None)\n" + \
+        "              for ry (None)\n" + \
+        "                for rx (None)\n" + \
+        "                  compute = ...\n" + \
+        "    for i3 (0,112)\n" + \
+        "      compute = ...\n"
+
+    s0.compute_root(conv)
+    s0.compute_root(bn_mul)
+    assert str(s0) == \
+        "Placeholder: Data, Kernel, Bias, Bn_scale, Bn_offset\n" + \
+        "for i1 (0,3)\n" + \
+        "  for i2 (0,230)\n" + \
+        "    for i3 (0,230)\n" + \
+        "      pad_temp = ...\n" + \
+        "for nn (None)\n" + \
+        "  for ff (None)\n" + \
+        "    for yy (None)\n" + \
+        "      for xx (None)\n" + \
+        "        for rc (None)\n" + \
+        "          for ry (None)\n" + \
+        "            for rx (None)\n" + \
+        "              compute = ...\n" + \
+        "for i (None)\n" + \
+        "  for j (None)\n" + \
+        "    for k (None)\n" + \
+        "      for l (None)\n" + \
+        "        Bn_mul = ...\n" + \
+        "for i1 (0,64)\n" + \
+        "  for i2 (0,112)\n" + \
+        "    for i3 (0,112)\n" + \
+        "      compute = ...\n"

Review comment:
       ditto

##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -82,6 +82,83 @@ String ReorderStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
   return ss.str();
 }
 
+/********** Compute At **********/
+ComputeAtStep::ComputeAtStep(int stage_id, int target_stage_id, int target_iter_id) {
+  auto node = make_object<ComputeAtStepNode>();
+  node->stage_id = stage_id;
+  node->target_stage_id = target_stage_id;
+  node->target_iter_id = target_iter_id;
+  data_ = std::move(node);
+}
+
+void ComputeAtStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                        StageToAxesMap* stage_to_axes) const {
+  te::Stage stage = (*stages)[stage_id];
+  const IterVar& target_axis = (*stage_to_axes)[(*stages)[target_stage_id]][target_iter_id];
+  stage.compute_at((*stages)[target_stage_id], target_axis);
+
+  stages->Set(stage_id, std::move(stage));
+}
+
+String ComputeAtStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
+                                           StageToAxesMap* stage_to_axes) const {
+  std::stringstream ss;
+  const auto& stage = (*stages)[stage_id];
+  const auto& target_stage = (*stages)[target_stage_id];
+  ss << "s[" << CleanName(stage->op->name) << "].compute_at(s[" << CleanName(target_stage->op->name)
+     << "], " << CleanName((*stage_to_axes)[target_stage][target_iter_id]->var->name_hint) << ")\n";
+  ApplyToSchedule(stages, stage_to_axes);
+  return ss.str();
+}
+
+/********** Compute Root **********/
+ComputeRootStep::ComputeRootStep(int stage_id) {
+  auto node = make_object<ComputeRootStepNode>();
+  node->stage_id = stage_id;
+  data_ = std::move(node);
+}
+
+void ComputeRootStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                          StageToAxesMap* stage_to_axes) const {
+  auto stage = (*stages)[stage_id];
+  stage.compute_root();
+
+  stages->Set(stage_id, std::move(stage));
+}
+
+String ComputeRootStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
+                                             StageToAxesMap* stage_to_axes) const {
+  std::stringstream ss;
+  const auto& stage = (*stages)[stage_id];
+  ss << "s[" << CleanName(stage->op->name) << "].compute_root()\n";
+  ApplyToSchedule(stages, stage_to_axes);
+  return ss.str();
+}
+
+/********** Compute Inline **********/
+ComputeInlineStep::ComputeInlineStep(int stage_id) {
+  auto node = make_object<ComputeInlineStepNode>();
+  node->stage_id = stage_id;
+  data_ = std::move(node);
+}
+
+void ComputeInlineStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                            StageToAxesMap* stage_to_axes) const {
+  auto stage = (*stages)[stage_id];
+  stage.compute_inline();
+

Review comment:
       ditto

##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -82,6 +82,83 @@ String ReorderStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
   return ss.str();
 }
 
+/********** Compute At **********/
+ComputeAtStep::ComputeAtStep(int stage_id, int target_stage_id, int target_iter_id) {
+  auto node = make_object<ComputeAtStepNode>();
+  node->stage_id = stage_id;
+  node->target_stage_id = target_stage_id;
+  node->target_iter_id = target_iter_id;
+  data_ = std::move(node);
+}
+
+void ComputeAtStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                        StageToAxesMap* stage_to_axes) const {
+  te::Stage stage = (*stages)[stage_id];
+  const IterVar& target_axis = (*stage_to_axes)[(*stages)[target_stage_id]][target_iter_id];
+  stage.compute_at((*stages)[target_stage_id], target_axis);
+
+  stages->Set(stage_id, std::move(stage));
+}
+
+String ComputeAtStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
+                                           StageToAxesMap* stage_to_axes) const {
+  std::stringstream ss;
+  const auto& stage = (*stages)[stage_id];
+  const auto& target_stage = (*stages)[target_stage_id];
+  ss << "s[" << CleanName(stage->op->name) << "].compute_at(s[" << CleanName(target_stage->op->name)
+     << "], " << CleanName((*stage_to_axes)[target_stage][target_iter_id]->var->name_hint) << ")\n";
+  ApplyToSchedule(stages, stage_to_axes);
+  return ss.str();
+}
+
+/********** Compute Root **********/
+ComputeRootStep::ComputeRootStep(int stage_id) {
+  auto node = make_object<ComputeRootStepNode>();
+  node->stage_id = stage_id;
+  data_ = std::move(node);
+}
+
+void ComputeRootStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                          StageToAxesMap* stage_to_axes) const {
+  auto stage = (*stages)[stage_id];
+  stage.compute_root();
+

Review comment:
       nit: remove this blank line.

##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -82,6 +82,83 @@ String ReorderStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
   return ss.str();
 }
 
+/********** Compute At **********/
+ComputeAtStep::ComputeAtStep(int stage_id, int target_stage_id, int target_iter_id) {
+  auto node = make_object<ComputeAtStepNode>();
+  node->stage_id = stage_id;
+  node->target_stage_id = target_stage_id;
+  node->target_iter_id = target_iter_id;
+  data_ = std::move(node);
+}
+
+void ComputeAtStepNode::ApplyToSchedule(Array<te::Stage>* stages,
+                                        StageToAxesMap* stage_to_axes) const {
+  te::Stage stage = (*stages)[stage_id];
+  const IterVar& target_axis = (*stage_to_axes)[(*stages)[target_stage_id]][target_iter_id];
+  stage.compute_at((*stages)[target_stage_id], target_axis);

Review comment:
       nit: make it slightly more succinct
   ```suggestion
     const te::Stage& target_stage = (*stages)[target_stage_id];
     const IterVar& target_axis = (*stage_to_axes)[target_stage][target_iter_id];
     stage.compute_at(target_stage, target_axis);
   ```

##########
File path: tests/python/unittest/test_auto_scheduler_loop_state.py
##########
@@ -61,5 +61,79 @@ def test_split_fuse_reorder():
     assert s1[C].iters[4].range.extent == 8
     assert s1[C].iters[5].range.extent == 2
 
+    s1.parallel(C, j1)
+    s1.unroll(C, j2)
+    s1.vectorize(C, j3)
+    s1.bind(C, i1, "blockIdx.x")
+    s1.bind(C, i2, "vthread")
+    s1.bind(C, i3, "threadIdx.y")
+
+
+def test_compute_at_root_inline():
+    dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(1, 224, 224, 3, 64, 7, 2, 3))

Review comment:
       ```suggestion
       dag = auto_scheduler.ComputeDAG(conv2d_nchw_bn_relu(N=1, H=224, W=224, CI=3, CO=64, kernel_size=7, strides=2, padding=3))
   ```







[GitHub] [incubator-tvm] junrushao1994 commented on pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
junrushao1994 commented on pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#issuecomment-660520586


   A minor typo we didn't catch before (GitHub doesn't allow me to comment on that line because it is not related to this change): in file "loop_state.h", the documentation for class StageNode says "Similar to te::Stage in `include/schedule.h`", but the path is incorrect.





[GitHub] [incubator-tvm] jcf94 commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
jcf94 commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r456318375



##########
File path: python/tvm/auto_scheduler/loop_state.py
##########
@@ -161,16 +202,116 @@ def fuse(self, stage, iters):
             The Stage to be fused, can be a Stage order index, Stage operation or stage
             output tensor.
         iters : List[Iterator]
-            The iterators to be fused
+            The iterators to be fused.
+
+        Returns
+        -------
+        res_it : Iterator
+            The fused Iterator.
+        """
+        self.state_object, res = _ffi_api.StateFuse(self.state_object,
+                                                    self._resolve_stage_id(stage), iters)
+        return res
+
+    def vectorize(self, stage, iterator):
+        """ Schedule primitive corresponds to te.vectorize.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be vectorized, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be vectorized.
 
         Returns
         -------
         res_it : Iterator
-            The fused Iterator
+            The vectorized Iterator.
         """
-        stage_id = self._resolve_stage_id(stage)
+        self.state_object, res = _ffi_api.StateVectorize(self.state_object,
+                                                         self._resolve_stage_id(stage), iterator)
+        return res
+
+    def parallel(self, stage, iterator):
+        """ Schedule primitive corresponds to te.parallel.
 
-        self.state_object, res = _ffi_api.StateFuse(self.state_object, stage_id, iters)
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be paralleled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be paralleled.
+
+        Returns
+        -------
+        res_it : Iterator
+            The paralleled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateParallel(self.state_object,
+                                                        self._resolve_stage_id(stage), iterator)
+        return res
+
+    def unroll(self, stage, iterator, max_unroll=None):
+        """ Schedule primitive corresponds to te.unroll.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be unrolled, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be unrolled.
+        max_unroll : Optional[int]
+            The max unroll limit. An iterator with an extent larger than this limit will be skipped.
+
+        Returns
+        -------
+        res_it : Iterator
+            The unrolled Iterator.
+        """
+        self.state_object, res = _ffi_api.StateUnroll(self.state_object,
+                                                      self._resolve_stage_id(stage), iterator,
+                                                      max_unroll if max_unroll else -1)
+        return res
+
+    def bind(self, stage, iterator, thread_name):
+        """ Schedule primitive corresponds to te.bind.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The Stage to be bound, can be a Stage order index, Stage operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to be bound.
+        thread_name : str
+            The thread type to be bound. Currently supported:
+            - vthread
+            - blockIdx.x
+            - threadIdx.x
+            - blockIdx.y
+            - threadIdx.y
+
+        Returns
+        -------
+        res_it : Iterator
+            The bound Iterator.
+        """
+        trans_table = {
+            "vthread": 4,
+            "blockIdx.x": 5,
+            "threadIdx.x": 6,
+            "blockIdx.y": 7,
+            "threadIdx.y": 8,
+        }

Review comment:
       Moved this to a class static member.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] comaniac commented on a change in pull request #6073: [Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #6073:
URL: https://github.com/apache/incubator-tvm/pull/6073#discussion_r457554188



##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -51,129 +53,539 @@ const char* IteratorAnnotationString[] = {
     "tensorize"     // kTensorized = 11
 };
 
-/********** Reorder **********/
-ReorderStep::ReorderStep(int stage_id, const Array<Integer>& after_ids) {
-  auto node = make_object<ReorderStepNode>();
-  node->stage_id = stage_id;
-  for (const auto& x : after_ids) {
-    CHECK(x->IsInstance<IntImmNode>());
+Step StepReadFromRecord(dmlc::JSONReader* reader) {
+  std::string name;
+  CHECK(reader->NextArrayItem());
+  reader->Read(&name);
+  if (name == AnnotationStepNode::record_prefix_str) {
+    return AnnotationStep(reader);
+  } else if (name == FuseStepNode::record_prefix_str) {
+    return FuseStep(reader);
+  } else if (name == ReorderStepNode::record_prefix_str) {
+    return ReorderStep(reader);
+  } else if (name == SplitStepNode::record_prefix_str) {
+    return SplitStep(reader);
+  } else if (name == ComputeAtStepNode::record_prefix_str) {
+    return ComputeAtStep(reader);
+  } else if (name == ComputeInlineStepNode::record_prefix_str) {
+    return ComputeInlineStep(reader);
+  } else if (name == ComputeRootStepNode::record_prefix_str) {
+    return ComputeRootStep(reader);
+  } else {
+    LOG(FATAL) << "Invalid step format: " << name;
   }
-  node->after_ids = after_ids;
-  data_ = std::move(node);
+  return Step();
 }
 
-void ReorderStepNode::ApplyToSchedule(Array<te::Stage>* stages,
-                                      StageToAxesMap* stage_to_axes) const {
-  auto stage = (*stages)[stage_id];
-  const Array<IterVar>& axes = stage_to_axes->at(stage);
-  CHECK_EQ(after_ids.size(), axes.size());
-
-  Array<IterVar> new_axes;
-  new_axes.reserve(axes.size());
-  for (auto i : after_ids) {
-    new_axes.push_back(axes[i]);
+void StepApplyToState(const Step& step, State* state, const ComputeDAG& dag) {
+  if (auto ps = step.as<AnnotationStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<FuseStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ReorderStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<SplitStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeAtStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeInlineStepNode>()) {
+    ps->ApplyToState(state);
+  } else if (auto ps = step.as<ComputeRootStepNode>()) {
+    ps->ApplyToState(state);
+  } else {
+    LOG(FATAL) << "Invalid step: " << step;
   }
-  stage.reorder(new_axes);
-
-  stage_to_axes->Set(stage, std::move(new_axes));
-  stages->Set(stage_id, std::move(stage));
 }
 
-String ReorderStepNode::PrintAsPythonAPI(Array<te::Stage>* stages,
-                                         StageToAxesMap* stage_to_axes) const {
-  const auto& stage = (*stages)[stage_id];
-  std::stringstream ss;
-
-  ss << "s[" << CleanName(stage->op->name) << "].reorder(";
-  for (size_t i = 0; i < after_ids.size(); ++i) {
-    ss << CleanName((*stage_to_axes)[stage][after_ids[i]]->var->name_hint);
-    if (i != after_ids.size() - 1) {
-      ss << ", ";
-    }
+void StepApplyToSchedule(const Step& step, Array<te::Stage>* stages,

Review comment:
       For the three functions in this file (`StepApplyToState`, `StepApplyToSchedule`, and `StepApplyToPythonAPI`), it should be fine, and clearer, to call `step->XXX(...);` directly in `compute_dag.cc` so that we avoid the unnecessary step node dispatching (see the sketch below).
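
   In other words (a minimal sketch for illustration, not code from this PR: it assumes the per-step apply methods become virtual methods on the `StepNode` base class), the dispatch chains could collapse into direct calls:

```cpp
// Hypothetical virtual interface on the base step node. Each concrete step
// (FuseStepNode, SplitStepNode, ...) would override these, so the if/else
// chains in StepApplyToSchedule and StepApplyToPythonAPI are no longer needed.
class StepNode : public Object {
 public:
  int stage_id;
  virtual void ApplyToSchedule(Array<te::Stage>* stages,
                               StageToAxesMap* stage_to_axes) const = 0;
  virtual String PrintAsPythonAPI(Array<te::Stage>* stages,
                                  StageToAxesMap* stage_to_axes) const = 0;
};

// compute_dag.cc would then call the step directly:
//   for (const Step& step : transform_steps) {
//     step->ApplyToSchedule(&stages, &stage_to_axes);
//   }
```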

##########
File path: src/auto_scheduler/transform_step.cc
##########
@@ -51,129 +53,539 @@ const char* IteratorAnnotationString[] = {
     "tensorize"     // kTensorized = 11
 };
 
-/********** Reorder **********/
-ReorderStep::ReorderStep(int stage_id, const Array<Integer>& after_ids) {
-  auto node = make_object<ReorderStepNode>();
-  node->stage_id = stage_id;
-  for (const auto& x : after_ids) {
-    CHECK(x->IsInstance<IntImmNode>());
+Step StepReadFromRecord(dmlc::JSONReader* reader) {

Review comment:
       It would be better to have a step registration mechanism that establishes a lookup table from record prefix to the step node class, so that we can hide step node dispatching from developers entirely. The potential issues with the current implementation are that 1) people may forget to update this function when adding a new step, and 2) there is no mechanism to detect a duplicate record prefix between two steps.
   
   In addition, I feel this issue is important but I'm not sure if it's proper to address it in this PR. We may need versioning for records.
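
   As a concrete illustration of the registration idea (a sketch only; `StepRegistry` and `REGISTER_STEP` are hypothetical names, not part of this PR or of TVM):

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// A process-wide table from record prefix to a step factory.
using StepFactory = std::function<Step(dmlc::JSONReader*)>;

class StepRegistry {
 public:
  static StepRegistry* Global() {
    static StepRegistry inst;
    return &inst;
  }
  void Register(const std::string& prefix, StepFactory factory) {
    // Addresses issue 2): a duplicate record prefix fails loudly at startup.
    CHECK_EQ(table_.count(prefix), 0) << "Duplicate step record prefix: " << prefix;
    table_.emplace(prefix, std::move(factory));
  }
  Step Create(const std::string& prefix, dmlc::JSONReader* reader) const {
    auto it = table_.find(prefix);
    CHECK(it != table_.end()) << "Invalid step format: " << prefix;
    return (it->second)(reader);
  }

 private:
  std::unordered_map<std::string, StepFactory> table_;
};

// Addresses issue 1): the registration line lives next to the step definition,
// so StepReadFromRecord never needs another else-if branch.
#define REGISTER_STEP(StepType)                                              \
  static bool _step_reg_##StepType = [] {                                    \
    StepRegistry::Global()->Register(                                        \
        StepType##Node::record_prefix_str,                                   \
        [](dmlc::JSONReader* reader) -> Step { return StepType(reader); });  \
    return true;                                                             \
  }()

// StepReadFromRecord then shrinks to a single table lookup:
//   Step StepReadFromRecord(dmlc::JSONReader* reader) {
//     std::string name;
//     CHECK(reader->NextArrayItem());
//     reader->Read(&name);
//     return StepRegistry::Global()->Create(name, reader);
//   }
//
// REGISTER_STEP(FuseStep);  // one line per step, next to its definition
```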

##########
File path: src/auto_scheduler/loop_state.cc
##########
@@ -244,285 +197,73 @@ Iterator State::unroll(int stage_id, const Iterator& it, int max_unroll) {
   AnnotationStep step =
       AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kUnroll);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
+  return step->ApplyToState(this);
 }
 
-Iterator State::bind(int stage_id, const Iterator& it, IteratorAnnotation thread_type) {
+Iterator State::vectorize(int stage_id, const Iterator& it) {
   const Stage& stage = operator->()->stages[stage_id];
-  if (thread_type < IteratorAnnotation::kVThread || thread_type > IteratorAnnotation::kThreadZ) {
-    LOG(FATAL) << "thread_type error, valid: kVThread, kBlockX, kBlockY, "
-               << "kThreadX, kThreadY, kBlockZ, kThreadZ";
-  }
-  AnnotationStep step = AnnotationStep(stage_id, GetIndex(stage->iters, it), thread_type);
+  AnnotationStep step =
+      AnnotationStep(stage_id, GetIndex(stage->iters, it), IteratorAnnotation::kVectorize);
   CopyOnWrite()->transform_steps.push_back(step);
-  return DoAnnotationStep(step);
-}
-
-/********** Step implementations for state **********/
-void State::DoReorderStep(const ReorderStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Array<Iterator> iters;
-  for (auto x : step->after_ids) {
-    iters.push_back(stage->iters[x]);
-  }
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id,
-                     Stage(stage->op, stage->op_type, iters, stage->compute_at, stage->attrs));
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeAtStep(const ComputeAtStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute at
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kIter, stage->attrs));
-  // Update attach map
-  pstate->attach_map.SetComputeAtIter(step->stage_id, step->target_stage_id, step->target_iter_id);
-}
-
-void State::DoComputeRootStep(const ComputeRootStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Remove the bound information of each iterator since they may not be accurate after
-  // compute root
-  Array<Iterator> new_iters;
-  for (const Iterator& it : stage->iters) {
-    new_iters.push_back(Iterator(it->name, Range(), it->iter_kind, it->annotation));
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(step->stage_id, Stage(stage->op, stage->op_type, std::move(new_iters),
-                                           ComputeAtKind::kRoot, stage->attrs));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+Iterator State::fuse(int stage_id, const Array<Iterator>& iters) {
+  const Stage& stage = operator->()->stages[stage_id];
+  Array<Integer> indices;
+  GetIndices(stage->iters, iters, &indices);
+  FuseStep step = FuseStep(stage_id, indices);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-void State::DoComputeInlineStep(const ComputeInlineStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-
-  // Check the validity of compute_inline
-  for (size_t i = 0; i < stage->iters.size(); ++i) {
-    CHECK_EQ(operator->()->attach_map->iter_to_attached_stages.count(
-                 std::make_pair(step->stage_id, i)),
-             0)
-        << "Invalid compute_inline: There are some other stages that are attached to the "
-        << "target stage";
-  }
-
-  StateNode* pstate = CopyOnWrite();
-  auto new_stage = pstate->stages[step->stage_id];
-  new_stage.CopyOnWrite()->compute_at = ComputeAtKind::kInlined;
-  pstate->stages.Set(step->stage_id, std::move(new_stage));
-  // Update attach map
-  pstate->attach_map.DeleteStage(step->stage_id);
+void State::reorder(int stage_id, const Array<Iterator>& order) {
+  const Stage& stage = operator->()->stages[stage_id];
+  CHECK_EQ(order.size(), stage->iters.size()) << "The order of all iterators "
+                                              << "should be specified";
+  Array<Integer> after_ids;
+  GetIndices(stage->iters, order, &after_ids);
+  ReorderStep step = ReorderStep(stage_id, after_ids);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-// common part for DoSplitStep, DoFollowSplitStep, and DoFollowFusedSplitStep
-Array<Iterator> State::DoSplitStepCommon(int stage_id, int iter_id,
-                                         const Array<Optional<Integer>>& lengths,
-                                         bool inner_to_outer) {
+Array<Iterator> State::split(int stage_id, const Iterator& it,
+                             const Array<Optional<Integer>>& lengths, bool inner_to_outer) {
   const Stage& stage = operator->()->stages[stage_id];
-  const Iterator& it = stage->iters[iter_id];
-  size_t old_iter_size = stage->iters.size();
-  bool concrete = true;
-
-  Optional<PrimExpr> tosplit_min, tosplit_extent;
-  if (it->range.defined()) {
-    tosplit_min = it->range->min;
-    tosplit_extent = it->range->extent;
-  } else {
-    tosplit_min = NullOpt;
-    tosplit_extent = NullOpt;
-  }
-
-  Array<Iterator> outs;
-  for (size_t i = 0; i < lengths.size(); ++i) {
-    Optional<Integer> l;
-    String name;
-    if (inner_to_outer) {
-      l = lengths[lengths.size() - i - 1];
-      name = it->name + "." + std::to_string(lengths.size() - i);
-    } else {
-      l = lengths[i];
-      name = it->name + "." + std::to_string(i);
-    }
-    Iterator res;
-    if (l && tosplit_min && tosplit_extent) {
-      res = Iterator(name, Range::FromMinExtent(tosplit_min.value(), l.value()), it->iter_kind,
-                     IteratorAnnotation::kNone);
-      tosplit_min = Integer(0);
-      tosplit_extent = indexdiv(tosplit_extent.value() + l.value() - 1, l.value());
-    } else {
-      res = Iterator(name, Range(), it->iter_kind, IteratorAnnotation::kNone);
-      tosplit_min = NullOpt;
-      tosplit_extent = NullOpt;
-      concrete = false;
-    }
-    outs.push_back(std::move(res));
-  }
-
-  Range range;
-  if (tosplit_min && tosplit_extent) {
-    range = Range::FromMinExtent(tosplit_min.value(), tosplit_extent.value());
-  }
-  if (inner_to_outer) {
-    outs.push_back(Iterator(it->name + ".0", range, it->iter_kind, IteratorAnnotation::kNone));
-    // Reverse the Iterator array
-    Array<Iterator> temp(outs.rbegin(), outs.rend());
-    outs = std::move(temp);
-  } else {
-    outs.push_back(Iterator(it->name + "." + std::to_string(lengths.size()), range, it->iter_kind,
-                            IteratorAnnotation::kNone));
-  }
-
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(), stage->iters.begin() + iter_id);
-  new_iters.insert(new_iters.end(), outs.begin(), outs.end());
-  new_iters.insert(new_iters.end(), stage->iters.begin() + iter_id + 1, stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-  pstate->concrete &= concrete;
-
-  // Two vectors are used to represent the iterator relation before and after split
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  for (size_t i = iter_id; i < old_iter_size; ++i) {
-    from_iters.emplace_back(stage_id, i);
-    to_iters.emplace_back(stage_id, i + lengths.size());
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return outs;
+  SplitStep step =
+      SplitStep(stage_id, GetIndex(stage->iters, it),
+                it->range.defined() ? it->range->extent : PrimExpr(), lengths, inner_to_outer);
+  CopyOnWrite()->transform_steps.push_back(step);
+  return step->ApplyToState(this);
 }
 
-Array<Iterator> State::DoSplitStep(const SplitStep& step) {
-  return DoSplitStepCommon(step->stage_id, step->iter_id, step->lengths, step->inner_to_outer);
+void State::compute_at(int stage_id, int target_stage_id, const Iterator& target_iter) {
+  const Stage& target_stage = operator->()->stages[target_stage_id];
+  ComputeAtStep step =
+      ComputeAtStep(stage_id, target_stage_id, GetIndex(target_stage->iters, target_iter));
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoFuseStep(const FuseStep& step) {
-  int stage_id = step->stage_id;
-  const Stage& stage = operator->()->stages[stage_id];
-  size_t old_iter_size = static_cast<int>(stage->iters.size());
-
-  String new_name;
-  PrimExpr new_extent = 1;
-  IteratorKind new_iter_kind = IteratorKind::kSpecial;
-
-  for (size_t i = 0; i < step->fused_ids.size(); ++i) {
-    if (i > 0) {
-      CHECK_EQ(step->fused_ids[i]->value, step->fused_ids[i - 1]->value + 1);
-    }
-
-    if (i != step->fused_ids.size() - 1) {
-      const auto& iter_to_attached_stage = operator->()->attach_map->iter_to_attached_stages;
-      if (iter_to_attached_stage.find(std::make_pair(stage_id, step->fused_ids[i])) !=
-          iter_to_attached_stage.end()) {
-        LOG(FATAL) << "Invalid Fuse. Trying to fuse iterators that have been attached by some "
-                   << "stages. State before fusion:\n"
-                   << *this;
-      }
-    }
-
-    const Iterator& it = stage->iters[step->fused_ids[i]];
-    new_name = new_name + it->name + "@";
-
-    if (it->range.defined() && new_extent.defined()) {
-      new_extent = new_extent * it->range->extent;
-    } else {
-      new_extent = PrimExpr();
-    }
-
-    if (i == 0) {
-      new_iter_kind = it->iter_kind;
-    } else {
-      if (new_iter_kind != it->iter_kind) {
-        new_iter_kind = IteratorKind::kMixed;
-      }
-    }
-  }
-
-  Range range;
-  if (new_extent.defined()) {
-    range = Range::FromMinExtent(0, new_extent);
-  }
-  Iterator new_it = Iterator(new_name, range, new_iter_kind, IteratorAnnotation::kNone);
-  Array<Iterator> new_iters;
-  new_iters.insert(new_iters.end(), stage->iters.begin(),
-                   stage->iters.begin() + step->fused_ids.front());
-  new_iters.push_back(new_it);
-  new_iters.insert(new_iters.end(), stage->iters.begin() + step->fused_ids.back() + 1,
-                   stage->iters.end());
-
-  StateNode* pstate = CopyOnWrite();
-  pstate->stages.Set(stage_id,
-                     Stage(stage->op, stage->op_type, new_iters, stage->compute_at, stage->attrs));
-
-  // Two vectors are used to represent the iterator relation before and after fuse
-  // The original iterators in AttachMap will be updated with the new iterators
-  std::vector<IterKey> from_iters;
-  std::vector<IterKey> to_iters;
-  const size_t begin_id = step->fused_ids.front(), end_id = step->fused_ids.back();
-  for (size_t i = 0; i < old_iter_size; ++i) {
-    if (i <= begin_id) {
-      continue;
-    } else if (i > end_id) {
-      // move forward
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, i - end_id + begin_id);
-    } else {
-      // move to the fused id
-      from_iters.emplace_back(stage_id, i);
-      to_iters.emplace_back(stage_id, begin_id);
-    }
-  }
-  pstate->attach_map.UpdateIters(from_iters, to_iters);
-
-  return new_it;
+void State::compute_inline(int stage_id) {
+  ComputeInlineStep step = ComputeInlineStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-Iterator State::DoAnnotationStep(const AnnotationStep& step) {
-  const Stage& stage = operator->()->stages[step->stage_id];
-  Iterator it = stage->iters[step->iter_id];
-
-  CHECK(it->annotation == IteratorAnnotation::kNone);
-  Iterator new_it = Iterator(it->name, it->range, it->iter_kind, step->annotation);
-  Stage new_stage = stage;
-  new_stage.CopyOnWrite()->iters.Set(step->iter_id, new_it);
-  CopyOnWrite()->stages.Set(step->stage_id, std::move(new_stage));
-  return new_it;
+void State::compute_root(int stage_id) {
+  ComputeRootStep step = ComputeRootStep(stage_id);
+  CopyOnWrite()->transform_steps.push_back(step);
+  step->ApplyToState(this);
 }
 
-void State::DoSteps(const ComputeDAG& dag) {
+void State::ApplySteps(const ComputeDAG& dag) {
   CHECK(operator->()->stages.size()) << "Invalid State with empty operation stages.";
 
+  // Call each step's ApplyToState method
   for (const auto& step : operator->()->transform_steps) {
-    if (auto ps = step.as<ReorderStepNode>()) {
-      DoReorderStep(GetRef<ReorderStep>(ps));
-    } else if (auto ps = step.as<ComputeAtStepNode>()) {
-      DoComputeAtStep(GetRef<ComputeAtStep>(ps));
-    } else if (auto ps = step.as<ComputeRootStepNode>()) {
-      DoComputeRootStep(GetRef<ComputeRootStep>(ps));
-    } else if (auto ps = step.as<ComputeInlineStepNode>()) {
-      DoComputeInlineStep(GetRef<ComputeInlineStep>(ps));
-    } else if (auto ps = step.as<SplitStepNode>()) {
-      DoSplitStep(GetRef<SplitStep>(ps));
-    } else if (auto ps = step.as<FuseStepNode>()) {
-      DoFuseStep(GetRef<FuseStep>(ps));
-    } else if (auto ps = step.as<AnnotationStepNode>()) {
-      DoAnnotationStep(GetRef<AnnotationStep>(ps));
-    } else {
-      LOG(FATAL) << "Invalid step: " << step;
-    }
+    StepApplyToState(step, this, dag);

Review comment:
       - Looks like we can remove `StepApplyToState` and simply use `step->ApplyToState(this)` (see the sketch below)?
   - `dag` seems unused in both this function and `StepApplyToState`.
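
   For reference, with both changes the loop would reduce to something like this (a sketch; it assumes `ApplyToState` becomes a virtual method on the base `StepNode`, which would first need to reconcile the differing return types: `Iterator` for annotation/fuse, `Array<Iterator>` for split, and `void` for reorder and the compute_at family):

```cpp
// Hypothetical State::ApplySteps once StepApplyToState is removed and the
// unused `dag` parameter is dropped:
void State::ApplySteps() {
  CHECK(operator->()->stages.size()) << "Invalid State with empty operation stages.";
  for (const auto& step : operator->()->transform_steps) {
    step->ApplyToState(this);  // direct call, no free-function dispatch
  }
}
```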




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org