You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/04/18 07:43:18 UTC

[GitHub] [incubator-tvm] yongfeng-nv opened a new pull request #5367: WIP: Improve floormod (Don't merge)

yongfeng-nv opened a new pull request #5367: WIP: Improve floormod (Don't merge)
URL: https://github.com/apache/incubator-tvm/pull/5367
 
 
   Prototype to improve floormod for int_set.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r410945841
 
 

 ##########
 File path: src/arith/int_set.cc
 ##########
 @@ -730,12 +765,30 @@ Map<Var, IntSet> ConvertDomMap(
   return dmap;
 }
 
-IntSet EvalSet(PrimExpr e,
-               const Map<Var, IntSet>& dom_map) {
+IntSet EvalSet(PrimExpr e, const Map<Var, IntSet>& dom_map,
+               const std::unordered_map<IterVar, Range>& rmap) {
   Analyzer ana;
+  // Bind ana with rmap
+  for (auto entry : rmap) {
+    ana.Bind(entry.first->var, entry.second);
 
 Review comment:
   Let us try to upgrade the API away from the EvalSet, and directly use https://github.com/apache/incubator-tvm/blob/master/include/tvm/arith/analyzer.h#L366

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178952
 
 
   In particular, my comment wrt to EvalSet means we should directly use the IntSetAnalyzer member of the analyzer, that can should have part of the Var's bound info

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yongfeng-nv edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv edited a comment on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616035227
 
 
   The current behavior of IntervalSet floormod(a, b) is rough -- it returns [0, b-1], [-b+1, b-1], or everything.  It causes extra iterations to my schedule.  This is a simplified version:
   
       import tvm
       from tvm import te
       
       def test_bound_tile_mod():
           def compute(M_tiles, N_tiles, factor, dtype):
               # Algo
               M = M_tiles * factor
               N = N_tiles * factor
       
               A = tvm.te.placeholder((N, M), name='A', dtype=dtype)
               C = tvm.te.compute((N, M), lambda n, m: A[n, m], name='C')
               s = tvm.te.create_schedule(C.op)
       
               return s, A, C
       
           def schedule(s, factor, padding, A, C):
               C_local = s.cache_write(C, "local")
       
               n, m = C.op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               nio, nii = s[C].split(ni, 2)
               n = s[C].fuse(nii, mi)
               C_shared = s.cache_write(C, "shared")
               bn, bm, ni, mi = C_shared.op.axis       
               s[C_shared].storage_align(ni, factor * 2, padding)
       
               n, m = s[C].op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               s[C].set_scope("global")
               niio, niii = s[C].split(ni, 32)
               s[C_shared].compute_at(s[C], niio)
       
               return s
       
           s, A, C = compute(2, 2, 128, "float16")
           s = schedule(s, 128, 8, A, C)
           bounds = tvm.te.schedule.InferBound(s)
           print(tvm.lower(s, [A, C], simple_mode=True))
   
   It does 256x256 point-wise copying in 128x128 tiles from local to shared then to global.  I would like to allocate only a 32x128 shared memory and reuse it four times per tile.  In addition, I want to pad 8 data every two tile rows to avoid bank conflicts in shared memory with storage_align.  I expect C.shared (the local ->shared stage) does exactly 256x256 copying, but the following IR shows 4 times of that.
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 16896]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 64) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[(((((n.outer*32768) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[((((n.inner.outer*4224) + (floordiv(n.inner.inner, 2)*264)) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
               
   If there is another schedule to achieve this, please let me know.  As of now, I can only accomplish this by tiling both C.shared and C.  C's compute now involves several floormod: `C.shared(floordiv(n, 128), floordiv(m, 128), floordiv(floormod(n, 128), 2), ((floormod(floormod(n, 128), 2)*128) + floormod(m, 128)))`.  `m` and `n` further bring in leaf IterVars to floormod to PropBoundToInputs() during InferBound.  Because the best result from floormod(x, 128) is [0, 128-1], C.shared doesn't reduce domain, although it does compute_at s[C]'s n.inner.outer with range [0,3].
   
   With this PR, I am able to get expected IR -- `(n.inner.outer.c, 0, 64)` becoming `(n.inner.outer.c, 0, 16)`:
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 4224]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 16) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[((((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[(((floordiv(n.inner.inner, 2)*264) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
   
   
   The improvement has two folds:
   1.   EvalSet takes a range map as input to help calculating floormod.  For example:  `[x*4, x*4+3] % 8 = [x*4-x/2*8, x*4+3-(x*4+3)/8*8)]`, if we don't know `x`'s range; but `[x*4, x*4+3]`, if we know `x`'s range is `[0, 1]`.  This is common from tiling.
         
   2.   For `a mod b`, if b is an IntImm, single point, and greater than 0, we do the following:
   
       // a mod b = a - b * (a/b) if
       // (a) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
       // and (b) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
   so that `[13, 5] % 10 = [3, 5]`
   
   IntervalSet evaluation is hard.  These improvements are adhoc to the problems I encountered.  I am open to any alternatives.
   
   Question: how about mod?  It's implementation looks same as floormod.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178512
 
 
   It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   
   - (1) EvalSet need to be aware of the bound of Vars
      - Ideally these bounds should be populated in the analyzer via bind.
   - (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
      - The fix would be `floormod([a, b], c)` , check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
   
   
   
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r410941690
 
 

 ##########
 File path: include/tvm/te/operation.h
 ##########
 @@ -112,6 +113,7 @@ class OperationNode : public tir::FunctionBaseNode {
       const Operation& self,
       arith::Analyzer* analyzer,
       const std::unordered_map<const VarNode*, IntSet>& dom_map,
+      const std::unordered_map<IterVar, Range>& rmap,
 
 Review comment:
   Given that we already pass the analyzer around, shall we simply bind the same of each variabe to the analyzer?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616182069
 
 
   cc @Hzfengsy @yzhliu @hzfan 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178512
 
 
   It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   
   - (1) EvalSet need to be aware of the bound of Vars
      - Ideally these bounds should be populated in the analyzer via bind.
   
   - ``[13, 5] % 10 = [3, 5]``` : This case is not very clear to me, how would you define ```[13, 5]```? Since if we define it as ```[lower_bound, upper bound]``` then it is an empty set. Is it a typo(e.g. ```[3, 5]```? If it is the case for ```[3, 5]```, seems we can do it by enhancing the rule of floormod.
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178512
 
 
   It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   
   - (1) EvalSet need to be aware of the bound of Vars
      - Ideally these bounds should be populated in the analyzer via bind.
   - (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
      - For `floormod([a, b], c)`:  check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
     - Same rule can be applied to ConstIntBound
   
   
   
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178512
 
 
   It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   
   - (1) EvalSet need to be aware of the bound of Vars
      - Ideally these bounds should be populated in the analyzer via bind.
   - (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
      - For `floormod([a, b], c)`:  check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
   
   
   
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616178512
 
 
   It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   
   - (1) EvalSet need to be aware of the bound of Vars
      - Ideally these bounds should be populated in the analyzer via bind.
   
   - `[13, 5] % 10 = [3, 5]` : This case is not very clear to me, how would you define `[13, 5]`? Since if we define it as ```[lower_bound, upper bound]``` then it is an empty set. Is it a typo(e.g. `[3, 5]`? If it is the case for `[3, 5]`, seems we can do it by enhancing the rule of floormod.
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: WIP: Improve floormod (Don't merge)

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367: WIP: Improve floormod (Don't merge)
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616035227
 
 
   The current behavior of IntervalSet floormod(a, b) is rough -- it returns [0, b-1], [-b+1, b-1], or everything.  It causes extra iterations to my schedule.  This is a simplified version:
   
       import tvm
       from tvm import te
       
       def test_bound_tile_mod():
           def compute(M_tiles, N_tiles, factor, dtype):
               # Algo
               M = M_tiles * factor
               N = N_tiles * factor
       
               A = tvm.te.placeholder((N, M), name='A', dtype=dtype)
               C = tvm.te.compute((N, M), lambda n, m: A[n, m], name='C')
               s = tvm.te.create_schedule(C.op)
       
               return s, A, C
       
           def schedule(s, factor, padding, A, C):
               C_local = s.cache_write(C, "local")
       
               n, m = C.op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               nio, nii = s[C].split(ni, 2)
               n = s[C].fuse(nii, mi)
               C_shared = s.cache_write(C, "shared")
               bn, bm, ni, mi = C_shared.op.axis       
               s[C_shared].storage_align(ni, factor * 2, padding)
       
               n, m = s[C].op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               s[C].set_scope("global")
               niio, niii = s[C].split(ni, 32)
               s[C_shared].compute_at(s[C], niio)
       
               return s
       
           s, A, C = compute(2, 2, 128, "float16")
           s = schedule(s, 128, 8, A, C)
           bounds = tvm.te.schedule.InferBound(s)
           print(tvm.lower(s, [A, C], simple_mode=True))
   
   It does 256x256 point-wise copying in 128x128 tiles from local to shared then to global.  I would like to allocate only a 32x128 shared memory and reuse it four times per tile.  In addition, I want to pad 8 data every two tile rows to avoid bank conflicts in shared memory with storage_align.  I expect C.shared (the local ->shared stage) does exactly 256x256 copying, but the following IR shows 4 times of that.
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 16896]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 64) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[(((((n.outer*32768) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[((((n.inner.outer*4224) + (floordiv(n.inner.inner, 2)*264)) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
               
   If there is another schedule to achieve this, please let me know.  As of now, I can only accomplish this by tiling both C.shared and C.  C's compute now involves several floormod: `C.shared(floordiv(n, 128), floordiv(m, 128), floordiv(floormod(n, 128), 2), ((floormod(floormod(n, 128), 2)*128) + floormod(m, 128)))`.  `m` and `n` further bring in leaf IterVars to floormod to PropBoundToInputs() during InferBound.  Because the best result from floormod(x, 128) is [0, 128-1], C.shared doesn't reduce domain, although it does compute_at s[C]'s n.inner.outer with range [0,3].
   
   With this PR, I am able to get expected IR -- `(n.inner.outer.c, 0, 64)` becoming `(n.inner.outer.c, 0, 16)`:
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 4224]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 16) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[((((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[(((floordiv(n.inner.inner, 2)*264) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
   
   
   The improvement has two folds:
   1.   EvalSet takes a range map as input to help calculating floormod.  For example:  `[x*4, x*4+3] % 8 = [x*4-x/2*8, x*4+3-(x*4+3)/8*8)]`, if we don't know `x`'s range; but `[x*4, x*4+3]`, if we know `x`'s range is `[0, 1]`.  This is common from tiling.
         
   2.   For `a mod b`, if b is an IntImm, single point, and greater than 0, we do the following:
   
       // a mod b = a - b * (a/b) if
       // (a) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
       // and (b) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
   so that `[13, 5] % 10 = [3, 5]`
   
   IntervalSet evaluation is hard.  These improvements are adhoc to the problems I encountered.  I open to any alternatives.
   
   Question: how about mod?  It's implementation looks same as floormod.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r410945718
 
 

 ##########
 File path: src/arith/int_set.cc
 ##########
 @@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);
 
 Review comment:
   Ideally we do not want to add an additional case here, as it could recursive , perhaps need more thoughts

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r410945647
 
 

 ##########
 File path: src/arith/int_set.cc
 ##########
 @@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);
+    // Use the IterVar range info bound to analyzer to further simplify
+    // and reduce the interval
+    auto min_value_expr = analyzer_->Simplify(result->min_value);
+    auto max_value_expr = analyzer_->Simplify(result->max_value);
+    auto min_bd = analyzer_->const_int_bound(min_value_expr);
+    auto max_bd = analyzer_->const_int_bound(max_value_expr);
+    if (min_bd->max_value == min_bd->min_value && max_bd->max_value == max_bd->min_value) {
+      const auto* min_ptr = result->min_value.as<IntImmNode>();
+      const auto* max_ptr = result->max_value.as<IntImmNode>();
+      // The following if statement is necessary.  When result is a single point of IntImm, such as
+      // [0, 0], both 0s refer the same ObjectRef.  We really don't want to create a new [0, 0]
+      // IntervalSet and have 0s refer two different ObjectRef.  They will confuse APIs, such as
+      // IntervalSetEvaluator::MatchPoint() and IntervalSetNode::IsSinglePoint().
+      if (min_ptr && max_ptr && min_bd->min_value == min_ptr->value &&
 
 Review comment:
   perhaps we should enhance MatchPoint to directly deal with IntImm

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414944483



##########
File path: include/tvm/arith/analyzer.h
##########
@@ -411,8 +412,9 @@ class TVM_DLL Analyzer {
    *
    * \param var The variable.
    * \param expr The expression we bind to.
+   * \param override Whether do we allow override of existing information.

Review comment:
       Rephrasing the comments.  Thanks




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] hzfan commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
hzfan commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414552082



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if (arith::is_pos_inf(max_value) || arith::is_neg_inf(min_value) ||

Review comment:
       The condition seems quite loose. `max_value` gets updated when `arith::is_neg_inf(min_value)` holds. Is this intended?
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616885908


   > It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   > 
   > * (1) EvalSet need to be aware of the bound of Vars
   >   
   >   * Ideally these bounds should be populated in the analyzer via bind.
   > * (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
   >   
   >   * For `floormod([a, b], c)`:  check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
   >   * Same rule can be applied to ConstIntBound
   
   This is also how I categorized the issues.  I can certainly make the second part a separated PR. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] hzfan commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
hzfan commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414004382



##########
File path: src/arith/const_int_bound.cc
##########
@@ -150,10 +150,12 @@ class ConstIntBoundAnalyzer::Impl :
       const PrimExprNode* op = expr.as<PrimExprNode>();
       auto val = bound_->find(op);
       if (val != bound_->end()) {
-        CHECK(val->second->min_value == res.min_value &&
-              val->second->max_value == res.max_value)
-          << "Detected bound for " << expr
-          << "conflicts with memorization";
+        auto everything = Everything(op->dtype);
+        CHECK(
+            (val->second->min_value == res.min_value && val->second->max_value == res.max_value) ||
+            (val->second->min_value == everything.min_value &&

Review comment:
       I agree that override with partial order is good. Can we check if `res` is contained in `val->second` ? (`val->second` being everything is a special case of this)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617876621


   Note that directly fall into the above rule can lead to counter examples, e.g. ` [1, 21] % 10 `


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414946295



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if ((arith::is_pos_inf(max_value) && arith::is_neg_inf(min_value)) ||
+                (analyzer->CanProve(shape_i_min_value >= min_value) &&
+                 analyzer->CanProve(shape_i_max_value <= max_value))) {

Review comment:
       `tests/python/unittest/test_target_codegen_blob.py` exposes a case, where `shape_i` is `[0, 0]` and arg_interval is `[threadIdx.y, threadIdx.y]`,where `threadIdx.y`'s range is `[0, 7]`.  Either `[0, 0]` or `[threadIdx.y, threadIdx.y]` is fine.  The current logic ends with `[threadIdx.y, 0]`, messing up further transforms and triggering an assertion later.  We'd better to update both ends at the same time, i.e. treating IntervalSet as an atomic object.  Adding this explanation as comment in the code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-619645611


   Thanks @Hzfengsy @hzfan @yongfeng-nv !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413277656



##########
File path: src/arith/int_set.cc
##########
@@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);
+    // Use the IterVar range info bound to analyzer to further simplify
+    // and reduce the interval
+    auto min_value_expr = analyzer_->Simplify(result->min_value);
+    auto max_value_expr = analyzer_->Simplify(result->max_value);
+    auto min_bd = analyzer_->const_int_bound(min_value_expr);
+    auto max_bd = analyzer_->const_int_bound(max_value_expr);
+    if (min_bd->max_value == min_bd->min_value && max_bd->max_value == max_bd->min_value) {
+      const auto* min_ptr = result->min_value.as<IntImmNode>();
+      const auto* max_ptr = result->max_value.as<IntImmNode>();
+      // The following if statement is necessary.  When result is a single point of IntImm, such as
+      // [0, 0], both 0s refer the same ObjectRef.  We really don't want to create a new [0, 0]
+      // IntervalSet and have 0s refer two different ObjectRef.  They will confuse APIs, such as
+      // IntervalSetEvaluator::MatchPoint() and IntervalSetNode::IsSinglePoint().
+      if (min_ptr && max_ptr && min_bd->min_value == min_ptr->value &&

Review comment:
       Moved the IntImm handling to MatchPoint and IntervalSetNode::IsSinglePoint().  Reverted changes here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yzhliu commented on pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yzhliu commented on pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-619238149


   @icemelon9 is this what you're looking for?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414674995



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if (arith::is_pos_inf(max_value) || arith::is_neg_inf(min_value) ||

Review comment:
       No, I only mean to change a bound's ends in pair.  Correcting the code.  Thank you for point it out.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617894234


   In my example `floormod([z*8+x*4, z*8+x*4+3], 8)`, the expected result is either `[0, 3]` or `[4, 7]`, depending on `x`.
   The current implementation returns `[0, 9]` for the test case of `[1, 21] mod 10`, because `21 - 1 >= 10`.
   Both logic are correct and equivalent in theory.  The current one depends on less complicated ops. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r411885530



##########
File path: src/arith/int_set.cc
##########
@@ -730,12 +765,30 @@ Map<Var, IntSet> ConvertDomMap(
   return dmap;
 }
 
-IntSet EvalSet(PrimExpr e,
-               const Map<Var, IntSet>& dom_map) {
+IntSet EvalSet(PrimExpr e, const Map<Var, IntSet>& dom_map,
+               const std::unordered_map<IterVar, Range>& rmap) {
   Analyzer ana;
+  // Bind ana with rmap
+  for (auto entry : rmap) {
+    ana.Bind(entry.first->var, entry.second);

Review comment:
       Don't need this change any more.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv edited a comment on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616035227


   The current behavior of IntervalSet floormod(a, b) is rough -- it returns [0, b-1], [-b+1, b-1], or everything.  It causes extra iterations to my schedule.  This is a simplified version:
   
       import tvm
       from tvm import te
       
       def test_bound_tile_mod():
           def compute(M_tiles, N_tiles, factor, dtype):
               # Algo
               M = M_tiles * factor
               N = N_tiles * factor
       
               A = tvm.te.placeholder((N, M), name='A', dtype=dtype)
               C = tvm.te.compute((N, M), lambda n, m: A[n, m], name='C')
               s = tvm.te.create_schedule(C.op)
       
               return s, A, C
       
           def schedule(s, factor, padding, A, C):
               C_local = s.cache_write(C, "local")
       
               n, m = C.op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               nio, nii = s[C].split(ni, 2)
               n = s[C].fuse(nii, mi)
               C_shared = s.cache_write(C, "shared")
               bn, bm, ni, mi = C_shared.op.axis       
               s[C_shared].storage_align(ni, factor * 2, padding)
       
               n, m = s[C].op.axis
               bn, bm, ni, mi = s[C].tile(n, m, factor, factor)
               s[C].set_scope("global")
               niio, niii = s[C].split(ni, 32)
               s[C_shared].compute_at(s[C], niio)
       
               return s
       
           s, A, C = compute(2, 2, 128, "float16")
           s = schedule(s, 128, 8, A, C)
           bounds = tvm.te.schedule.InferBound(s)
           print(tvm.lower(s, [A, C], simple_mode=True))
   
   It does 256x256 point-wise copying in 128x128 tiles from local to shared then to global.  I would like to allocate only a 32x128 shared memory and reuse it four times per tile.  In addition, I want to pad 8 data every two tile rows to avoid bank conflicts in shared memory with storage_align.  I expect C.shared (the local ->shared stage) does exactly 256x256 copying, but the following IR shows 4 times of that.
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 16896]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 64) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[(((((n.outer*32768) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[((((n.inner.outer*4224) + (floordiv(n.inner.inner, 2)*264)) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
               
   If there is another schedule to achieve this, please let me know.  As of now, I can only accomplish this by tiling both C.shared and C.  C's compute now involves several floormod: `C.shared(floordiv(n, 128), floordiv(m, 128), floordiv(floormod(n, 128), 2), ((floormod(floormod(n, 128), 2)*128) + floormod(m, 128)))`.  `m` and `n` further bring in leaf IterVars to floormod to PropBoundToInputs() during InferBound.  Because the best result from floormod(x, 128) is [0, 128-1], C.shared doesn't reduce domain, although it does compute_at s[C]'s n.inner.outer with range [0,3].
   
   With this PR, I am able to get expected IR -- `(n.inner.outer.c, 0, 64)` becoming `(n.inner.outer.c, 0, 16)`:
   
       // attr [C.local] storage_scope = "local"
       allocate C.local[float16 * 65536]
       // attr [C.shared] storage_scope = "shared"
       allocate C.shared[float16 * 4224]
       for (n.c, 0, 256) {
         for (m.c, 0, 256) {
           C.local[((n.c*256) + m.c)] = A[((n.c*256) + m.c)]
         }
       }
       for (n.outer, 0, 2) {
         for (m.outer, 0, 2) {
           for (n.inner.outer, 0, 4) {
             for (n.inner.outer.c, 0, 16) {
               for (n.inner.inner.m.inner.fused.c, 0, 256) {
                 C.shared[((n.inner.outer.c*264) + n.inner.inner.m.inner.fused.c)] = C.local[((((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.outer.c*512)) + (floordiv(n.inner.inner.m.inner.fused.c, 128)*256)) + (m.outer*128)) + floormod(n.inner.inner.m.inner.fused.c, 128))]
               }
             }
             for (n.inner.inner, 0, 32) {
               for (m.inner, 0, 128) {
                 C[(((((n.outer*32768) + (n.inner.outer*8192)) + (n.inner.inner*256)) + (m.outer*128)) + m.inner)] = C.shared[(((floordiv(n.inner.inner, 2)*264) + (floormod(n.inner.inner, 2)*128)) + m.inner)]
               }
             }
           }
         }
       }
   
   
   The improvement has two folds:
   1.   EvalSet takes a range map as input to help calculating floormod.  For example:  `[x*4, x*4+3] % 8 = [x*4-x/2*8, x*4+3-(x*4+3)/8*8)]`, if we don't know `x`'s range; but `[x*4, x*4+3]`, if we know `x`'s range is `[0, 1]`.  This is common from tiling.
         
   2.   For `a mod b`, if b is an IntImm, single point, and greater than 0, we do the following:
   
       // a mod b = a - b * (a/b) if
       // (a) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
       // and (b) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
   so that `[13, 15] % 10 = [3, 5]`
   
   IntervalSet evaluation is hard.  These improvements are adhoc to the problems I encountered.  I am open to any alternatives.
   
   Question: how about mod?  It's implementation looks same as floormod.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv edited a comment on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617861746


   > > It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   > > 
   > > * (1) EvalSet need to be aware of the bound of Vars
   > >   
   > >   * Ideally these bounds should be populated in the analyzer via bind.
   > > * (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
   > >   
   > >   * For `floormod([a, b], c)`:  check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
   > >   * Same rule can be applied to ConstIntBound
   > 
   > This is also how I categorized the issues. I can certainly make the second part a separated PR.
   
   While implementing a separate PR, I got a test failure: `floormod([z*8+x*4, z*8+x*4+3], 8)`
   
   The suggested logic falls back to `[0, 7]`, because it can't prove `floordiv(x*4, 8) == floordiv(x*4+3, 8)`. 
    The current implementation is better, returning `[(x*4-floordiv(x, 2)*8, x*4+3-floordiv(x*4+3, 8)*8)]`.
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617850372


   cc @yzhliu @hzfan @ZihengJiang @Hzfengsy would be great if you can help to review the PR, specificially wrt to the floormod simplification part


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yzhliu commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yzhliu commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414870644



##########
File path: include/tvm/arith/analyzer.h
##########
@@ -411,8 +412,9 @@ class TVM_DLL Analyzer {
    *
    * \param var The variable.
    * \param expr The expression we bind to.
+   * \param override Whether do we allow override of existing information.

Review comment:
       ```suggestion
      * \param override Whether do we allow override of existing binding variables.
   ```
   suggest to be more specific.

##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,16 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (b->min_value.as<tir::IntImmNode>()) {

Review comment:
       use `divisor` to keep consistent?

##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if ((arith::is_pos_inf(max_value) && arith::is_neg_inf(min_value)) ||
+                (analyzer->CanProve(shape_i_min_value >= min_value) &&
+                 analyzer->CanProve(shape_i_max_value <= max_value))) {

Review comment:
       could you elaborate a bit why do we need this change?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413285295



##########
File path: src/te/schedule/bound.cc
##########
@@ -199,11 +202,13 @@ void InferRootBound(const Stage& stage,
         r = iv->dom;
       }
       if (relax_set.size() != 0) {
-        dom_map[iv->var.get()] = EvalSet(r, relax_set);
+        dom_map[iv->var.get()] = IntSet::interval(
+            analyzer.int_set(r->min, relax_set).min(),
+            analyzer.int_set(r->min + r->extent - 1, relax_set).max());
       } else {
         dom_map[iv->var.get()] = IntSet::range(r);
       }
-      analyzer.Bind(iv->var, r);
+      analyzer.Bind(iv->var, r, true);

Review comment:
       The previous Bind call binds root IterVars.  We have to override them here.  Is there a easy way to tell an IterVar root IterVar?  If so, we can avoid such binding/overriding.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r411796012



##########
File path: src/arith/int_set.cc
##########
@@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);
+    // Use the IterVar range info bound to analyzer to further simplify
+    // and reduce the interval
+    auto min_value_expr = analyzer_->Simplify(result->min_value);
+    auto max_value_expr = analyzer_->Simplify(result->max_value);
+    auto min_bd = analyzer_->const_int_bound(min_value_expr);
+    auto max_bd = analyzer_->const_int_bound(max_value_expr);
+    if (min_bd->max_value == min_bd->min_value && max_bd->max_value == max_bd->min_value) {
+      const auto* min_ptr = result->min_value.as<IntImmNode>();
+      const auto* max_ptr = result->max_value.as<IntImmNode>();
+      // The following if statement is necessary.  When result is a single point of IntImm, such as
+      // [0, 0], both 0s refer the same ObjectRef.  We really don't want to create a new [0, 0]
+      // IntervalSet and have 0s refer two different ObjectRef.  They will confuse APIs, such as
+      // IntervalSetEvaluator::MatchPoint() and IntervalSetNode::IsSinglePoint().
+      if (min_ptr && max_ptr && min_bd->min_value == min_ptr->value &&

Review comment:
       I think these two are the only APIs to be fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-616964958


   > In particular, my comment wrt to EvalSet means we should directly use the IntSetAnalyzer member of the analyzer, that can should have part of the Var's bound info
   
   Removing all changes about EvalSet().  Binding rmap to analyzer before calling analyzer->int_set().


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413843881



##########
File path: src/arith/int_set.cc
##########
@@ -518,7 +533,14 @@ class IntervalSetEvaluator :
   // whether set is exactly single point that equals value.
   bool MatchPoint(const IntervalSet& set,
                   const PrimExpr& value) const {
-    return set->min_value.same_as(value) && set->max_value.same_as(value);
+    if (set->min_value.same_as(value) && set->max_value.same_as(value)) {

Review comment:
       I don't need to change MatchPoint and IsSinglePoint after binding all ranges to one analyzer.  I have reverted the changes to these two functions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv edited a comment on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv edited a comment on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617894234


   In my example `floormod([z*8+x*4, z*8+x*4+3], 8)`, the expected result is either `[0, 3]` or `[4, 7]`, depending on `x`.
   The current PR's implementation returns `[0, 9]` for the test case of `[1, 21] mod 10`, because `21 - 1 >= 10`.
   Both logic are correct and equivalent in theory.  The current one depends on less complicated ops. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617876621


   Note that directly fall into the mod can leads to counter examples, e.g. ` [1, 21] % 10 `


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] hzfan commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
hzfan commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413175611



##########
File path: src/arith/const_int_bound.cc
##########
@@ -150,10 +150,12 @@ class ConstIntBoundAnalyzer::Impl :
       const PrimExprNode* op = expr.as<PrimExprNode>();
       auto val = bound_->find(op);
       if (val != bound_->end()) {
-        CHECK(val->second->min_value == res.min_value &&
-              val->second->max_value == res.max_value)
-          << "Detected bound for " << expr
-          << "conflicts with memorization";
+        auto everything = Everything(op->dtype);
+        CHECK(
+            (val->second->min_value == res.min_value && val->second->max_value == res.max_value) ||
+            (val->second->min_value == everything.min_value &&

Review comment:
       I guess this is for the case where `override == true`, right? If so, what about make it explicit and just checks if override is true? Or can we clear the whole memo map when updating with override?

##########
File path: src/arith/int_set.cc
##########
@@ -518,7 +533,14 @@ class IntervalSetEvaluator :
   // whether set is exactly single point that equals value.
   bool MatchPoint(const IntervalSet& set,
                   const PrimExpr& value) const {
-    return set->min_value.same_as(value) && set->max_value.same_as(value);
+    if (set->min_value.same_as(value) && set->max_value.same_as(value)) {

Review comment:
       Can we reuse `CanProve` here to check if `set->min_value == value && set->max_value == value`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414259311



##########
File path: include/tvm/arith/int_set.h
##########
@@ -152,6 +152,22 @@ class IntSet : public ObjectRef {
 //-----------------------------------------------
 // Integer set legacy API.
 //------------------------------------------------
+    /*!
+     * \brief Convert std::unordered_map<const VarNode*, IntSet> to Map<Var, IntSet>
+     *
+     * \param dom_map The domain map to convert.
+     * \return The converted map.
+     */
+Map<Var, IntSet> ConvertDomMap(const std::unordered_map<const VarNode*, IntSet>& dom_map);
+// /*!
+//  * \brief Find an symbolic integer set that contains all possible values of
+//  *  e given the domain of each iteration variables.
+//  *
+//  * \param e The expression to be evaluated.
+//  * \param dom_map The domain of each variable.
+//  * \return An integer set that can cover all the possible values of e.
+//  */
+// IntSet EvalSet(PrimExpr e, const Map<Var, IntSet>& dom_map);

Review comment:
       My bad.  Clean it up.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r411886246



##########
File path: include/tvm/te/operation.h
##########
@@ -112,6 +113,7 @@ class OperationNode : public tir::FunctionBaseNode {
       const Operation& self,
       arith::Analyzer* analyzer,
       const std::unordered_map<const VarNode*, IntSet>& dom_map,
+      const std::unordered_map<IterVar, Range>& rmap,

Review comment:
       Don't need this change as no need to pass rmap to EvalSet any more.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r411785613



##########
File path: src/arith/int_set.cc
##########
@@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);
+    // Use the IterVar range info bound to analyzer to further simplify
+    // and reduce the interval
+    auto min_value_expr = analyzer_->Simplify(result->min_value);
+    auto max_value_expr = analyzer_->Simplify(result->max_value);
+    auto min_bd = analyzer_->const_int_bound(min_value_expr);
+    auto max_bd = analyzer_->const_int_bound(max_value_expr);
+    if (min_bd->max_value == min_bd->min_value && max_bd->max_value == max_bd->min_value) {
+      const auto* min_ptr = result->min_value.as<IntImmNode>();
+      const auto* max_ptr = result->max_value.as<IntImmNode>();
+      // The following if statement is necessary.  When result is a single point of IntImm, such as
+      // [0, 0], both 0s refer the same ObjectRef.  We really don't want to create a new [0, 0]
+      // IntervalSet and have 0s refer two different ObjectRef.  They will confuse APIs, such as
+      // IntervalSetEvaluator::MatchPoint() and IntervalSetNode::IsSinglePoint().
+      if (min_ptr && max_ptr && min_bd->min_value == min_ptr->value &&

Review comment:
       I tried to fix MatchPoint first, then I found out IntervalSetNode::IsSinglePoint() requiring update too and worried about more APIs to touch.  Now I am moving the IntImm single point handling to IntervalSet constructor.  At certain point, I think immediate values are better to refactor to singleton.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413188821



##########
File path: src/arith/const_int_bound.cc
##########
@@ -150,10 +150,12 @@ class ConstIntBoundAnalyzer::Impl :
       const PrimExprNode* op = expr.as<PrimExprNode>();
       auto val = bound_->find(op);
       if (val != bound_->end()) {
-        CHECK(val->second->min_value == res.min_value &&
-              val->second->max_value == res.max_value)
-          << "Detected bound for " << expr
-          << "conflicts with memorization";
+        auto everything = Everything(op->dtype);
+        CHECK(
+            (val->second->min_value == res.min_value && val->second->max_value == res.max_value) ||
+            (val->second->min_value == everything.min_value &&

Review comment:
       It is not exactly override, but assuming a partial order that limit bound is better than inf bound.  I still keep assertion for different limit bounds, when override == false.
   This change alters API semantics, but I am not sure if this behavior worth another mode in addition to override.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413183510



##########
File path: src/arith/const_int_bound.cc
##########
@@ -150,10 +150,12 @@ class ConstIntBoundAnalyzer::Impl :
       const PrimExprNode* op = expr.as<PrimExprNode>();
       auto val = bound_->find(op);
       if (val != bound_->end()) {
-        CHECK(val->second->min_value == res.min_value &&
-              val->second->max_value == res.max_value)
-          << "Detected bound for " << expr
-          << "conflicts with memorization";
+        auto everything = Everything(op->dtype);
+        CHECK(
+            (val->second->min_value == res.min_value && val->second->max_value == res.max_value) ||
+            (val->second->min_value == everything.min_value &&

Review comment:
       Good catch.  I forgot to bring this up for discussion.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] hzfan commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
hzfan commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414548737



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if (arith::is_pos_inf(max_value) || arith::is_neg_inf(min_value) ||

Review comment:
       The condition seems quite loose. `max_value` gets updated when `arith::is_neg_inf(min_value)` holds. Is this intended?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617861746


   > > It would be great if we can dissect it down to specific cases in mind and discuss how to solve these cases, I see two kinds of problems:
   > > 
   > > * (1) EvalSet need to be aware of the bound of Vars
   > >   
   > >   * Ideally these bounds should be populated in the analyzer via bind.
   > > * (2) `floormod([13, 15], 10) => [3, 5]` : This seems to be a clean case that we can fix first(independent from the Var bound problem), how about we create a separate PR for the var bound case and aim at fixing this one?
   > >   
   > >   * For `floormod([a, b], c)`:  check if we can prove `floordiv(a, c) == floordiv(b, c)`, if so, we rewrite to `[a - floordiv(a, c) * c, b -  floordiv(b, c) * c]`, otherwise, follow the old rule.
   > >   * Same rule can be applied to ConstIntBound
   > 
   > This is also how I categorized the issues. I can certainly make the second part a separated PR.
   
   While implementing a separate PR, I got a test failure: floormod([z*8+x*4, z*8+x*4+3], 8)
   
   The suggested logic falls back to [0, 7], because it can't prove floordiv(x*4, 8) == floordiv(x*4+3, 8). 
    The current implementation is better, returning [(x*4-floordiv(x, 2)*8, x*4+3-floordiv(x*4+3, 8)*8)].
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414263234



##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,21 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (const auto* ptr = b->min_value.as<tir::IntImmNode>()) {
+        // a mod b = a - b * (a/b) if
+        // (i) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
+        // and (ii) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
+        auto tmax = a->max_value - b->min_value * floordiv(a->max_value, b->min_value);
+        tmax = analyzer->Simplify(tmax);
+        auto tmin = a->min_value - b->min_value * floordiv(a->min_value, b->min_value);
+        tmin = analyzer->Simplify(tmin);
+        auto tset = IntervalSet(tmin, tmax);
+        bool within_range = analyzer->CanProveLess(a->max_value - a->min_value, ptr->value);
+        bool wrap_around = analyzer->CanProve(tset->max_value < tset->min_value);

Review comment:
       I missed the point that CanProve is necessary but not sufficient here.  Once I modified the condition to tset->max_value >= tset->min_value, floormod([z*8+x*4, z*8+x*4+3], 8) fell back to [0, 7], matching your earlier suggested floormod implementation, because the it can't prove (((x*4) + 3) - (floordiv(((x*4) + 3), 8)*8)) >= ((x*4) - (floordiv(x, 2)*8)).  Therefore, I switch to the simple implementation and modify the test.

##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,21 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (const auto* ptr = b->min_value.as<tir::IntImmNode>()) {
+        // a mod b = a - b * (a/b) if
+        // (i) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
+        // and (ii) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
+        auto tmax = a->max_value - b->min_value * floordiv(a->max_value, b->min_value);
+        tmax = analyzer->Simplify(tmax);
+        auto tmin = a->min_value - b->min_value * floordiv(a->min_value, b->min_value);
+        tmin = analyzer->Simplify(tmin);
+        auto tset = IntervalSet(tmin, tmax);
+        bool within_range = analyzer->CanProveLess(a->max_value - a->min_value, ptr->value);
+        bool wrap_around = analyzer->CanProve(tset->max_value < tset->min_value);
+        if (within_range && !wrap_around) {
+          return tset;

Review comment:
       No longer an issue with the update.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414940958



##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,16 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (b->min_value.as<tir::IntImmNode>()) {

Review comment:
       Make sense.  I will make the change.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-618503288


   @hzfan please take another look and https://tvm.apache.org/docs/contribute/code_review.html#approve-and-request-changes-explicitly


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414034656



##########
File path: src/arith/const_int_bound.cc
##########
@@ -150,10 +150,12 @@ class ConstIntBoundAnalyzer::Impl :
       const PrimExprNode* op = expr.as<PrimExprNode>();
       auto val = bound_->find(op);
       if (val != bound_->end()) {
-        CHECK(val->second->min_value == res.min_value &&
-              val->second->max_value == res.max_value)
-          << "Detected bound for " << expr
-          << "conflicts with memorization";
+        auto everything = Everything(op->dtype);
+        CHECK(
+            (val->second->min_value == res.min_value && val->second->max_value == res.max_value) ||
+            (val->second->min_value == everything.min_value &&

Review comment:
       @hzfan do you mean to update val->second, when res is a subset?  It is a looser check than the current change.  I haven't seen any case that a limit bound improves from a looser limit bound.  If we don't have any such case now, I'd like to let this check assert for people to justify when they change the behavior. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414259311



##########
File path: include/tvm/arith/int_set.h
##########
@@ -152,6 +152,22 @@ class IntSet : public ObjectRef {
 //-----------------------------------------------
 // Integer set legacy API.
 //------------------------------------------------
+    /*!
+     * \brief Convert std::unordered_map<const VarNode*, IntSet> to Map<Var, IntSet>
+     *
+     * \param dom_map The domain map to convert.
+     * \return The converted map.
+     */
+Map<Var, IntSet> ConvertDomMap(const std::unordered_map<const VarNode*, IntSet>& dom_map);
+// /*!
+//  * \brief Find an symbolic integer set that contains all possible values of
+//  *  e given the domain of each iteration variables.
+//  *
+//  * \param e The expression to be evaluated.
+//  * \param dom_map The domain of each variable.
+//  * \return An integer set that can cover all the possible values of e.
+//  */
+// IntSet EvalSet(PrimExpr e, const Map<Var, IntSet>& dom_map);

Review comment:
       My bad.  Cleaned it up.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413843721



##########
File path: src/arith/int_set.cc
##########
@@ -372,7 +387,27 @@ class IntervalSetEvaluator :
   }
 
   IntervalSet Eval(const PrimExpr& val) {
-    return this->VisitExpr(val);
+    IntervalSet result = this->VisitExpr(val);

Review comment:
       This change is no longer needed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen edited a comment on pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-619645611


   Thanks @yongfeng-nv @Hzfengsy @hzfan @yzhliu 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r413205094



##########
File path: src/arith/int_set.cc
##########
@@ -518,7 +533,14 @@ class IntervalSetEvaluator :
   // whether set is exactly single point that equals value.
   bool MatchPoint(const IntervalSet& set,
                   const PrimExpr& value) const {
-    return set->min_value.same_as(value) && set->max_value.same_as(value);
+    if (set->min_value.same_as(value) && set->max_value.same_as(value)) {

Review comment:
       Make sense.  I will keep the .same_as(value) calls for performance and use CanProve for the additional comparison.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on issue #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on issue #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#issuecomment-617887709


   Some asterisk symbols do not show in my previous comment.  I just updated it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] tqchen commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414159969



##########
File path: include/tvm/arith/int_set.h
##########
@@ -152,6 +152,22 @@ class IntSet : public ObjectRef {
 //-----------------------------------------------
 // Integer set legacy API.
 //------------------------------------------------
+    /*!
+     * \brief Convert std::unordered_map<const VarNode*, IntSet> to Map<Var, IntSet>
+     *
+     * \param dom_map The domain map to convert.
+     * \return The converted map.
+     */
+Map<Var, IntSet> ConvertDomMap(const std::unordered_map<const VarNode*, IntSet>& dom_map);
+// /*!
+//  * \brief Find an symbolic integer set that contains all possible values of
+//  *  e given the domain of each iteration variables.
+//  *
+//  * \param e The expression to be evaluated.
+//  * \param dom_map The domain of each variable.
+//  * \return An integer set that can cover all the possible values of e.
+//  */
+// IntSet EvalSet(PrimExpr e, const Map<Var, IntSet>& dom_map);

Review comment:
       keep the commented region?

##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,21 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (const auto* ptr = b->min_value.as<tir::IntImmNode>()) {
+        // a mod b = a - b * (a/b) if
+        // (i) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
+        // and (ii) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
+        auto tmax = a->max_value - b->min_value * floordiv(a->max_value, b->min_value);
+        tmax = analyzer->Simplify(tmax);
+        auto tmin = a->min_value - b->min_value * floordiv(a->min_value, b->min_value);
+        tmin = analyzer->Simplify(tmin);
+        auto tset = IntervalSet(tmin, tmax);
+        bool within_range = analyzer->CanProveLess(a->max_value - a->min_value, ptr->value);
+        bool wrap_around = analyzer->CanProve(tset->max_value < tset->min_value);
+        if (within_range && !wrap_around) {
+          return tset;

Review comment:
       Consider move the tset construction after the within range check.

##########
File path: src/arith/int_set.cc
##########
@@ -311,6 +311,21 @@ inline IntervalSet Combine<tir::FloorModNode>(Analyzer* analyzer,
       LOG(FATAL) << "Modular by zero in CombineInterval Mod";
     }
     if (analyzer->CanProveGreaterEqual(divisor, 0)) {
+      if (const auto* ptr = b->min_value.as<tir::IntImmNode>()) {
+        // a mod b = a - b * (a/b) if
+        // (i) a_max - a_min < b, i.e. that before mod, a's range doesn't cover [0, b)
+        // and (ii) a_min mod b <= a_max mod b, i.e. that a's range is still continuous after mod
+        auto tmax = a->max_value - b->min_value * floordiv(a->max_value, b->min_value);
+        tmax = analyzer->Simplify(tmax);
+        auto tmin = a->min_value - b->min_value * floordiv(a->min_value, b->min_value);
+        tmin = analyzer->Simplify(tmin);
+        auto tset = IntervalSet(tmin, tmax);
+        bool within_range = analyzer->CanProveLess(a->max_value - a->min_value, ptr->value);
+        bool wrap_around = analyzer->CanProve(tset->max_value < tset->min_value);

Review comment:
       This can be dangerious, since CanProve returns true if it can be proven, but if it is false, it does not mean that the condition won't hold. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] hzfan commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
hzfan commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414548737



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if (arith::is_pos_inf(max_value) || arith::is_neg_inf(min_value) ||

Review comment:
       The condition seems quite loose. `max_value` gets updated when `arith::is_neg_inf(min_value)` holds. Is this intended?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-tvm] yongfeng-nv commented on a change in pull request #5367: Improve IntervalSet's floormod

Posted by GitBox <gi...@apache.org>.
yongfeng-nv commented on a change in pull request #5367:
URL: https://github.com/apache/incubator-tvm/pull/5367#discussion_r414674995



##########
File path: src/te/operation/compute_op.cc
##########
@@ -231,20 +231,18 @@ void ComputeOpNode::PropBoundToInputs(
           // undefined behaviour), so we can intersect the estimated set of the argument with the
           // range expected by the tensor. However, intersection may result in overly complex
           // expressions, so we perform a more relaxed form of intersection.
-          IntSet arg_intset = EvalSet(call->args[i], dom_map);
+          IntSet arg_intset = analyzer->int_set(call->args[i], ConvertDomMap(dom_map));
           const arith::IntervalSetNode* arg_interval = arg_intset.as<arith::IntervalSetNode>();
           if (arg_interval) {
             PrimExpr shape_i_min_value = make_zero(t->shape[i].dtype());
             PrimExpr shape_i_max_value = t->shape[i] - 1;
             PrimExpr min_value = arg_interval->min_value;
             PrimExpr max_value = arg_interval->max_value;
             // Prefer the shape bounds only when we can prove they are tighter.
-            if (arith::is_neg_inf(min_value) ||
-                analyzer->CanProve(shape_i_min_value >= min_value)) {
+            if (arith::is_pos_inf(max_value) || arith::is_neg_inf(min_value) ||

Review comment:
       No, I only mean to change a bound's ends in pair.  Correcting the code.  Thank you for pointing it out.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org