Posted to by Giuseppe Rossini via TVM Discuss <> on 2020/08/28 17:13:05 UTC

[TVM Discuss] [Development] Loop partitioning, padding and tensorization

Hi all, 

In my effort to accelerate AArch64 through tensorization, I ran into an issue. 

Basically, I am padding my input tensor to let `tensorize` work (I need the number of rows to be a multiple of 4 and the number of columns to be a multiple of 16). 
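
As a minimal sketch of the padding arithmetic (plain Python rather than TVM; the helper names `round_up` and `padded_shape` are hypothetical and not part of any TVM API), the padded extents could be computed like this:

```python
def round_up(value, multiple):
    """Round `value` up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def padded_shape(rows, cols, row_multiple=4, col_multiple=16):
    """Shape after padding rows to a multiple of 4 and cols to a multiple of 16."""
    return round_up(rows, row_multiple), round_up(cols, col_multiple)


# A 10 x 40 tensor would have to grow to 12 x 48 before tensorizing.
print(padded_shape(10, 40))  # (12, 48)
```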

However, bound inference removes the padding (since it is not used) and, when I tile the computation, `tir.likely` statements appear. This results in `tensorize` producing the error:

TVMError: Tensorize failed, split condition tir.likely(((...)) relies on var defined inside tensorize scope
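
To illustrate where the guard comes from, here is a plain-Python model (not actual TVM output; `split_loop` and `needs_guard` are hypothetical names) of splitting a loop whose extent is not a multiple of the split factor. The `if` plays the role of the `tir.likely` condition: it depends on the inner index, which is a variable living inside the region that `tensorize` wants to replace.

```python
def split_loop(extent, factor):
    """Enumerate the (outer, inner) pairs of a split loop that stay in bounds."""
    n_outer = (extent + factor - 1) // factor  # ceiling division
    visited = []
    for outer in range(n_outer):
        for inner in range(factor):
            # This guard models the likely(...) condition in the tiled loop nest.
            if outer * factor + inner < extent:
                visited.append((outer, inner))
    return visited


def needs_guard(extent, factor):
    """A guard is only needed when the factor does not divide the extent."""
    return extent % factor != 0


print(needs_guard(10, 4))  # True: the last tile is only partially filled
print(needs_guard(12, 4))  # False: padding to 12 would remove the guard
```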

One solution is to add a (sufficiently complex) zero multiplication by a padding element, to trick the bound inference (see for example [here]( 

However, this is very hacky and is unlikely to last (as bound inference gets smarter, it might detect that the added element is zero). 

The question is: should we try to come up with a "good" solution for this? 

One idea might be to let tensorize accept `@tir.likely` statements and replace them with a "variable size" tensorization which will be provided by the developer. 

For instance, we might add a `_intrin_func_variable` private function that gets called only when a *variable tensorization* (i.e., a tensorization over `@tir.likely`) is needed. 
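
A rough sketch of the dispatch this would imply, in plain Python rather than TVM: full tiles go to a fixed-size kernel, and border tiles go to a variable-size one. Everything here is an assumption made for illustration: the 4 x 4 tile size, the names `fixed_kernel` and `variable_kernel`, and the call counter. None of it is an existing TVM interface.

```python
TILE_M, TILE_N = 4, 4  # illustrative tile sizes, not the real intrinsic shape


def _compute_tile(A, B, C, i0, j0, rows, cols):
    """Compute C[i0:i0+rows, j0:j0+cols] of the product A x B."""
    K = len(B)
    for i in range(i0, i0 + rows):
        for j in range(j0, j0 + cols):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(K))


def fixed_kernel(A, B, C, i0, j0):
    """Stand-in for the fixed-size intrinsic: always a full tile."""
    _compute_tile(A, B, C, i0, j0, TILE_M, TILE_N)


def variable_kernel(A, B, C, i0, j0, rows, cols):
    """Stand-in for the hypothetical variable-size intrinsic used on borders."""
    _compute_tile(A, B, C, i0, j0, rows, cols)


def tiled_matmul(A, B):
    """Tile the output and pick the kernel per tile, with no padding."""
    M, N = len(A), len(B[0])
    C = [[0] * N for _ in range(M)]
    calls = {"fixed": 0, "variable": 0}
    for i0 in range(0, M, TILE_M):
        for j0 in range(0, N, TILE_N):
            rows = min(TILE_M, M - i0)
            cols = min(TILE_N, N - j0)
            if rows == TILE_M and cols == TILE_N:
                fixed_kernel(A, B, C, i0, j0)
                calls["fixed"] += 1
            else:
                variable_kernel(A, B, C, i0, j0, rows, cols)
                calls["variable"] += 1
    return C, calls
```

The point of the split is that the common case keeps the fast fixed-size path, while only the partial tiles on the border pay for the more general one.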

I have also read through [this post](, but it doesn't seem to arrive at a concrete solution. 

Any ideas? 

@anijain2305, @FrozenGene, @matt-arm, @tqchen , @ramana-arm


Posted by Giuseppe Rossini via TVM Discuss <>.

Hi Animesh, 
The problem is that I need padding added in the middle of TIR on my (transformed) data tensor. 

I.e., something like
A1 = im2col(A)
A2 = pad(A1)
C_padded = te.compute([M,N], lambda i, j : sum(A2[i,k]*B[k,j], k))
C = unpad(C_padded)+requantization
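
The pad, multiply, unpad pipeline can be checked in plain Python (lists instead of TVM tensors; requantization is left out). The helpers `pad`, `matmul` and `unpad` are hypothetical stand-ins, and padding the reduction dimension of `B` to match is an assumption of this sketch.

```python
def pad(matrix, rows, cols):
    """Zero-pad a list-of-lists matrix up to rows x cols."""
    padded = [row + [0] * (cols - len(row)) for row in matrix]
    padded += [[0] * cols for _ in range(rows - len(matrix))]
    return padded


def matmul(A, B):
    """Naive matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def unpad(matrix, rows, cols):
    """Keep only the top-left rows x cols block."""
    return [row[:cols] for row in matrix[:rows]]


A1 = [[1, 2, 3], [4, 5, 6]]    # 2 x 3, stands in for im2col(A)
B = [[1, 0], [0, 1], [2, 2]]   # 3 x 2

A2 = pad(A1, 4, 16)            # rows -> multiple of 4, cols -> multiple of 16
B_padded = pad(B, 16, 2)       # reduction axis padded to match (assumption)
C_padded = matmul(A2, B_padded)
C = unpad(C_padded, 2, 2)

# The zero padding contributes nothing, so the result matches the plain product.
assert C == matmul(A1, B)
```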

Then I tile on `C` and tensorize on the inner tile (which is where the problem started). Note that I cannot fuse the requantization into the main computation because of the `unpad`.
Also, it would be nice not to pad `A` at all, and instead to work on a solution that automatically detects the borders and invokes different kinds of tensorization (if provided), or falls back to scalar computation for the borders (if multiple tensorizations are not provided). 

In this way I don't need unpadding and the computation could become:
A1 = im2col(A)
C = te.compute([M,N], lambda i, j : sum(A1[i,k]*B[k,j], k)) + requantization #tensorization handles everything automatically

What do you think?


Posted by Animesh Jain via TVM Discuss <>.

How about using the Relay Legalize pass to add explicit padding at the graph level?
