You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/03/16 06:37:01 UTC

[GitHub] [tvm] masahi opened a new pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

masahi opened a new pull request #7669:
URL: https://github.com/apache/tvm/pull/7669


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input on TIR scan. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in the GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is cast to `float64`. So I removed cast to float32 in TIR sort.
   
   It could also be the case that our SPIRV codegen for int64 to float64 cast is busted, but I haven't checked. Another weird thing is that glsl log2 works correctly if the input size is static.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800584527


   @mbrookhart @tqchen The SPIRV spec says their log2 intrinsics only support 16 or 32 bit floating point https://www.khronos.org/registry/spir-v/specs/1.0/GLSL.std.450.html
   
   `The operand x must be a scalar or vector whose component type is 16-bit or 32-bit floating-point.`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800567338


   Made it a draft while I am reading about clz bit hacks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input on TIR scan. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in the GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is a cast to `float64`. So I removed cast to float32 in TIR sort.
   
   It could also be the case that our SPIRV codegen for int64 to float64 cast is busted, but I haven't checked. Another weird thing is that glsl log2 works correctly if the input size is static.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input on TIR scan. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is cast to `float64`. So I removed cast to float32 in TIR sort.
   
   It could also be the case that our SPIRV codegen for int64 to float64 cast is busted, but I haven't checked. Another weird thing is that glsl log2 works correctly if the input size is static.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800610821


   Looks like one reasonable way to implement `ceil(log2(x)` is ` 32 - clz(x) + (x & (x-1) ? 1 : 0)` for 32 bit integers. We need to be careful with 32 bit vs 64 bit and signed vs unsigned.
   
   We need to add intrinsic lowering of `tvm.tir.clz` for llvm and spirv. I'll do that next week.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800610821


   Looks like a one reasnoable way to implement `ceil(log2(x)` is ` 32 - clz(x) + (x & (x-1) ? 1 : 0)` for 32 bit integers. We need to be careful with 32 bit vs 64 bit and signed vs unsigned.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-821755788


   Thanks @mbrookhart @tqchen 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input on TIR scan. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is cast to `float64`. So I removed cast to float32 in TIR sort.
   
   It could also be the case that our SPIRV codegen for int64 to float64 is busted, but I haven't checked. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-821180858


   @mbrookhart I'm finally back with this, we can now do integer ceil(log2(x)) without cast to float for vulkan.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is cast to `float64`. 
   
   It could also be the case that our SPIRV codegen for int64 to float64 is busted, but I haven't checked. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tqchen commented on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

tqchen commented on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800465668


   Perhaps we should think about other alternatives for such an intrinsics.
   
   see 
   
   - https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
   - `__builtin_clz` in c++ code
   - https://stackoverflow.com/questions/39046194/is-there-a-way-to-use-clz-in-a-vulkan-compute-shader


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi edited a comment on pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi edited a comment on pull request #7669:
URL: https://github.com/apache/tvm/pull/7669#issuecomment-800543774


   Ok updated to cast to float32 only in the problematic case, which is VK + dynamic input on TIR scan. I think this is an acceptable solution for now. Of course, the best solution is to implement TIR level CSE, since the host is doing the same compute anyway and there is no point computing log2 etc in device.
   
   Interestingly, TIR mergepath kernel used in sort, which is also littered with glsl log2 and ceil, doesn't cast to float64 before log2 in the GPU IR. If you see the IR dump https://gist.github.com/masahi/c0979c61907af15f9924b3b3d72fe6a7, there is no `float64` anywhere. But for TIR scan downsweep kernel, there is a cast to `float64`. So I removed cast to float32 in TIR sort.
   
   It could also be the case that our SPIRV codegen for int64 to float64 cast is busted, but I haven't checked. Another weird thing is that glsl log2 on fp64 works correctly if the input size is static.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] masahi merged pull request #7669: [TOPI][SPIRV] Cast to float32 not float64 before log2 in sort/scan

Posted by GitBox <gi...@apache.org>.

masahi merged pull request #7669:
URL: https://github.com/apache/tvm/pull/7669


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org