Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/08/11 08:13:03 UTC

[GitHub] [tvm] cxx122 opened a new issue, #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

cxx122 opened a new issue, #12377:
URL: https://github.com/apache/tvm/issues/12377

   ```
   TENSOR_0 = te.compute([14], lambda rck:te.max_value("float16")*te.min_value("uint16"), name ="TENSOR_1")
   TENSOR_1 = te.compute([11], lambda oco:te.max_value("uint16")*TENSOR_0[oco], name ="TENSOR_2")
   ```
   The tir program before compute_inline:
   ```
   @main = primfn(TENSOR_1_1: handle, TENSOR_2_1: handle) -> ()
     attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
     buffers = {TENSOR_1: Buffer(TENSOR_1_2: Pointer(float16), float16, [14], []),
                TENSOR_2: Buffer(TENSOR_2_2: Pointer(float16), float16, [11], [])}
     buffer_map = {TENSOR_1_1: TENSOR_1, TENSOR_2_1: TENSOR_2}
     preflattened_buffer_map = {TENSOR_1_1: TENSOR_1_3: Buffer(TENSOR_1_2, float16, [14], []), TENSOR_2_1: TENSOR_2_3: Buffer(TENSOR_2_2, float16, [11], [])} {
     for (rck: int32, 0, 11) {
       TENSOR_1[rck] = 0f16
     }
     for (oco: int32, 0, 11) {
       TENSOR_2[oco] = (65535f16*TENSOR_1[oco])
     }
   }
   ```
   The tir program after compute_inline:
   ```
   @main = primfn(TENSOR_1_1: handle, TENSOR_2_1: handle) -> ()
     attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
     buffers = {TENSOR_1: Buffer(TENSOR_1_2: Pointer(float16), float16, [14], []),
                TENSOR_2: Buffer(TENSOR_2_2: Pointer(float16), float16, [11], [])}
     buffer_map = {TENSOR_1_1: TENSOR_1, TENSOR_2_1: TENSOR_2}
     preflattened_buffer_map = {TENSOR_1_1: TENSOR_1_3: Buffer(TENSOR_1_2, float16, [14], []), TENSOR_2_1: TENSOR_2_3: Buffer(TENSOR_2_2, float16, [11], [])} {
     for (oco: int32, 0, 11) {
       TENSOR_2[oco] = 0f16
     }
   }
   ```
   
   ### Actual behavior
   
   ```
   Traceback (most recent call last):
     File "/Scuzer/src/bugs/bug18/IncorrectResult__a2ff70c4-1ee9-4d8a-bf70-8e0c1be8a343/Incorrect_bug.py", line 36, in <module>
       tvm.testing.assert_allclose(pre_list[1].numpy(), after_list[1].numpy(),rtol=1e-5)
     File "/Scuzer/tvm_cov_patch/tvm/python/tvm/testing/utils.py", line 114, in assert_allclose
       np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol, verbose=True)
     File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1527, in assert_allclose
       assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
     File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 764, in assert_array_compare
       flagged = func_assert_same_pos(x, y, func=isnan, hasval='nan')
     File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 740, in func_assert_same_pos
       raise AssertionError(msg)
   AssertionError: 
   Not equal to tolerance rtol=1e-05, atol=1e-07
   
   x and y nan location mismatch:
    x: array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],
         dtype=float16)
    y: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float16)
   ```
   
   ### Environment
   
Operating System: Ubuntu 18.04, TVM version: tag 0.9.0 [d361585]
   
   ### Steps to reproduce
   
   ```
   import os
   import numpy as np
   import tvm
   from tvm import te, auto_scheduler, topi
   import tvm.testing
   
   TENSOR_0 = te.compute([14], lambda rck:te.max_value("float16")*te.min_value("uint16"), name ="TENSOR_1")
   TENSOR_1 = te.compute([11], lambda oco:te.max_value("uint16")*TENSOR_0[oco], name ="TENSOR_2")
   s = te.create_schedule(TENSOR_1.op)
   tensor_list = [TENSOR_0,TENSOR_1]
   
   dev = tvm.cpu(0)
   pre_list = []
   after_list = []
   for tensor in tensor_list:
       shape = [x.value if 'value' in dir(x) and isinstance(x.value, int) else 1 for x in tensor.shape]
       params = (5*np.random.uniform(size=shape)).astype(tensor.dtype)
       pre_list.append(tvm.nd.array(params.copy(), dev))
       after_list.append(tvm.nd.array(params.copy(), dev))
   
   pre_mod = tvm.lower(s, tensor_list, simple_mode=True)
   with tvm.transform.PassContext(opt_level=4):
       f = tvm.build(pre_mod)
   f(*pre_list)
   
   s[TENSOR_0].compute_inline()
   
   now_mod = tvm.lower(s, tensor_list, simple_mode=True)
   with tvm.transform.PassContext(opt_level=4):
       f = tvm.build(now_mod)
   f(*after_list)
   
   tvm.testing.assert_allclose(pre_list[1].numpy(), after_list[1].numpy(),rtol=1e-5)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] wrongtest-intellif commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

wrongtest-intellif commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1221273660

   Actually there is no `65535f16`; it should be `nan`, because it exceeds the maximum of fp16. It seems to be an issue in literal construction and constant folding.
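
To make the overflow concrete, here is a small NumPy check (an illustration, not TVM code; it assumes NumPy's `float16` follows the same IEEE 754 binary16 format as TVM's `float16`):

```python
import numpy as np

# The largest finite float16 value is 65504, so the literal 65535
# cannot be represented in float16 and rounds up to infinity.
print(np.finfo(np.float16).max)   # 65504.0
x = np.float16(65535)
print(np.isinf(x))                # True
```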




[GitHub] [tvm] cxx122 commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

cxx122 commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218918153

   Thanks. When I submitted this bug, I also suspected it might be due to this problem. It may not be a bug in the strict sense.




[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1221354644

   @wrongtest-intellif Good point. `65535f16` is actually `inf`, and `inf * 0` gives us `nan`. :-)
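
A quick sketch of that arithmetic (illustrative NumPy, not TVM code):

```python
import numpy as np

inf16 = np.float16(65535)   # overflows to inf in float16
zero16 = np.float16(0)
# IEEE 754: infinity times zero is an invalid operation and yields nan.
print(np.isnan(inf16 * zero16))   # True
```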




[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218601515

   Similarly, in many of the other bug reports, `opt_level=4` is specified, which enables fast-math optimization, so numerical "inconsistencies" are very likely when the computation is not well-formed.




[GitHub] [tvm] cxx122 closed issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

cxx122 closed issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
URL: https://github.com/apache/tvm/issues/12377




[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline

ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218581872

   @cxx122 Thanks for the report. It seems you are trying to compute `65535f16 * 0f16`, which returns `nan` as undefined behavior.
   
   ![image](https://user-images.githubusercontent.com/38074777/185257403-c8c87fbc-a0b5-4651-bef5-299db7bc30bf.png)
   
   Since the output is `nan`, and according to IEEE 754 `nan` is not comparable, I don't think it is suitable to regard this as an inconsistency bug: the computation itself is ill-formed and undefined. From a fuzzing perspective, IMO, these should be treated as false alarms, and the fuzzer should try to avoid synthesizing programs with undefined behavior.
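
The non-comparability point can be seen directly (illustrative NumPy; the traceback above shows that `tvm.testing.assert_allclose` forwards to `np.testing.assert_allclose`):

```python
import numpy as np

nan16 = np.float16("nan")
# IEEE 754: nan compares unequal to everything, including itself.
print(nan16 == nan16)   # False

# assert_allclose reports a nan-location mismatch before applying any
# tolerance, which is exactly the failure shown in the traceback.
try:
    np.testing.assert_allclose(
        np.array([np.nan], dtype=np.float16),
        np.array([0.0], dtype=np.float16),
    )
except AssertionError:
    print("nan location mismatch detected")
```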

