Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/08/11 08:13:03 UTC
[GitHub] [tvm] cxx122 opened a new issue, #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
cxx122 opened a new issue, #12377:
URL: https://github.com/apache/tvm/issues/12377
```
TENSOR_0 = te.compute([14], lambda rck: te.max_value("float16") * te.min_value("uint16"), name="TENSOR_1")
TENSOR_1 = te.compute([11], lambda oco: te.max_value("uint16") * TENSOR_0[oco], name="TENSOR_2")
```
The tir program before compute_inline:
```
@main = primfn(TENSOR_1_1: handle, TENSOR_2_1: handle) -> ()
  attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
  buffers = {TENSOR_1: Buffer(TENSOR_1_2: Pointer(float16), float16, [14], []),
             TENSOR_2: Buffer(TENSOR_2_2: Pointer(float16), float16, [11], [])}
  buffer_map = {TENSOR_1_1: TENSOR_1, TENSOR_2_1: TENSOR_2}
  preflattened_buffer_map = {TENSOR_1_1: TENSOR_1_3: Buffer(TENSOR_1_2, float16, [14], []), TENSOR_2_1: TENSOR_2_3: Buffer(TENSOR_2_2, float16, [11], [])} {
  for (rck: int32, 0, 11) {
    TENSOR_1[rck] = 0f16
  }
  for (oco: int32, 0, 11) {
    TENSOR_2[oco] = (65535f16*TENSOR_1[oco])
  }
}
```
The tir program after compute_inline:
```
@main = primfn(TENSOR_1_1: handle, TENSOR_2_1: handle) -> ()
  attr = {"from_legacy_te_schedule": True, "global_symbol": "main", "tir.noalias": True}
  buffers = {TENSOR_1: Buffer(TENSOR_1_2: Pointer(float16), float16, [14], []),
             TENSOR_2: Buffer(TENSOR_2_2: Pointer(float16), float16, [11], [])}
  buffer_map = {TENSOR_1_1: TENSOR_1, TENSOR_2_1: TENSOR_2}
  preflattened_buffer_map = {TENSOR_1_1: TENSOR_1_3: Buffer(TENSOR_1_2, float16, [14], []), TENSOR_2_1: TENSOR_2_3: Buffer(TENSOR_2_2, float16, [11], [])} {
  for (oco: int32, 0, 11) {
    TENSOR_2[oco] = 0f16
  }
}
```
### Actual behavior
```
Traceback (most recent call last):
  File "/Scuzer/src/bugs/bug18/IncorrectResult__a2ff70c4-1ee9-4d8a-bf70-8e0c1be8a343/Incorrect_bug.py", line 36, in <module>
    tvm.testing.assert_allclose(pre_list[1].numpy(), after_list[1].numpy(), rtol=1e-5)
  File "/Scuzer/tvm_cov_patch/tvm/python/tvm/testing/utils.py", line 114, in assert_allclose
    np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol, verbose=True)
  File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1527, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 764, in assert_array_compare
    flagged = func_assert_same_pos(x, y, func=isnan, hasval='nan')
  File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 740, in func_assert_same_pos
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1e-07
x and y nan location mismatch:
 x: array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],
      dtype=float16)
 y: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float16)
```
### Environment
Operating System: Ubuntu 18.04; TVM version: tag 0.9.0 [d361585]
### Steps to reproduce
```
import os
import numpy as np
import tvm
from tvm import te, auto_scheduler, topi
import tvm.testing
TENSOR_0 = te.compute([14], lambda rck: te.max_value("float16") * te.min_value("uint16"), name="TENSOR_1")
TENSOR_1 = te.compute([11], lambda oco: te.max_value("uint16") * TENSOR_0[oco], name="TENSOR_2")
s = te.create_schedule(TENSOR_1.op)
tensor_list = [TENSOR_0, TENSOR_1]
dev = tvm.cpu(0)
pre_list = []
after_list = []
for tensor in tensor_list:
    shape = [x.value if 'value' in dir(x) and isinstance(x.value, int) else 1 for x in tensor.shape]
    params = (5 * np.random.uniform(size=shape)).astype(tensor.dtype)
    pre_list.append(tvm.nd.array(params.copy(), dev))
    after_list.append(tvm.nd.array(params.copy(), dev))
pre_mod = tvm.lower(s, tensor_list, simple_mode=True)
with tvm.transform.PassContext(opt_level=4):
    f = tvm.build(pre_mod)
f(*pre_list)
s[TENSOR_0].compute_inline()
now_mod = tvm.lower(s, tensor_list, simple_mode=True)
with tvm.transform.PassContext(opt_level=4):
    f = tvm.build(now_mod)
f(*after_list)
tvm.testing.assert_allclose(pre_list[1].numpy(), after_list[1].numpy(), rtol=1e-5)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] wrongtest-intellif commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
wrongtest-intellif commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1221273660
Actually there is no `65535f16`; it should be `nan` because 65535 exceeds the maximum of fp16. It seems to be an issue in literal construction and constant folding.
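The overflow can be checked directly in NumPy, independent of TVM (a minimal sketch of the point above, assuming only `numpy` is installed): 65535 is larger than the largest finite float16 value (65504), so constructing it as a float16 literal overflows to `inf`.

```python
import numpy as np
import warnings

# 65535 exceeds the largest finite float16 value (65504),
# so the conversion to float16 overflows to inf.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some NumPy versions warn on the overflowing cast
    x = np.float16(65535)

print(np.isinf(x))                 # the literal is inf, not 65535
print(np.finfo(np.float16).max)    # largest finite float16 value
```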
[GitHub] [tvm] cxx122 commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
cxx122 commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218918153
Thanks. When I submitted this bug I also suspected it might be caused by this problem. This may not be a bug in the strict sense.
[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1221354644
@wrongtest-intellif Good point. `65535f16` is actually `inf`, but `inf * 0` gives us `nan`. :-)
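Both steps of this argument can be reproduced with NumPy float16 scalars (a standalone sketch, not TVM code): the overflowed literal is `inf`, and IEEE 754 defines `inf * 0` to produce `nan`.

```python
import numpy as np
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # the overflowing cast may warn
    inf16 = np.float16(65535)         # overflows to inf (float16 max is 65504)

zero16 = np.float16(0)
with np.errstate(invalid="ignore"):   # silence the IEEE "invalid operation" warning
    result = inf16 * zero16           # IEEE 754: inf * 0 -> nan

print(np.isinf(inf16))
print(np.isnan(result))
```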
[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218601515
As in many of the other bug reports, `opt_level=4` is specified, which enables fast-math style optimization, so numerical "inconsistencies" like this are to be expected when the computation itself is not well-formed.
[GitHub] [tvm] cxx122 closed issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
cxx122 closed issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
URL: https://github.com/apache/tvm/issues/12377
[GitHub] [tvm] ganler commented on issue #12377: [Bug] Inconsistency caused by 65535f16*0f16 after using compute_inline
ganler commented on issue #12377:
URL: https://github.com/apache/tvm/issues/12377#issuecomment-1218581872
@cxx122 Thanks for the report. It seems you are trying to compute `65535f16 * 0f16`, which returns `nan` as the result of an ill-defined computation.
![image](https://user-images.githubusercontent.com/38074777/185257403-c8c87fbc-a0b5-4651-bef5-299db7bc30bf.png)
Since its output is `nan`, and according to IEEE 754 `nan` is not comparable, I don't think it is suitable to regard this as an inconsistency bug: the computation itself is ill-formed and undefined. From a fuzzing perspective, IMO, these should be regarded as false alarms, and the algorithm should try to avoid synthesizing programs with undefined behavior.
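The incomparability of `nan` can be illustrated in plain Python (a minimal sketch, independent of TVM): `nan` compares unequal to everything, including itself, which is why `assert_allclose` reports a nan-location mismatch rather than a value difference.

```python
import math

nan = math.inf * 0.0           # IEEE 754: inf * 0 yields nan
print(math.isnan(nan))         # nan is detected only via isnan
print(nan == nan)              # False: nan is unequal even to itself
print(nan < 0.0, nan > 0.0)    # False False: ordering comparisons also fail
```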