Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/26 21:29:12 UTC
[GitHub] [tvm] jwfromm opened a new pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
jwfromm opened a new pull request #8143:
URL: https://github.com/apache/tvm/pull/8143
There is a long-known issue where the source IRModule is sometimes mutated during alter_op_layout, which can cause errors during task extraction. One example model where the issue appears is [yolov3-tiny](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny-yolov3). An easy workaround is to make a copy of the input module before applying optimization passes. This PR adds that copy step to both autotvm and auto_scheduler. I'm not sure what tests to add since the bug is extremely difficult to pin down, but it does trigger with the yolo model linked above.
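The copy-before-optimize idea can be illustrated without TVM. The following is only a minimal sketch: `mutating_pass` and `extract_with_copy` are hypothetical stand-ins (a plain dict plays the role of the module), not TVM APIs, and the in-place rewrite merely imitates the kind of mutation described above.

```python
from copy import deepcopy


def mutating_pass(module):
    """Hypothetical stand-in for a pass that rewrites its input in place."""
    module["layout"] = "NCHW4c"  # side effect on the caller's object
    return {"optimized": True, **module}


def extract_with_copy(module):
    """Run the pass on a clone so the caller's module stays untouched."""
    module_clone = deepcopy(module)
    return mutating_pass(module_clone)


source = {"layout": "NCHW", "ops": ["conv2d"]}
result = extract_with_copy(source)
# `source` still holds its original layout; only the clone was rewritten.
```

The cost of this approach is one extra copy of the module per extraction, which is the trade-off the PR accepts until the underlying mutation is fixed.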
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [tvm] masahi merged pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
masahi merged pull request #8143:
URL: https://github.com/apache/tvm/pull/8143
[GitHub] [tvm] jwfromm commented on a change in pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#discussion_r640151629
##########
File path: python/tvm/auto_scheduler/relay_integration.py
##########
@@ -64,19 +65,27 @@ def call_all_topi_funcs(mod, params, target):
         disabled_pass={"AutoSchedulerLayoutRewrite"},
     ):
         try:
-            opt_mod, _ = relay.optimize(mod, target, params)
+            # TODO(jwfromm) Remove this once AlterOpLayout bug that mutates
+            # source module is fixed. Until then, create a clone.
+            mod_clone = deepcopy(mod)
Review comment:
I think we actually need to have both. The problem is that in the first try we attempt to apply `optimize`, which can mutate the source module. Then if that fails, we try to use `compiler.lower`, which again can mutate the source module.
[GitHub] [tvm] comaniac commented on a change in pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#discussion_r640157536
##########
File path: python/tvm/auto_scheduler/relay_integration.py
##########
@@ -64,19 +65,27 @@ def call_all_topi_funcs(mod, params, target):
         disabled_pass={"AutoSchedulerLayoutRewrite"},
     ):
         try:
-            opt_mod, _ = relay.optimize(mod, target, params)
+            # TODO(jwfromm) Remove this once AlterOpLayout bug that mutates
+            # source module is fixed. Until then, create a clone.
+            mod_clone = deepcopy(mod)
Review comment:
Ah, I see... that's what you meant by the source module being mutated. Yeah, this is definitely a bug to be fixed, and this is a reasonable workaround.
[GitHub] [tvm] comaniac commented on a change in pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#discussion_r640139721
##########
File path: python/tvm/autotvm/task/relay_integration.py
##########
@@ -53,18 +54,22 @@ def _lower(mod, target, params):
     # If failed to compile, then fallback to use VM compiler.
     # TODO: Currently VM compiler is likely to stack overflow for large models.
     try:
-        opt_mod, _ = relay.optimize(mod, target, params)
+        # TODO(jwfromm) Remove this once AlterOpLayout bug that mutates
+        # source module is fixed. Until then, create a clone.
+        mod_clone = deepcopy(mod)
Review comment:
ditto.
##########
File path: python/tvm/auto_scheduler/relay_integration.py
##########
@@ -64,19 +65,27 @@ def call_all_topi_funcs(mod, params, target):
         disabled_pass={"AutoSchedulerLayoutRewrite"},
     ):
         try:
-            opt_mod, _ = relay.optimize(mod, target, params)
+            # TODO(jwfromm) Remove this once AlterOpLayout bug that mutates
+            # source module is fixed. Until then, create a clone.
+            mod_clone = deepcopy(mod)
Review comment:
This line can be lifted out of the try-except block so that L79 can be simplified.
[GitHub] [tvm] jwfromm commented on a change in pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#discussion_r640151629
##########
File path: python/tvm/auto_scheduler/relay_integration.py
##########
@@ -64,19 +65,27 @@ def call_all_topi_funcs(mod, params, target):
         disabled_pass={"AutoSchedulerLayoutRewrite"},
     ):
         try:
-            opt_mod, _ = relay.optimize(mod, target, params)
+            # TODO(jwfromm) Remove this once AlterOpLayout bug that mutates
+            # source module is fixed. Until then, create a clone.
+            mod_clone = deepcopy(mod)
Review comment:
I think we actually need to have both. The problem is that in the first try we attempt to apply `optimize`, which can mutate the source module. Then if that fails, we try to use `compiler.lower`, which again can mutate the source module. If we tried to apply `compiler.lower` to `mod_clone` after `optimize` without a second copy, we could hit an error due to invalid shapes from alter_op_layout.
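The reasoning for taking two copies can be sketched in plain Python. This is an illustration under stated assumptions, not the PR's code: `primary_path` and `fallback_path` are hypothetical stand-ins for `optimize` and `compiler.lower`, and a dict stands in for the module.

```python
from copy import deepcopy


def primary_path(module):
    """Hypothetical stand-in: mutates its input, then fails."""
    module["shapes"] = "altered"
    raise RuntimeError("primary path failed")


def fallback_path(module):
    """Hypothetical stand-in: only accepts an unaltered module."""
    if module["shapes"] != "original":
        raise ValueError("invalid shapes")
    return "lowered via fallback"


def lower_reusing_one_clone(module):
    clone = deepcopy(module)
    try:
        return primary_path(clone)
    except RuntimeError:
        # The clone was already altered by the failed attempt.
        return fallback_path(clone)


def lower_with_two_clones(module):
    try:
        return primary_path(deepcopy(module))
    except RuntimeError:
        # A second, fresh copy of the untouched source avoids the problem.
        return fallback_path(deepcopy(module))


source = {"shapes": "original"}
outcome = lower_with_two_clones(source)

try:
    lower_reusing_one_clone(source)
    single_clone_error = None
except ValueError as err:
    single_clone_error = str(err)
```

Here the single-clone variant fails in the fallback because the first attempt left the clone in an altered state, while the two-clone variant succeeds and leaves `source` unchanged.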
[GitHub] [tvm] masahi commented on pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#issuecomment-854444855
I'm getting a strange error during task extraction after this commit. Something bad happens during `deepcopy`:
```
Traceback (most recent call last):
  File "/home/masa/anaconda3/envs/torch-1.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/masa/anaconda3/envs/torch-1.7/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/masa/projects/dev/tvm/python/tvm/auto_scheduler/relay_integration.py", line 79, in call_all_topi_funcs
    mod_clone = deepcopy(mod)
  File "/home/masa/anaconda3/envs/torch-1.7/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/masa/anaconda3/envs/torch-1.7/lib/python3.7/copy.py", line 283, in _reconstruct
    y.__setstate__(state)
  File "/home/masa/projects/dev/tvm/python/tvm/runtime/object.py", line 91, in __setstate__
    self.__init_handle_by_constructor__(_ffi_node_api.LoadJSON, handle)
  File "/home/masa/projects/dev/tvm/python/tvm/_ffi/_ctypes/object.py", line 136, in __init_handle_by_constructor__
    handle = __init_by_constructor__(fconstructor, args)
  File "/home/masa/projects/dev/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 260, in __init_handle_by_constructor__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  5: TVMFuncCall
  4: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::TypedPackedFunc<tvm::runtime::ObjectRef (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)>::AssignTypedLambda<tvm::runtime::ObjectRef (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)>(tvm::runtime::ObjectRef (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
  3: tvm::LoadJSON(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
  2: tvm::ReflectionVTable::VisitAttrs(tvm::runtime::Object*, tvm::AttrVisitor*) const
  1: tvm::FieldDependencyFinder::Visit(char const*, tvm::runtime::ObjectRef*)
  0: void tvm::FieldDependencyFinder::ParseValue<unsigned long>(char const*, unsigned long*) const
  File "../src/node/serialization.cc", line 291
JSONReader: cannot find field axis
```
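The Python frames in the traceback show `deepcopy` rebuilding the object through `__setstate__`, which in turn loads a JSON handle. The sketch below only illustrates that mechanism and why a load failure surfaces inside `deepcopy`; `JsonBackedNode` is a hypothetical class, not a TVM type, and it makes no claim about the actual cause of the error above.

```python
import copy
import json


class JsonBackedNode:
    """Hypothetical object whose state round-trips through JSON."""

    def __init__(self, fields):
        self.fields = dict(fields)

    def __getstate__(self):
        # Serialize to a JSON string, called by deepcopy via __reduce_ex__.
        return {"handle": json.dumps(self.fields)}

    def __setstate__(self, state):
        # Rebuild from JSON; any loader error propagates out of deepcopy.
        loaded = json.loads(state["handle"])
        if "axis" not in loaded:
            raise ValueError("cannot find field axis")
        self.fields = loaded


ok = JsonBackedNode({"axis": 1})
clone = copy.deepcopy(ok)  # serialize, then deserialize into a new object

bad = JsonBackedNode({"size": 3})
try:
    copy.deepcopy(bad)
    message = None
except ValueError as err:
    message = str(err)
```

In this sketch the copy succeeds only when the serialized form can be loaded back, so a `deepcopy`-based workaround depends on the object's save and load paths agreeing with each other.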
[GitHub] [tvm] jwfromm commented on pull request #8143: [AutoTVM][AutoScheduler] Add workaround to alter op layout bug in task extraction.
Posted by GitBox <gi...@apache.org>.
jwfromm commented on pull request #8143:
URL: https://github.com/apache/tvm/pull/8143#issuecomment-850541353
AlterOpLayout is applied during task extraction so it definitely has this bug. Try autoscheduling the linked yolo model and you'll encounter it without this fix.