Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/12/08 21:52:17 UTC

[GitHub] [tvm] jwfromm opened a new pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

jwfromm opened a new pull request #7063:
URL: https://github.com/apache/tvm/pull/7063


   The recent addition of tensorcore schedules has broken TVM's ability to compile for cuda on a machine without a GPU. This is because the strategy registration for tensorcores calls `tvm.gpu(0).compute_version`, which fails when no GPU is present. I've changed the behavior of `nvcc.have_tensorcore` to check `AutotvmGlobalScope.current.cuda_target_arch` when a GPU isn't present. This allows a user to call something like `tvm.autotvm.measure.measure_methods.set_cuda_target_arch("sm_62")` to specify a cuda cross-compilation target on a machine without a GPU and build correctly.
   
   I'm not sure how to test this, since it would require a CPU-only node with the cuda toolkit installed. Let me know if you have an opinion on tests to add to prevent an error like this from sneaking in again.
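
    For reference, a minimal sketch of the cross-compilation flow this enables (assuming a Relay module `mod` and its `params` are already defined, and a machine that has the cuda toolkit but no GPU attached):

    ```python
    import tvm
    from tvm import relay
    from tvm.autotvm.measure.measure_methods import set_cuda_target_arch

    # Tell TVM which CUDA architecture to target, since no GPU can be queried.
    set_cuda_target_arch("sm_62")

    # have_tensorcore now falls back to the arch set above instead of tvm.gpu(0).
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target="cuda", params=params)
    ```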


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] jwfromm commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r538845138



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       Those are both good points, thanks @comaniac. I'll take a look at PassContext and see if it can be integrated there instead.




[GitHub] [tvm] comaniac commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r538921161



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       I was expecting something like:
   
   ```python
   with tvm.transform.PassContext(opt_level=3, config={"relay.backend.cuda_target_arch": "sm_80"}):
       lib = relay.build(mod, target=target, params=params)
   ```
   
    And in `nvcc.py`:
   
   ```python
   # Here "sm_75" is the default value in case users didn't provide this config.
   cuda_target_arch = PassContext.current().config.get("relay.backend.cuda_target_arch", "sm_75")
    major, minor = cuda_target_arch.split("_")[1]  # e.g. "sm_75" -> "75", which unpacks to major="7", minor="5"
   compute_version = major + "." + minor
   ```
   
   Does that make sense to you?
   If so, you may refer to a recent PR I sent, which adds "relay.backend.use_auto_scheduler" to PassContext: https://github.com/apache/tvm/pull/6903




[GitHub] [tvm] jwfromm commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r539661492



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       I've changed `have_tensorcore` to instead extract the architecture from a target. Can you take another look and let me know what you think? One downside is that `have_tensorcore` is used in places that don't have access to the target object, such as topi functions, so we have to support both the old way of getting the compute version from `tvm.gpu...` and the new target-based path.
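
    For illustration, call sites that do have a target in scope could then look roughly like this (a sketch assuming the new optional `target` argument; the arch value is arbitrary):

    ```python
    from tvm.target import Target
    from tvm.contrib import nvcc

    # On a machine without a GPU, pass the compilation target so the arch can be
    # read from its "-arch=sm_xx" attribute instead of querying a physical device.
    target = Target("cuda -arch=sm_70")
    if nvcc.have_tensorcore(target=target):
        print("TensorCore schedules enabled")
    ```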




[GitHub] [tvm] jwfromm commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r538915180



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       Just to be clear, would you want to set `cuda_target_arch` as part of a `PassContext` config, or is there a different approach you were thinking of? When I investigated, it wasn't quite as clear a fit as I thought.




[GitHub] [tvm] jwfromm commented on pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#issuecomment-741092239


   @anwang2009 @tqchen @adelbertc Can you guys take a look at this PR?


[GitHub] [tvm] comaniac commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r539516443



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       Hmm, this is a good point. Putting the CUDA target arch in the target definitely makes more sense. Then the solution becomes:
   
   ```python
   target = Target("cuda -arch=sm_80")
   ...
   cuda_target_arch = target.attrs["arch"] if "arch" in target.attrs else "sm_75"
   ...
   ```
   
    I found that the CUDA target already has this attribute, so the above solution actually works now.
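
    For completeness, a rough sketch of turning that arch attribute into the "major.minor" compute version string `have_tensorcore` expects (illustrative only; assumes the usual two-digit "sm_xx" naming and the "sm_75" default above):

    ```python
    from tvm.target import Target

    target = Target("cuda -arch=sm_80")
    cuda_target_arch = target.attrs["arch"] if "arch" in target.attrs else "sm_75"

    # "sm_80" -> "80" -> "8.0"
    digits = cuda_target_arch.split("_")[1]
    compute_version = digits[0] + "." + digits[1:]
    ```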




[GitHub] [tvm] jwfromm commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
jwfromm commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r539509778



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
       I thought about this some more overnight and I'm not sure adding it to PassContext makes any more sense than having it in `AutotvmGlobalScope`. We really should be specifying this information as part of a `tvm.Target`, since the cuda architecture purely describes the hardware target and doesn't really relate to relay passes or autotvm directly. I think this should be done once we move further from string-based targets to objects like those introduced in #6218. I'd argue that for now, applying a band-aid fix to `AutotvmGlobalScope`, as in the current PR, is the best way to temporarily solve the problem. What do you think @comaniac? I'd also be interested in hearing what @tqchen thinks.




[GitHub] [tvm] comaniac commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r539685119



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +269,34 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None, target=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
-    compute_version : str
-        compute capability of a GPU (e.g. "7.0")
+    compute_version : str, optional
+        compute capability of a GPU (e.g. "7.0").
+
+    target : tvm.target.Target, optional
+        The compilation target, will be used to determine arch if compute_version
+        isn't specified.
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            if target is None or "arch" not in target.attrs:
+                warnings.warn(
+                    "Cannot find cuda architecture, try specifying it by adding '-arch=sm_xx'"
+                    "to your target. Tensorcore schedules will be disabled."

Review comment:
       ```suggestion
                       "Tensorcore will be disabled due to no CUDA architecture specified.
                       Try specifying it by adding '-arch=sm_xx' to your target."
   ```




[GitHub] [tvm] comaniac commented on a change in pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r538840025



##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
     return False
 
 
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
     """Either TensorCore support is provided in the compute capability or not
 
     Parameters
     ----------
     compute_version : str
         compute capability of a GPU (e.g. "7.0")
     """
+    if compute_version is None:
+        if tvm.gpu(0).exist:
+            compute_version = tvm.gpu(0).compute_version
+        else:
+            compute_version = AutotvmGlobalScope.current.cuda_target_arch

Review comment:
    - It seems to me that we should move this config to PassContext, because it affects how the Relay op strategy selects the implementation in general and is not specific to AutoTVM.
    - We should properly handle the case where `cuda_target_arch` is `None`, e.g. by printing a warning saying that we will not consider TensorCore due to the missing information (rough sketch below).
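
    Something along these lines, perhaps (a rough sketch of the fallback only; the helper name is just for illustration):

    ```python
    import warnings

    def compute_version_from_arch(cuda_target_arch):
        """Hypothetical helper: map e.g. "sm_75" -> "7.5", or None if unknown."""
        if cuda_target_arch is None:
            warnings.warn(
                "No CUDA architecture specified; TensorCore schedules will be disabled."
            )
            return None
        digits = cuda_target_arch.split("_")[1]
        return digits[0] + "." + digits[1:]
    ```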




[GitHub] [tvm] comaniac merged pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac merged pull request #7063:
URL: https://github.com/apache/tvm/pull/7063


   


[GitHub] [tvm] comaniac commented on pull request #7063: [Relay][Strategy] Allow cuda cross compilation without physical device.

Posted by GitBox <gi...@apache.org>.
comaniac commented on pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#issuecomment-742638947


   Thanks @jwfromm @anwang2009.

