Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/19 12:39:04 UTC

[GitHub] [tvm] pfk-beta opened a new issue, #11058: [Bug] tuning cannot send task but tracker - device connection is ok

pfk-beta opened a new issue, #11058:
URL: https://github.com/apache/tvm/issues/11058

   Hello,
   
   I think I have a working RPC connection between my computer and my Android device:
   1. I compiled the `tvm_rpc` executable using Docker.android_demo with a modification: I added the OpenCL headers there and compiled the binary to use them.
   2. I copied the executable to the device, along with the `libc++_shared.so` and `libtvm_runtime.so` libraries.
   3. Running `adb -s HQUCI76PJJ7TFYLZ shell "LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/tvm_rpc server --host=192.168.1.122 --port=5000 --tracker=192.168.1.135:9190 --key=prague --work-dir=/data/local/tmp/"` appears to work correctly. I see the following output:
   
   ```
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:96: host        = 192.168.1.122
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:97: port        = 5000
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:98: port_end    = 9099
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:99: tracker     = ('192.168.1.135', 9190)
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:100: key         = prague
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:101: custom_addr = 
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:102: work_dir    = /data/local/tmp/
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:103: silent      = False
   [14:28:48] /workspace/apps/cpp_rpc/main.cc:264: Starting CPP Server, Press Ctrl+C to stop.
   [14:28:48] /workspace/apps/cpp_rpc/rpc_server.cc:130: bind to 192.168.1.122:5000
   [14:28:48] /workspace/apps/cpp_rpc/rpc_tracker_client.h:201: Tracker connecting to 192.168.1.135:9190
   ```
   4. I compiled the runtime on my computer with RPC and LLVM enabled. When I run rpc_tracker and query it, I see the following information (I think it is working correctly):
   ```
   Tracker address 192.168.1.135:9190
   
   Server List
   ----------------------------
   server-address  key
   ----------------------------
   192.168.1.210:5000      server:prague
   192.168.1.122:5000      server:prague
   ```
   ```
   python3 -m tvm.exec.rpc_tracker --port 9190
   INFO:RPCTracker:bind to 0.0.0.0:9190
   ```
   5. When I run the tuning script, `tvm_rpc` prints a short message: `[14:23:59] /workspace/apps/cpp_rpc/rpc_server.cc:300: Connection success 192.168.1.135:60674`. Then I get the following stack trace from the tuning script:
   ```
   File "/home/piotr/projects/odai-meta/odai-tvm/docker/tcl/tcl_scripts/tune.py", line 152, in <module>
       main(args)
     File "/home/piotr/projects/odai-meta/odai-tvm/docker/tcl/tcl_scripts/tune.py", line 99, in main
       tune_model(args, mod, params, target)
     File "/home/piotr/projects/odai-meta/odai-tvm/docker/tcl/tcl_scripts/tune.py", line 77, in tune_model
       task_tuner.tune(
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/autotvm/tuner/xgboost_tuner.py", line 105, in tune
       super(XGBTuner, self).tune(*args, **kwargs)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/autotvm/tuner/tuner.py", line 112, in tune
       measure_batch = create_measure_batch(self.task, measure_option)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/autotvm/measure/measure.py", line 282, in create_measure_batch
       attach_objects = runner.set_task(task)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/autotvm/measure/measure_methods.py", line 291, in set_task
       raise RuntimeError(
   RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.
   ``` 
   (But in another terminal I am running the query under `watch`, and it is working correctly.)
   
   6. After one minute, `tvm_rpc` prints the following message:
   ```
   [14:24:59] /workspace/apps/cpp_rpc/rpc_server.cc:198: Child pid=12547 killed (timeout = 60), Process status = 15
   [14:24:59] /workspace/apps/cpp_rpc/rpc_server.cc:229: Socket Connection Closed
   ```
   and the tuning script exits with the following stack trace:
   ```
   Exception in thread Thread-2:
   Traceback (most recent call last):
     File "/usr/lib/python3.9/threading.py", line 973, in _bootstrap_inner
       self.run()
     File "/usr/lib/python3.9/threading.py", line 910, in run
       self._target(*self._args, **self._kwargs)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/autotvm/measure/measure_methods.py", line 769, in _check
       while not dev.exist:  # wait until we get an available device
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/_ffi/runtime_ctypes.py", line 262, in exist
       return self._GetDeviceAttr(self.device_type, self.device_id, 0) != 0
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/_ffi/runtime_ctypes.py", line 246, in _GetDeviceAttr
       return tvm.runtime._ffi_api.GetDeviceAttr(device_type, device_id, attr_id)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
       raise get_last_ffi_error()
   tvm._ffi.base.TVMError: Traceback (most recent call last):
     4: TVMFuncCall
     3: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), __mk_TVM1::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
     2: tvm::runtime::RPCDeviceAPI::GetAttr(DLDevice, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue*)
     1: non-virtual thunk to tvm::runtime::RPCClientSession::GetAttr(DLDevice, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue*)
     0: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::RPCEndpoint::Init()::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#2}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
     File "/home/piotr/projects/odai-meta/odai-tvm/tvm/src/runtime/rpc/rpc_endpoint.cc", line 681
   TVMError: 
   ---------------------------------------------------------------
   An error occurred during the execution of TVM.
   For more information, please see: https://tvm.apache.org/docs/errors.html
   ---------------------------------------------------------------
     Check failed: (code == RPCCode::kReturn) is false: code=1
    Done.
   ```
   
   7. For both building `tvm_rpc` and running the tuning script, I am using the source code from the `v0.9.dev0` git tag. Is that OK?
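
For readers hitting the same error, the server list shown in step 4 can be checked from a script rather than by eye. The following is a minimal sketch, assuming the query output has been captured as text in the same layout as above; the helper name `servers_for_key` is hypothetical and is not part of TVM. It only reports which addresses are registered under a key. It does not tell whether those devices are free, which is what the error message in step 5 asks about.

```python
def servers_for_key(summary, key):
    """Return the addresses listed under 'server:<key>' in the tracker summary text."""
    wanted = f"server:{key}"
    addresses = []
    for line in summary.splitlines():
        parts = line.split()
        # Server rows have exactly two columns: "<ip>:<port>" and "server:<key>".
        if len(parts) == 2 and parts[1] == wanted:
            addresses.append(parts[0])
    return addresses


# Summary text copied from step 4 of the report.
SUMMARY = """\
Tracker address 192.168.1.135:9190

Server List
----------------------------
server-address  key
----------------------------
192.168.1.210:5000      server:prague
192.168.1.122:5000      server:prague
"""

print(servers_for_key(SUMMARY, "prague"))
```

With the listing from step 4, this reports two addresses under the key `prague`: 192.168.1.210:5000 and 192.168.1.122:5000.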


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Re: [I] [Bug] tuning cannot send task but tracker - device connection is ok [tvm]

Posted by "pfk-beta (via GitHub)" <gi...@apache.org>.

pfk-beta closed issue #11058: [Bug] tuning cannot send task but tracker - device connection is ok
URL: https://github.com/apache/tvm/issues/11058


[GitHub] [tvm] pfk-beta commented on issue #11058: [Bug] tuning cannot send task but tracker - device connection is ok

Posted by GitBox <gi...@apache.org>.

pfk-beta commented on issue #11058:
URL: https://github.com/apache/tvm/issues/11058#issuecomment-1102600200

   tuning script:
   ```
   import os
   import sys
   import argparse
   
   import numpy as np
   from tcl_scripts.common import load_model
   
   import tvm
   from tvm import relay, autotvm
   import tvm.relay.testing
   from tvm.autotvm.tuner import XGBTuner, GATuner, RandomTuner, GridSearchTuner
   from tvm.contrib.utils import tempdir
   import tvm.contrib.graph_executor as runtime
   from tvm.contrib import ndk
   
   if not sys.warnoptions:
       import warnings
       warnings.simplefilter("ignore") # Change the filter in this process
       os.environ["PYTHONWARNINGS"] = "ignore" # Also affect subprocesses
   
   
   def create_tuning_log_name(args):
       target = args.target.replace(' ', '_')
   
       return f"{args.model_name}__{args.rpc_key}__{target}__{args.n_trials}.log"
   
   
   def get_task_tuner(task, tuner_name):
       if tuner_name == "xgb" or tuner_name == "xgb-rank":
           tuner = XGBTuner(task, loss_type="rank")
       elif tuner_name == "ga":
           tuner = GATuner(task, pop_size=50)
       elif tuner_name == "random":
           tuner = RandomTuner(task)
       elif tuner_name == "gridsearch":
           tuner = GridSearchTuner(task)
       else:
           raise ValueError("Invalid tuner: " + tuner_name)
       return tuner
   
   
   def tune_model(args, mod, params, target):
       measure_option = autotvm.measure_option(
           builder=autotvm.LocalBuilder(build_func="ndk"),
           runner=autotvm.RPCRunner(
               args.rpc_key,
               host=args.rpc_tracker,
               port=args.rpc_port,
               number=args.runner_number,
               repeat=args.runner_repeat,
               timeout=60,
           )
       )
       
       tasks = autotvm.task.extract_from_program(
           mod["main"],
           target=target,
           params=params,
           ops=(relay.op.get("nn.conv2d"),),
       ) # TODO: parametrize ops which will be tuned
   
       if not tasks:
           print("No tasks...")
   
       task_log_filename = None
       for i, task in enumerate(reversed(tasks)):
           prefix = "[Task %2d/%2d] " % (i + 1, len(tasks))
           task_log_filename = args.output_tuning_log + ".tmp"
   
           task_tuner = get_task_tuner(task, "xgb")
   
           if os.path.isfile(task_log_filename):
               task_tuner.load_history(
                   autotvm.record.load_from_file(task_log_filename))
   
           n_trials = min(args.n_trials, len(task.config_space))
           task_tuner.tune(
               n_trial=n_trials,
               early_stopping=args.early_stopping,
               measure_option=measure_option,
               callbacks=[
                   autotvm.callback.progress_bar(n_trials, prefix=prefix),
                   autotvm.callback.log_to_file(task_log_filename),
               ],
           )
   
       if task_log_filename:
           # pick best records to a cache file
           autotvm.record.pick_best(task_log_filename, args.output_tuning_log)
           os.remove(task_log_filename)
   
   
   
   def main(args):
       target = tvm.target.Target(args.target, host=args.target_host)
   
       mod, params = load_model(args)
   
       tune_model(args, mod, params, target)
   
   
   if __name__ == "__main__":
       parser = argparse.ArgumentParser()
   
       parser.add_argument('--model_name', required=True,
           help="How do you name this model? "
           "This value is used for generating name, "
           "if output_tuning_log is not specified. No spaces.")
       parser.add_argument('--input_model', required=True,
           help="Fullpath to model")
       parser.add_argument('--input_name', required=True,
           help="Name of input node")
       parser.add_argument('--input_shape', required=True,
           help="Shape of input node, comma-separated, no spaces.")
       parser.add_argument('--input_dtype',
           default="float32",
           help="Dtype of input node")
   
       parser.add_argument('--rpc_tracker', 
           required=True,
           help="IP address of RPC tracker")
       parser.add_argument('--rpc_port', 
           type=int, required=True,
           help="IP port of RPC tracker")
       parser.add_argument('--rpc_key', 
           required=True,
           help="Key of RPC tracker")
   
       parser.add_argument('--output_tuning_log', 
           default=None,
           help="Where to save tuning output to be used for benchmark.")
       parser.add_argument('--runner_number', type=int, default=4, 
           help="Number of separate benchmark runs")
       parser.add_argument('--runner_repeat', type=int, default=3,
           help="Number of inferences in one run")
       parser.add_argument('--n_trials', 
           type=int, default=10, 
           help="Number of trials. Must be larger than 1. Typically 2000...")
       parser.add_argument('--early_stopping', 
           type=int, default=400,
           help='Early stopping for tuning. Ignore when no tuning.')
   
       parser.add_argument('--target', default="opencl", help="")
       parser.add_argument('--target_host',
           default="llvm -mtriple=aarch64-linux-gnu", help="")
   
       args = parser.parse_args()
       args.input_shape = eval(args.input_shape)
       if not args.output_tuning_log:
           args.output_tuning_log = create_tuning_log_name(args)
   
       main(args)
   ```
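
The script above turns `--input_shape` into a tuple with `eval`, which will execute whatever expression is passed on the command line. A minimal sketch of a narrower alternative follows; the helper name `parse_shape` is hypothetical and does not appear in the original script. It accepts the comma-separated form described in the help text, and also tolerates surrounding brackets and spaces.

```python
def parse_shape(text):
    """Parse a shape such as '1,3,224,224' into a tuple of positive ints."""
    # Drop optional surrounding brackets and spaces, then split on commas.
    body = text.strip("()[] ")
    dims = tuple(int(part) for part in body.split(",") if part.strip())
    if not dims or any(d <= 0 for d in dims):
        raise ValueError(f"invalid shape: {text!r}")
    return dims


print(parse_shape("1,3,224,224"))
```

Because the function raises `ValueError` on bad input, it could also be passed as `type=parse_shape` to `parser.add_argument`, so that argparse reports a malformed shape as a usage error.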


Re: [I] [Bug] tuning cannot send task but tracker - device connection is ok [tvm]

Posted by "pfk-beta (via GitHub)" <gi...@apache.org>.

pfk-beta commented on issue #11058:
URL: https://github.com/apache/tvm/issues/11058#issuecomment-1883793631

   I'm pretty sure it was my fault, because I didn't know TVM well. I'm closing it, because my current setup is working.

