Posted to discuss-archive@tvm.apache.org by Julien Posso via TVM Discuss <no...@discuss.tvm.ai> on 2020/08/27 15:24:18 UTC

[TVM Discuss] [Questions] [VTA] Inference questions


Hi, as a new user I have some questions about using VTA in simulation and RPC server mode:

1. Are fully connected layers (and non-quantized convolutional layers) executed by the target CPU (the board's ARM CPU) or by the host CPU (my computer's x86 CPU)?

2. What exactly is measured when running VTA in tsim with the timer() function: only the part offloaded to VTA, or also the layers executed by the target ARM CPU? This relates to question 1.

3. The value returned by the timer() function when I run the MXNet tutorial (https://tvm.apache.org/docs/vta/tutorials/frontend/deploy_classification.html#sphx-glr-vta-tutorials-frontend-deploy-classification-py) in tsim is about 90 seconds! Why is it so far from the results in the publication?

4. How should I interpret the simulation statistics in tsim (cycle_count) and in fsim (inp_load_nbytes, etc.)?

5. Is it possible to measure execution time layer by layer, to identify a bottleneck in the neural network?

Thanks in advance :smiley:





---
[Visit Topic](https://discuss.tvm.ai/t/vta-inference-questions/7738/1) to respond.
