Posted to discuss-archive@tvm.apache.org by Thierry via TVM Discuss <no...@discuss.tvm.ai> on 2019/11/07 18:34:34 UTC

[TVM Discuss] [Questions] VTA Conv2d Optimization Schedule and Optimal Throughput


@joyliu37 thanks for looking into this. There are at the moment two VTA design sources. The first is the initial design (used in the TVM and VTA papers), which was generated with HLS - this is the design that one can test and deploy on the Pynq/Ultra96 boards to run workloads like ResNet-18. We've also run tuning on this design to get close to "compute bound" performance on the device (as shown by the roofline plots in the TVM paper). The reason it's not 100% compute bound is that the GEMM and ALU share the same task-level pipeline stage.
The second design (which is specified in Chisel and supports cycle-accurate simulation) is a new addition and is under development/refinement.
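
To make the point about the shared pipeline stage concrete, here is a rough sketch of why GEMM utilization stays below 100% when GEMM and ALU work serialize in one stage. The cycle counts are made-up placeholders, not VTA measurements, and the model assumes the two kinds of work never overlap:

```python
def gemm_utilization(gemm_cycles, alu_cycles):
    """Fraction of compute-stage time spent on GEMM.

    Assumes GEMM and ALU instructions occupy the same pipeline stage
    and therefore run one after the other (no overlap).
    """
    total = gemm_cycles + alu_cycles
    if total <= 0:
        raise ValueError("total cycle count must be positive")
    return gemm_cycles / total


# Hypothetical cycle counts for one layer (not measured on hardware).
print(gemm_utilization(gemm_cycles=900, alu_cycles=100))
```

Under these assumptions, any cycles spent on ALU work (e.g. activation or shift/clip steps) come directly out of the time available to the GEMM core.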

Finally, on TSIM not modeling DRAM bandwidth: we will still be bandwidth limited due to port width. TSIM might not incorporate a latency model, but it should throttle DRAM access according to the memory interface width.
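
As a back-of-the-envelope illustration of the port-width limit, the usual roofline bound can be sketched as below. All numbers (port width, clock, peak throughput, ops per byte) are hypothetical and chosen only for the example; the sketch also assumes one full-width transfer per cycle and no latency, which is an idealization rather than a description of how TSIM works:

```python
def peak_bandwidth_bytes_per_s(port_width_bits, clock_hz):
    """Upper bound on DRAM bandwidth, assuming one full-width
    transfer per clock cycle and no latency stalls."""
    return port_width_bits / 8 * clock_hz


def attainable_ops_per_s(peak_ops_per_s, bandwidth_bytes_per_s, ops_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_ops_per_s, bandwidth_bytes_per_s * ops_per_byte)


# Hypothetical configuration, not actual VTA parameters.
bw = peak_bandwidth_bytes_per_s(port_width_bits=64, clock_hz=100e6)
peak = 50e9  # assumed peak compute throughput in ops/s

for intensity in (10, 100):  # ops performed per byte moved from DRAM
    bound = attainable_ops_per_s(peak, bw, intensity)
    kind = "compute bound" if bound == peak else "bandwidth bound"
    print(f"{intensity} ops/byte -> {bound:.3g} ops/s ({kind})")
```

With these placeholder numbers, a schedule with low data reuse is capped by the port width, while one with enough reuse reaches the compute roof.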




