Posted to user@impala.apache.org by Sandaruwan Kumarasingha <sa...@cse.mrt.ac.lk> on 2022/05/13 12:45:51 UTC
Requesting Solutions to Improve Impala Performance with a huge Kudu Data Load
Hi Team,
Our team is working on a huge data load in Kudu and we are currently facing
a performance issue. We are hoping you can guide us toward a solution to the
concerns described below.
We have 212 million data loads in Kudu. Currently, for such a data load,
when loading through Impala, 47 seconds are spent on query processing and
loading overall. We used the default configurations in Kudu and Impala
with 6-node clusters to get these numbers. We haven’t reached the
performance we expected.
I have attached the Impala profile and the DDL used to create the table. We
are using Impala 3.4.0 and Kudu 1.15.0.
*What can we do to reduce the time spent loading 212 million data loads
through Impala from 47 seconds to 10 seconds?*
We would be grateful if you could provide us with some solutions at
your earliest convenience.
Thank you! Regards,
Sandaruwan Kumarasingha.