Posted to user@impala.apache.org by Jim Apple <jb...@cloudera.com> on 2016/03/31 17:55:17 UTC

Re: Impala compute incremental stats and insert speed becomes slowly when the partitions and the amount of data is larger

bcc:impala-user@, to:user@impala.incubator.apache.org

How many columns do you have? How many impalad nodes are there? How much
memory is your catalog configured to run with?

Incremental stats are expensive to store in the catalog, and may be
expensive to distribute to the impalads.
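
As a rough illustration of why the column count matters, here is a
back-of-the-envelope sketch. It assumes the approximate figure from the
Impala documentation of about 400 bytes of incremental stats metadata per
column per partition. The column counts are hypothetical, since the
original message does not give one. Only the 50,000 partitions come from
the question.

```python
# Rough estimate of catalog memory used by incremental stats.
# ASSUMPTION: about 400 bytes per column per partition, the approximate
# figure given in the Impala documentation; real usage will vary.
BYTES_PER_COLUMN_PER_PARTITION = 400


def incremental_stats_bytes(num_partitions, num_columns,
                            bytes_per_entry=BYTES_PER_COLUMN_PER_PARTITION):
    """Return the estimated size in bytes of incremental stats metadata."""
    return num_partitions * num_columns * bytes_per_entry


if __name__ == "__main__":
    partitions = 50_000  # from the original question
    # Hypothetical column counts; the real table's width is unknown.
    for columns in (20, 100, 500):
        size = incremental_stats_bytes(partitions, columns)
        print(f"{columns:>4} columns: {size / 1024**2:,.0f} MiB")
```

Whatever the exact per-entry size is, the estimate grows with partitions
times columns, and that metadata also has to be sent to every impalad.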

http://www.cloudera.com/documentation/enterprise/latest/topics/impala_perf_resources.html
recommends fewer than 30,000 partitions. Also, "compute stats" without
"incremental" may be worth trying.


On Thu, Mar 31, 2016 at 5:41 AM, Qinggang Wang <qi...@gmail.com>
wrote:

> Hi All,
>         We have a table in Impala with about a hundred billion records and
> fifty thousand partitions. When we insert new partitions and run compute
> incremental stats, both the insert and the compute stats are much slower
> than they were when the table had few partitions and little data. Each
> insert and each compute stats run now takes more than 80 seconds, whereas
> neither took more than 2 seconds when the table was small. Since we add 68
> partitions per day, the inserts and the stats computation cost a lot of
> time. Is there any way to solve this?
>
>
> Thanks