Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/11/09 07:22:00 UTC

[jira] [Commented] (IMPALA-11659) Server network util of Statestored is much higher than impala-3.4

    [ https://issues.apache.org/jira/browse/IMPALA-11659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630801#comment-17630801 ] 

Quanlong Huang commented on IMPALA-11659:
-----------------------------------------

Maybe we can simply fix this by not sending updates to executors. Executors would still send their stats to the statestore, but the statestore would send updates only to the coordinators.

This is similar to the pattern of the catalog topic, i.e. catalogd -> statestore -> coordinators. The statestore just propagates the catalog updates sent from catalogd to all coordinators; it doesn't send them back to catalogd. This is achieved by passing a special filter_prefix when catalogd registers the topic with the statestore:
{code:cpp}
  // The catalogd never needs to read any entries from the topic. It only publishes
  // entries. So, we set a prefix to some random character that we know won't be a
  // prefix of any key. This saves a bit of network communication from the statestore
  // back to the catalog.
  string filter_prefix = "!";
  Status status = statestore_subscriber_->AddTopic(IMPALA_CATALOG_TOPIC,
      /* is_transient=*/ false, /* populate_min_subscriber_topic_version=*/ false,
      filter_prefix, cb);
{code}
[https://github.com/apache/impala/blob/4bc86ac638fb85deda317b9b1dfd1f574fa0553b/be/src/catalog/catalog-server.cc#L320-L327]
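
To illustrate how the same trick could apply to executors, here is a rough, self-contained sketch. It is a toy model, not Impala's actual statestore code: the Topic alias, the Subscriber struct, HasPrefix(), BytesToSend() and the example keys and payload sizes are all hypothetical. It only assumes that the statestore sends a subscriber the entries whose key starts with that subscriber's filter prefix, so a prefix like "!" that no key starts with results in nothing being sent.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified model of one statestore topic: entry key -> payload.
using Topic = std::map<std::string, std::string>;

// Hypothetical subscriber record. An empty filter_prefix matches every key.
struct Subscriber {
  std::string id;
  std::string filter_prefix;
};

// Returns true if 'key' starts with 'prefix'.
bool HasPrefix(const std::string& key, const std::string& prefix) {
  return key.compare(0, prefix.size(), prefix) == 0;
}

// Payload bytes this toy statestore would send to 'sub' for a full topic update:
// only entries whose key starts with the subscriber's filter prefix are counted.
std::size_t BytesToSend(const Topic& topic, const Subscriber& sub) {
  std::size_t bytes = 0;
  for (const auto& entry : topic) {
    if (HasPrefix(entry.first, sub.filter_prefix)) bytes += entry.second.size();
  }
  return bytes;
}

// Example with made-up keys and sizes: the coordinator receives everything,
// while an executor registered with "!" receives nothing.
void Example() {
  Topic topic = {{"POOL:default", std::string(100, 'p')},
                 {"STAT:host-1", std::string(4000, 's')}};
  Subscriber coordinator{"coord-1", ""};
  Subscriber executor{"exec-1", "!"};
  assert(BytesToSend(topic, coordinator) == 4100);
  assert(BytesToSend(topic, executor) == 0);
}
```

Calling Example() from a main() runs the two checks. In this model the executors can still publish their own entries, since the filter only affects what is sent back to them.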

> Server network util of Statestored is much higher than impala-3.4
> -----------------------------------------------------------------
>
>                 Key: IMPALA-11659
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11659
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Perf Investigation
>    Affects Versions: Impala 4.1.0
>            Reporter: Yuchen Fan
>            Priority: Major
>         Attachments: image-2022-10-13-19-49-28-456.png, image-2022-10-13-19-57-31-248.png, image-2022-10-13-20-08-24-931.png
>
>
> We found that the server network traffic of Statestored rose immediately after upgrading to Impala-4.1. 'iftop' shows that Statestored has about 6MB/s of outbound network communication with every Impalad. With more than 250 Impalad nodes, the Statestored server has 10~15Gb/s (1~2GB/s) outbound (60x higher than before the upgrade) and about 90Mb/s (>10MB/s) inbound (10x higher than before the upgrade) network communication, which puts server network utilization at about 75%. A TCP packet capture shows that the packets carry 'Pool' and 'Stat' information in the impala-request-queue topic. We found that the value of key 'STAT:' contains the per-host stats of all Impalads. The related functionality comes from https://issues.apache.org/jira/browse/IMPALA-8762. So the initial update size of impala-request-queue is more than 4MB when there are more than 250 Impalads. If the cluster has more Impalads, the Statestored server may hang because of network blocking.
> I think it is fine for coordinators to update the topic with per-host statistics. However, because all Impalads (both coordinators and executors) subscribe to the impala-request-queue topic, Statestored has to broadcast all per-node statistics to every Impalad.


