Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/02 16:08:00 UTC

[jira] [Assigned] (KUDU-1802) Deserializing scan tokens should avoid round-trip to master

     [ https://issues.apache.org/jira/browse/KUDU-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-1802:
---------------------------------

    Assignee: Grant Henke

> Deserializing scan tokens should avoid round-trip to master
> -----------------------------------------------------------
>
>                 Key: KUDU-1802
>                 URL: https://issues.apache.org/jira/browse/KUDU-1802
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client, perf
>    Affects Versions: 1.2.0
>            Reporter: Todd Lipcon
>            Assignee: Grant Henke
>            Priority: Major
>              Labels: ramp-up
>
> Currently, KuduScanToken::DeserializeIntoScanner calls KuduClient::OpenTable(), which makes a GetTableSchema call to the master. This round trip is expensive because it always arrives as a "thundering herd" for an Impala query or Spark job: every host deserializes a batch of scan tokens at the same time, and the hosts end up having to back off.
> We should consider some ways to avoid this.
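
One possible direction, sketched here only as an illustration, is to memoize table metadata per process so that deserializing many tokens for the same table triggers at most one master lookup. Everything in this sketch is hypothetical: TableMetadata, TableMetadataCache, and the fetcher callback are stand-ins and not part of the Kudu client API, and the fetcher represents whatever call would otherwise go to the master.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the metadata a master lookup would return.
struct TableMetadata {
  std::string table_name;
  int schema_version;
};

// Hypothetical per-process cache; not part of the Kudu client API.
class TableMetadataCache {
 public:
  // The fetcher models the expensive round trip to the master.
  using Fetcher = std::function<TableMetadata(const std::string&)>;

  explicit TableMetadataCache(Fetcher fetcher) : fetcher_(std::move(fetcher)) {}

  // Returns the cached entry for table_name, invoking the fetcher only on
  // the first request for that table. The lock is held across the fetch,
  // so concurrent callers wait instead of issuing duplicate lookups.
  std::shared_ptr<const TableMetadata> Get(const std::string& table_name) {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = cache_.find(table_name);
    if (it != cache_.end()) {
      return it->second;
    }
    auto metadata =
        std::make_shared<const TableMetadata>(fetcher_(table_name));
    cache_.emplace(table_name, metadata);
    return metadata;
  }

 private:
  Fetcher fetcher_;
  std::mutex mu_;
  std::map<std::string, std::shared_ptr<const TableMetadata>> cache_;
};
```

A cache like this would only collapse repeated lookups within one process; each host would still contact the master once per table, and the sketch ignores invalidation when a schema changes. Avoiding the round trip entirely would presumably require the token itself to carry the information the scanner needs.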



--
This message was sent by Atlassian Jira
(v8.3.4#803005)