Posted to issues@impala.apache.org by "Peikai Zheng (JIRA)" <ji...@apache.org> on 2018/09/26 02:54:00 UTC

[jira] [Created] (IMPALA-7627) Parallel the fetching permission process

Peikai Zheng created IMPALA-7627:
------------------------------------

             Summary: Parallel the fetching permission process
                 Key: IMPALA-7627
                 URL: https://issues.apache.org/jira/browse/IMPALA-7627
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Peikai Zheng


Catalogd loads the metadata of a table in three phases.
First, Catalogd fetches the table metadata from the Hive metastore;
then, Catalogd fetches the permissions of each partition from the HDFS NameNode;
finally, Catalogd loads the file descriptors from the HDFS NameNode.

According to my test results:

||Table (average time, GetFileInfoThread=10)||Phase 1||Phase 2||Phase 3||
|idm.sauron_message|9.9917115|459.2106944|95.0179163|
|default.revenue_enriched|12.3377474|111.2969046|40.827472|
|default.upp_raw_prod|1.5143162|50.0251426|12.6805323|
|default.hit_to_beacon_playback_prod|1.4294509|49.7670539|18.3557858|
|default.sitetracking_enriched|13.0003804|112.8746656|42.1824032|
|default.player_custom_event|9.2618705|493.4865302|116.4986184|
|default.revenue_day_est|57.9116561|106.5028664|24.005822|

The second phase takes the majority of the time for every table.
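
This can be checked directly against the table. The short Python sketch below computes the fraction of the total load time spent in phase 2 for each table. The numbers are copied from the table above; the variable and function names are illustrative, and no time unit is assumed because the report does not state one.

```python
# Average time per phase (phase 1, phase 2, phase 3), copied from the report.
# The report does not state the time unit, so none is assumed here.
timings = {
    "idm.sauron_message": (9.9917115, 459.2106944, 95.0179163),
    "default.revenue_enriched": (12.3377474, 111.2969046, 40.827472),
    "default.upp_raw_prod": (1.5143162, 50.0251426, 12.6805323),
    "default.hit_to_beacon_playback_prod": (1.4294509, 49.7670539, 18.3557858),
    "default.sitetracking_enriched": (13.0003804, 112.8746656, 42.1824032),
    "default.player_custom_event": (9.2618705, 493.4865302, 116.4986184),
    "default.revenue_day_est": (57.9116561, 106.5028664, 24.005822),
}


def phase_shares(phases):
    """Return each phase's fraction of the total load time."""
    total = sum(phases)
    return tuple(t / total for t in phases)


# Fraction of the total spent in phase 2 (index 1), per table.
phase2_share = {table: phase_shares(p)[1] for table, p in timings.items()}

# Print tables from the largest phase 2 share to the smallest.
for table, share in sorted(phase2_share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{table:40s} phase 2 share: {share:.1%}")
```

For every table listed, phase 2 accounts for more than half of the total of the three phases.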

So I suggest parallelizing the second phase.
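
To illustrate the idea only, here is a minimal Python sketch of fetching per-partition permissions with a thread pool instead of one at a time. It is not Catalogd code: `fetch_permission` is a hypothetical stand-in that simulates a NameNode round trip with a short sleep, and the function names, the placeholder permission string, the partition paths, and the default of 10 threads are all assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_permission(partition_path):
    """Hypothetical stand-in for one permission lookup against the NameNode."""
    time.sleep(0.01)  # simulated round-trip latency, not a measured value
    return "rwxr-xr-x"  # placeholder result, not real HDFS output


def fetch_permissions_serial(partition_paths):
    """Look up each partition one after another."""
    return {path: fetch_permission(path) for path in partition_paths}


def fetch_permissions_parallel(partition_paths, num_threads=10):
    """Look up partitions concurrently with a fixed-size thread pool."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in input order, so they line up with the paths.
        results = pool.map(fetch_permission, partition_paths)
        return dict(zip(partition_paths, results))


if __name__ == "__main__":
    # Made-up partition paths, for demonstration only.
    paths = [f"/warehouse/example_table/day={d:02d}" for d in range(1, 31)]

    start = time.perf_counter()
    serial = fetch_permissions_serial(paths)
    serial_elapsed = time.perf_counter() - start

    start = time.perf_counter()
    parallel = fetch_permissions_parallel(paths)
    parallel_elapsed = time.perf_counter() - start

    assert serial == parallel
    print(f"serial:   {serial_elapsed:.3f}s")
    print(f"parallel: {parallel_elapsed:.3f}s")
```

Because each lookup spends most of its time waiting on the network, the waits can overlap across threads. How much this helps in practice would depend on how many concurrent requests the NameNode can absorb.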



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)