Posted to issues@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2017/11/28 09:38:00 UTC

[jira] [Updated] (HIVE-18160) Jar localization during session initialization is slow

     [ https://issues.apache.org/jira/browse/HIVE-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-18160:
-----------------------------------------
    Description: 
The same jar is localized multiple times during session initialization, so its SHA-256 is recomputed each time, which slows down session startup.
{code}
2017-11-28T00:40:55,795 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 241 ms
2017-11-28T00:40:56,105 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: e20986f3a422f8fa5eb61c5a2756cd6f7d2b779dbcab49eae6f2c8dfff7ad2a2 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-llap-tez-3.0.0-SNAPSHOT.jar of length: 109.53KB in 1 ms
2017-11-28T00:40:56,353 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 231 ms
2017-11-28T00:40:56,602 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 241 ms
2017-11-28T00:40:56,612 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 686d66b825fdc4fc241e0591e7646a1bbca1c7114a7224c41da7f4795cf9477a for file: file:/work/hadoop/hadoop/hadoop-dist/target/hadoop-2.9.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-registry-2.9.0-SNAPSHOT.jar of length: 122.72KB in 2 ms
{code} 

In the logs above, SHA-256 is computed 3 times for the hive-exec jar, and each computation takes around 240 ms.
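
One way to avoid the repeated hashing is to remember the digest per jar path for the lifetime of the session. The following is only a minimal sketch of that idea, not the actual fix: the class JarShaCache, its methods and the 64KB read buffer are hypothetical names and choices made for illustration, and it assumes a jar does not change on disk while the cache is alive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Hive): caches SHA-256 digests per jar path. */
final class JarShaCache {
  // Read buffer size; an arbitrary choice for this sketch.
  private static final int BUFFER_SIZE = 64 * 1024;

  private final Map<Path, String> shaByPath = new ConcurrentHashMap<>();
  // Counts real digest computations, only to make the caching effect visible.
  private final AtomicInteger computations = new AtomicInteger();

  /** Returns the hex SHA-256 of the file, hashing it only on the first request. */
  String sha256(Path jar) throws IOException {
    // Normalize so that different spellings of the same path share one entry.
    Path key = jar.toAbsolutePath().normalize();
    String cached = shaByPath.get(key);
    if (cached != null) {
      return cached;
    }
    // Two threads may race here and both hash the file; the result is the same.
    String sha = computeSha256(key);
    shaByPath.put(key, sha);
    return sha;
  }

  int computations() {
    return computations.get();
  }

  private String computeSha256(Path file) throws IOException {
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is not available", e);
    }
    computations.incrementAndGet();
    byte[] buffer = new byte[BUFFER_SIZE];
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      while ((n = in.read(buffer)) != -1) {
        digest.update(buffer, 0, n);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return hex.toString();
  }
}
```

With a cache like this, the three requests for the hive-exec jar seen in the logs would hash the file once and answer the other two from memory. Keying by path alone would return a stale digest if the jar were replaced mid-session, so a real implementation might also want to include file length or modification time in the key.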

  was:
The same jar is localized multiple times during session initialization, so its SHA-256 is recomputed each time, which slows down session startup. Also, the default SHA-256 implementation from commons-codec reads the jar file through a 1KB buffer, which is slow (the buffer size is not configurable).
{code}
2017-11-28T00:40:55,795 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 241 ms
2017-11-28T00:40:56,105 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: e20986f3a422f8fa5eb61c5a2756cd6f7d2b779dbcab49eae6f2c8dfff7ad2a2 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-llap-tez-3.0.0-SNAPSHOT.jar of length: 109.53KB in 1 ms
2017-11-28T00:40:56,353 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 231 ms
2017-11-28T00:40:56,602 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 55aa783d2dda0599fb89a37daae2a2efebf0eed0d4f6e99e3ce140d2fa2f0c30 for file: file:/work/hive/hive-git/packaging/target/apache-hive-3.0.0-SNAPSHOT-bin/apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-exec-3.0.0-SNAPSHOT.jar of length: 35.68MB in 241 ms
2017-11-28T00:40:56,612 INFO  [main]: tez.TezSessionState (TezSessionState.java:createJarLocalResource(716)) - Computed sha: 686d66b825fdc4fc241e0591e7646a1bbca1c7114a7224c41da7f4795cf9477a for file: file:/work/hadoop/hadoop/hadoop-dist/target/hadoop-2.9.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-registry-2.9.0-SNAPSHOT.jar of length: 122.72KB in 2 ms
{code} 

In the logs above, SHA-256 is computed 3 times for the hive-exec jar, and each computation takes around 240 ms.


> Jar localization during session initialization is slow
> ------------------------------------------------------
>
>                 Key: HIVE-18160
>                 URL: https://issues.apache.org/jira/browse/HIVE-18160
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
