Posted to issues@hive.apache.org by "Hive QA (Jira)" <ji...@apache.org> on 2020/05/01 01:11:00 UTC

[jira] [Commented] (HIVE-23175) Skip serializing hadoop and tez config on HS side

    [ https://issues.apache.org/jira/browse/HIVE-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097085#comment-17097085 ] 

Hive QA commented on HIVE-23175:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/13001688/HIVE-23175.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/22044/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/22044/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-22044/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2020-05-01 01:07:12.660
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-22044/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2020-05-01 01:07:12.663
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4a82ccf HIVE-23319: multi_insert_partitioned is flaky (Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing ${project.basedir}/
Removing itests/${project.basedir}/
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4a82ccf HIVE-23319: multi_insert_partitioned is flaky (Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2020-05-01 01:07:13.822
+ rm -rf ../yetus_PreCommit-HIVE-Build-22044
+ mkdir ../yetus_PreCommit-HIVE-Build-22044
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-22044
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-22044/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Trying to apply the patch with -p0
Going to apply patch with: git apply -p0
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc9104690568870685335.exe, --version]
libprotoc 2.6.1
protoc-jar: executing: [/tmp/protoc9104690568870685335.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
protoc-jar: executing: [/tmp/protoc7520667168321513472.exe, --version]
libprotoc 2.6.1
ANTLR Parser Generator  Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-server/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
ANTLR Parser Generator  Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveLexer.java does not exist: must build /data/hiveptest/working/apache-github-source-source/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveLexer.g
Output file /data/hiveptest/working/apache-github-source-source/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
log4j:WARN No appenders could be found for logger (DataNucleus.Persistence).
log4j:WARN Please initialize the log4j system properly.
DataNucleus Enhancer (version 4.1.17) for API "JDO"
DataNucleus Enhancer completed with success for 43 classes.
Processing annotations
Annotations processed
Processing annotations
No elements to process
Output file /data/hiveptest/working/apache-github-source-source/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HintParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/parser/src/java/org/apache/hadoop/hive/ql/parse/HintParser.g
org/apache/hadoop/hive/ql/parse/HintParser.g
Generating vector expression code
Generating vector expression test code
Processing annotations
Annotations processed
Processing annotations
No elements to process
[ERROR] COMPILATION ERROR : 
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java:[176,34] cannot find symbol
  symbol:   method createConfFromBaseConfAndPayload(org.apache.tez.runtime.api.ProcessorContext)
  location: class org.apache.tez.common.TezUtils
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java:[130,53] cannot find symbol
  symbol:   method getVertexConfiguration()
  location: variable initializerContext of type org.apache.tez.runtime.api.InputInitializerContext
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java:[131,13] cannot find symbol
  symbol:   method addToConfFromByteString(org.apache.hadoop.conf.Configuration,com.google.protobuf.ByteString)
  location: class org.apache.tez.common.TezUtils
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure:
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java:[176,34] cannot find symbol
[ERROR] symbol:   method createConfFromBaseConfAndPayload(org.apache.tez.runtime.api.ProcessorContext)
[ERROR] location: class org.apache.tez.common.TezUtils
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java:[130,53] cannot find symbol
[ERROR] symbol:   method getVertexConfiguration()
[ERROR] location: variable initializerContext of type org.apache.tez.runtime.api.InputInitializerContext
[ERROR] /data/hiveptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java:[131,13] cannot find symbol
[ERROR] symbol:   method addToConfFromByteString(org.apache.hadoop.conf.Configuration,com.google.protobuf.ByteString)
[ERROR] location: class org.apache.tez.common.TezUtils
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hive-exec
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-22044
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 13001688 - PreCommit-HIVE-Build

> Skip serializing hadoop and tez config on HS side
> -------------------------------------------------
>
>                 Key: HIVE-23175
>                 URL: https://issues.apache.org/jira/browse/HIVE-23175
>             Project: Hive
>          Issue Type: Improvement
>          Components: Tez
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Major
>         Attachments: HIVE-23175.1.patch, HIVE-23175.2.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> HiveServer spends a lot of time serializing configuration objects. We can skip putting the Hadoop and Tez config XML files in the payload, assuming that the configs are the same on both the HS and task sides. This depends on Tez loading local XML configs when creating config objects [https://issues.apache.org/jira/browse/TEZ-4137].
> Ideally we should be able to skip hive-site.xml too. However, if we skip hive-site.xml at that stage, we make wrong choices at the Tez DAG build stage due to missing configs.
> In the ideal version of this, we should not be both reading configs from and writing new configs to the same config object during the DAG and vertex build phases. Instead we should be looking up values from HS2's HiveConf object and writing to a brand new JobConf for each vertex. That way we would not have any unnecessary items in the JobConf for any vertex. However, the DAG and vertex build stages (TezTask#build) and a lot of other components called from there treat a single config object as both the source of HS2-side config and the target JobConf into which they put vertex-level options. It is very hard to separate these concerns now.
> With this patch, we are reducing the size of the JobConf (per vertex) by ~65%. This should improve the transmit latency. However, the most significant gains are in CPU time spent compressing job configs, as the config objects are now much smaller.
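
The "ideal version" described in the issue text above (read from one server-side config, write to a fresh config per vertex) can be sketched roughly as follows. This is only an illustration of the separation of concerns, not code from the patch: plain java.util.Map objects stand in for HiveConf and JobConf, and the class name VertexConfSketch, the method buildVertexConf, the key example.vertex.name, and all values are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Maps stand in for HiveConf (source) and JobConf (target).
final class VertexConfSketch {

    // Builds a brand-new per-vertex config. The server-side config is only
    // read, never written, so nothing unrelated leaks into the vertex config.
    static Map<String, String> buildVertexConf(Map<String, String> serverConf,
                                               List<String> neededKeys,
                                               Map<String, String> vertexOptions) {
        Map<String, String> vertexConf = new HashMap<>();
        for (String key : neededKeys) {
            String value = serverConf.get(key);
            if (value != null) {
                // Copy only the keys this vertex actually needs.
                vertexConf.put(key, value);
            }
        }
        // Vertex-level options go into the new object only.
        vertexConf.putAll(vertexOptions);
        return vertexConf;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        Map<String, String> serverConf = new HashMap<>();
        serverConf.put("hive.execution.engine", "tez");
        serverConf.put("fs.defaultFS", "hdfs://example:8020");

        Map<String, String> vertexConf = buildVertexConf(
                serverConf,
                Arrays.asList("hive.execution.engine"),
                Collections.singletonMap("example.vertex.name", "Map 1"));

        System.out.println("server config entries: " + serverConf.size());
        System.out.println("vertex config entries: " + vertexConf.size());
    }
}
```

In this sketch the source map is left untouched and each vertex config carries only what was explicitly requested, which is the property the description says is hard to retrofit onto the existing single shared config object.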


