Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2018/03/30 04:12:00 UTC

[jira] [Commented] (HIVE-18814) Support Add Partition For Acid tables

    [ https://issues.apache.org/jira/browse/HIVE-18814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420130#comment-16420130 ] 

Hive QA commented on HIVE-18814:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12916921/HIVE-18814.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9915/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9915/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9915/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-03-30 04:09:36.760
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9915/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-03-30 04:09:36.776
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3e3292b HIVE-19029 - Load Data should prevent loading acid files (Eugene Koifman, reviewed by Jason Dere)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 3e3292b HIVE-19029 - Load Data should prevent loading acid files (Eugene Koifman, reviewed by Jason Dere)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-03-30 04:09:41.286
+ rm -rf ../yetus_PreCommit-HIVE-Build-9915
+ mkdir ../yetus_PreCommit-HIVE-Build-9915
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9915
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9915/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: patch failed: ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java:447
Falling back to three-way merge...
Applied patch to 'ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java' with conflicts.
Going to apply patch with: git apply -p0
error: patch failed: ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java:447
Falling back to three-way merge...
Applied patch to 'ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java' with conflicts.
U ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12916921 - PreCommit-HIVE-Build

> Support Add Partition For Acid tables
> -------------------------------------
>
>                 Key: HIVE-18814
>                 URL: https://issues.apache.org/jira/browse/HIVE-18814
>             Project: Hive
>          Issue Type: New Feature
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-18814.01.patch, HIVE-18814.02.patch
>
>
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual%2BDDL#LanguageManualDDL-AddPartitions]
> Add Partition command creates a {{Partition}} metadata object and sets the location to the directory containing data files.
> In current master (Hive 3.0), Add Partition on an acid table doesn't fail, and at read time the data is decorated with row__id, but the original transaction is 0.  I suspect that in earlier Hive versions this will throw or return no data.
> Since this new partition didn't have data before, assigning txnid:0 isn't going to generate duplicate IDs but it could violate Snapshot Isolation in multi stmt txns.  Suppose txnid:7 runs {{select * from T}}.  Then txnid:8 adds a partition to T.  Now if txnid:7 runs the same query again, it will see the data in the new partition.
> This can't be released like this, since a delete on this data (added via Add Partition) will use row_ids with txnid:0, so a later upgrade that sees un-compacted data may generate row_ids with a different txnid (assuming this is fixed by then).
>  
> One option is to follow the Load Data approach: create a new delta_x_x/ and move/copy the data there.
>  
> Another is to allocate a new writeid and save it in Partition metadata.  This could then be used to decorate data with ROW__IDs.  This avoids move/copy but retains data "outside" of the table tree, which makes it more likely that this data will be modified in some way, which can really break things if done after any SQL update/delete on this data has happened.
>  
> Add Partition performs no validation on add (except for the partition spec), so any file with any format can be added.  It allows adding to bucketed tables as well.
> Seems like a very dangerous command.  Maybe a better option is to block it and advise using Load Data.  Alternatively, make this do Add partition metadata op followed by Load Data. 
>  
>  
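The Snapshot Isolation concern in the description above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a reader's snapshot is reduced to a single high-watermark txnid (Hive's real visibility check is more involved), and the names Row and visible are hypothetical, not Hive APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    partition: str
    txnid: int  # transaction id recorded in the row's ROW__ID
    value: str


def visible(rows, snapshot_high_watermark):
    """Return the rows a reader sees: those stamped at or below its snapshot.

    Simplified model: assumes every txnid <= the watermark is committed.
    """
    return [r for r in rows if r.txnid <= snapshot_high_watermark]


# Table T has one existing partition written by txnid 5.
table = [Row("p=1", 5, "a")]

# txnid:7 runs "select * from T" against its snapshot.
first_read = visible(table, 7)

# txnid:8 adds a partition; its rows are decorated with txnid 0.
table_after_add = table + [Row("p=2", 0, "b")]

# txnid:7 repeats the query and now also sees the new partition.
second_read = visible(table_after_add, 7)

# If the added rows carried the adding transaction's id instead,
# the repeated read would be unchanged.
table_stamped = table + [Row("p=2", 8, "b")]
stamped_read = visible(table_stamped, 7)
```

In this model, second_read contains the row from the new partition while stamped_read does not, which is the difference between the current behaviour and the two fixes proposed in the description.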

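The Load Data style option from the description above (create a new delta_x_x/ and move/copy the data there) could be planned roughly as follows. This is a hedged sketch, not Hive's implementation: delta_dir_name and plan_moves are hypothetical helpers, the 7-digit zero padding of the write id is an assumption about the directory naming, and the paths are made up for illustration. The function only computes source/target pairs and does not touch any filesystem.

```python
from pathlib import PurePosixPath


def delta_dir_name(write_id: int) -> str:
    """Name of a delta directory covering a single write id (assumed padding)."""
    return f"delta_{write_id:07d}_{write_id:07d}"


def plan_moves(data_files, partition_dir, write_id):
    """Map each externally located data file to a path inside the new delta dir.

    Returns a list of (source, target) path pairs; nothing is moved here.
    """
    target_dir = PurePosixPath(partition_dir) / delta_dir_name(write_id)
    return [
        (PurePosixPath(f), target_dir / PurePosixPath(f).name)
        for f in data_files
    ]


# Hypothetical example: one file outside the table tree, write id 12.
moves = plan_moves(["/ext/data/000000_0"], "/warehouse/t/p=1", 12)
```

Keeping the plan separate from the actual move/copy makes it easy to validate the files (format, bucketing) before anything is relocated, which is the validation gap the description points out.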


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)