You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Lars Volker (Code Review)" <ge...@cloudera.org> on 2017/03/08 18:17:28 UTC

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6317

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................

IMPALA-4615: Fix create_table.sql command order

INSERT OVERWRITE commands in Hive will only affect partitions that Hive
knows about. If an external table gets dropped and recreated, then
'MSCK REPAIR TABLE' needs to be executed to recover any preexisting
partitions. Otherwise, an INSERT OVERWRITE will not remove the data
files in those partitions and will fail to move the new data in place.

More information can be found here:
http://www.ericlin.me/hive-insert-overwrite-does-not-remove-existing-data

Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
---
M testdata/avro_schema_resolution/create_table.sql
1 file changed, 11 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/6317/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................

IMPALA-4615: Fix create_table.sql command order

INSERT OVERWRITE commands in Hive will only affect partitions that Hive
knows about. If an external table gets dropped and recreated, then
'MSCK REPAIR TABLE' needs to be executed to recover any preexisting
partitions. Otherwise, an INSERT OVERWRITE will not remove the data
files in those partitions and will fail to move the new data in place.

More information can be found here:
http://www.ericlin.me/hive-insert-overwrite-does-not-remove-existing-data

I tested the fix by running the following commands, making sure that the
second run of the .sql script completed without errors.

export JDBC_URL="jdbc:hive2://${HS2_HOST_PORT}/default;"
export HS2_HOST_PORT=localhost:11050

beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql
beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql

Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
---
M testdata/avro_schema_resolution/create_table.sql
1 file changed, 11 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/6317/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/365/

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 1:

(1 comment)

Thanks for the review. I updated the commit message.

http://gerrit.cloudera.org:8080/#/c/6317/1//COMMIT_MSG
Commit Message:

Line 13: files in those partitions and will fail to move the new data in place.
> How did you test this change?
Added an explanation to the commit message. I'm not sure how to test this on every run of the test suite, but since we've been hitting it from time to time I think we should be ok.


-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 2: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/361/

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 4: Code-Review+2

I figured out that the submit tests failed since I was lacking a final MSCK command. I fixed that, did more manual testing, updated the commit message with my findings, and rebase.

Carrying Alex's +2.

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6317

to look at the new patch set (#3).

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................

IMPALA-4615: Fix create_table.sql command order

INSERT OVERWRITE commands in Hive will only affect partitions that Hive
knows about. If an external table gets dropped and recreated, then
'MSCK REPAIR TABLE' needs to be executed to recover any preexisting
partitions. Otherwise, an INSERT OVERWRITE will not remove the data
files in those partitions and will fail to move the new data in place.

More information can be found here:
http://www.ericlin.me/hive-insert-overwrite-does-not-remove-existing-data

I tested the fix by running the following commands, making sure that the
second run of the .sql script completed without errors.

export JDBC_URL="jdbc:hive2://${HS2_HOST_PORT}/default;"
export HS2_HOST_PORT=localhost:11050

beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql
beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql

Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
---
M testdata/avro_schema_resolution/create_table.sql
1 file changed, 13 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/6317/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6317/1//COMMIT_MSG
Commit Message:

Line 13: files in those partitions and will fail to move the new data in place.
How did you test this change?


http://gerrit.cloudera.org:8080/#/c/6317/1/testdata/avro_schema_resolution/create_table.sql
File testdata/avro_schema_resolution/create_table.sql:

Line 166: MSCK REPAIR TABLE avro_coldef;
Wow, nice catch!


-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


IMPALA-4615: Fix create_table.sql command order

INSERT OVERWRITE commands in Hive will only affect partitions that Hive
knows about. If an external table gets dropped and recreated, then
'MSCK REPAIR TABLE' needs to be executed to recover any preexisting
partitions. Otherwise, an INSERT OVERWRITE will not remove the data
files in those partitions and will fail to move the new data in place.

More information can be found here:
http://www.ericlin.me/hive-insert-overwrite-does-not-remove-existing-data

I tested the fix by running the following commands, making sure that the
second run of the .sql script completed without errors and validating
the number of lines was correct (10) after both runs.

export JDBC_URL="jdbc:hive2://${HS2_HOST_PORT}/default;"
export HS2_HOST_PORT=localhost:11050

beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql
beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql

Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Reviewed-on: http://gerrit.cloudera.org:8080/6317
Reviewed-by: Lars Volker <lv...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M testdata/avro_schema_resolution/create_table.sql
1 file changed, 13 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Lars Volker: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6317

to look at the new patch set (#4).

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................

IMPALA-4615: Fix create_table.sql command order

INSERT OVERWRITE commands in Hive will only affect partitions that Hive
knows about. If an external table gets dropped and recreated, then
'MSCK REPAIR TABLE' needs to be executed to recover any preexisting
partitions. Otherwise, an INSERT OVERWRITE will not remove the data
files in those partitions and will fail to move the new data in place.

More information can be found here:
http://www.ericlin.me/hive-insert-overwrite-does-not-remove-existing-data

I tested the fix by running the following commands, making sure that the
second run of the .sql script completed without errors and validating
the number of lines was correct (10) after both runs.

export JDBC_URL="jdbc:hive2://${HS2_HOST_PORT}/default;"
export HS2_HOST_PORT=localhost:11050

beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql
beeline -n $USER -u "${JDBC_URL}" -f ${IMPALA_HOME}/testdata/avro_schema_resolution/create_table.sql

Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
---
M testdata/avro_schema_resolution/create_table.sql
1 file changed, 13 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/6317/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/361/

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4615: Fix create table.sql command order

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4615: Fix create_table.sql command order
......................................................................


Patch Set 2: Code-Review+2

Thanks for fixing long-standing and annoying issue, Lars!

-- 
To view, visit http://gerrit.cloudera.org:8080/6317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0f68eeb75ba2f43b96b8f3d82f902e291d3bd396
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-HasComments: No