You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org> on 2021/03/26 05:51:15 UTC

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17234


Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................

IMPALA-10607: Fixed test_ctas_exprs failure for S3 build

New test case TestDecimalOverflowExprs::test_ctas_exprs was added
in the patch for IMPALA-10564. But it failed in S3 build with
Parquet format and complained the Parquet file had an invalid file
length when accessing a table. The table was created by CTAS which
finished with error "decimal expression overflowed". Verified this
issue does not happen if query option s3_skip_insert_staging is set
as false.
When s3_skip_insert_staging is set true by default, INSERT writing
to S3 goes directly to their final location rather than being
copied there by the coordinator. If CTAS finishs with error during
INSERT, the parquet partition file is left in un-finalized without
file footer.  This causes subsequent query failed with error like
"have an invalid file length on S3" when the query attemps to
access the same table.

This patch fixed the issue by deleting the un-finalized file in
its final location when AppendRows() return error and staging has
been skipped.

Testing:
 - Reproduced the test failure in local box with defaultFS as s3.
   Verified the fixing by running test_ctas_exprs with defaultFS
   as s3.
 - Passed core tests.

Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
---
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_decimal_queries.py
2 files changed, 47 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/17234/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8450/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Mar 2021 06:11:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................

IMPALA-10607: Fixed test_ctas_exprs failure for S3 build

New test case TestDecimalOverflowExprs::test_ctas_exprs was added
in the patch for IMPALA-10564. But it failed in S3 build with
Parquet format and complained the Parquet file had an invalid file
length when accessing a table. The table was created by CTAS which
finished with error "decimal expression overflowed". Verified this
issue does not happen if query option s3_skip_insert_staging is set
as false.
When s3_skip_insert_staging is set true by default, INSERT writing
to S3 goes directly to their final location rather than being
copied there by the coordinator. If CTAS finishs with error during
INSERT, the parquet partition file is left in un-finalized without
file footer.  This causes subsequent query failed with error like
"have an invalid file length on S3" when the query attemps to
access the same table.

This patch fixed the issue by deleting the un-finalized file in
its final location when AppendRows() return error and staging has
been skipped.

Testing:
 - Reproduced the test failure in local box with defaultFS as s3.
   Verified the fixing by running test_ctas_exprs with defaultFS
   as s3.
 - Passed core tests.

Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
---
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_decimal_queries.py
2 files changed, 47 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/17234/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 31 Mar 2021 21:18:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8451/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Mar 2021 06:19:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7034/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 31 Mar 2021 21:18:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................

IMPALA-10607: Fixed test_ctas_exprs failure for S3 build

New test case TestDecimalOverflowExprs::test_ctas_exprs was added
in the patch for IMPALA-10564. But it failed in S3 build with
Parquet format and complained the Parquet file had an invalid file
length when accessing a table. The table was created by CTAS which
finished with error "decimal expression overflowed". Verified this
issue does not happen if query option s3_skip_insert_staging is set
as false.
When s3_skip_insert_staging is set true by default, INSERT writing
to S3 goes directly to their final location rather than being
copied there by the coordinator. If CTAS finishs with error during
INSERT, the parquet partition file is left in un-finalized without
file footer.  This causes subsequent query failed with error like
"have an invalid file length on S3" when the query attemps to
access the same table.

This patch fixed the issue by deleting the un-finalized file in
its final location when AppendRows() return error and staging has
been skipped.

Testing:
 - Reproduced the test failure in local box with defaultFS as s3.
   Verified the fixing by running test_ctas_exprs with defaultFS
   as s3.
 - Passed core tests.

Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Reviewed-on: http://gerrit.cloudera.org:8080/17234
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_decimal_queries.py
2 files changed, 47 insertions(+), 14 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 31 Mar 2021 21:18:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17234/2/tests/query_test/test_decimal_queries.py
File tests/query_test/test_decimal_queries.py:

http://gerrit.cloudera.org:8080/#/c/17234/2/tests/query_test/test_decimal_queries.py@184
PS2, Line 184: L
> flake8: E501 line too long (94 > 90 characters)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Mar 2021 05:58:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 01 Apr 2021 03:07:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10607: Fixed test ctas exprs failure for S3 build

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17234 )

Change subject: IMPALA-10607: Fixed test_ctas_exprs failure for S3 build
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17234/2/tests/query_test/test_decimal_queries.py
File tests/query_test/test_decimal_queries.py:

http://gerrit.cloudera.org:8080/#/c/17234/2/tests/query_test/test_decimal_queries.py@184
PS2, Line 184: L
flake8: E501 line too long (94 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/17234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f64ab987aeada2cda41502e8c5dbbc229daefd
Gerrit-Change-Number: 17234
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Mar 2021 05:52:01 +0000
Gerrit-HasComments: Yes