Posted to reviews@impala.apache.org by "Fucun Chu (Code Review)" <ge...@cloudera.org> on 2021/11/21 11:43:02 UTC

[Impala-ASF-CR] WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Fucun Chu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18028 )


Change subject: WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................

WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

This patch fixes the data loading problems seen when integrating Apache
Hive 3 and switches to the Tez engine.

Add the HIVE-21569 and HIVE-20038 patches and recompile the hive-exec module.

Todos:
- The number of tpch_nested_parquet.customer files differs from the
 number generated by CDP
- More testing is needed

Testing:
- Manually perform data loading steps.

Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
---
M buildall.sh
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/generate-schema-statements.py
M testdata/bin/load_nested.py
M testdata/cluster/hive/README
A testdata/cluster/hive/patch1-HIVE-21569.diff
A testdata/cluster/hive/patch2-HIVE-20038.diff
M tests/util/test_file_parser.py
8 files changed, 346 insertions(+), 2 deletions(-)
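
The numeric prefixes in the two added patch file names suggest an application order. The following is a minimal sketch, assuming that prefix is the intended order (the thread does not confirm this). The helper name and the regular expression are hypothetical and are not taken from the change itself; only the two file names come from the file list above.

```python
import re

# Hypothetical helper: only the patch file names come from the change's
# file list; the naming pattern below is an assumption for illustration.
PATCH_NAME = re.compile(r"^patch(\d+)-(HIVE-\d+)\.diff$")


def order_patches(file_names):
    """Return (sequence, jira_id, file_name) tuples sorted by sequence number.

    Names that do not match the assumed pattern (for example README) are
    ignored.
    """
    parsed = []
    for name in file_names:
        match = PATCH_NAME.match(name)
        if match:
            parsed.append((int(match.group(1)), match.group(2), name))
    return sorted(parsed)


if __name__ == "__main__":
    listing = ["patch2-HIVE-20038.diff", "README", "patch1-HIVE-21569.diff"]
    for seq, jira_id, name in order_patches(listing):
        print(seq, jira_id, name)
```

Sorting on the parsed integer rather than on the raw file name keeps the order stable if a tenth patch is ever added.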



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/18028/5
-- 
To view, visit http://gerrit.cloudera.org:8080/18028
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 8: Code-Review+2

(1 comment)

Thanks for updating the patch! Carrying Csaba's +1.

http://gerrit.cloudera.org:8080/#/c/18028/7/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/18028/7/testdata/bin/generate-schema-statements.py@479
PS7, Line 479:   else:
             :     statement += SET_HIVE_
> In CDP Hive:
I see. Maybe that config never existed. I read the doc again and realized it's only an ideal.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 8
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 15 Jul 2022 12:46:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................

IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

This patch fixes the data loading problems seen when integrating Apache
Hive 3 and switches to the Tez engine.

Add the HIVE-21569 and HIVE-20038 patches and recompile the hive-exec module.

Testing:
- Manually perform data loading steps.

Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/generate-schema-statements.py
M testdata/bin/load_nested.py
M testdata/bin/patch_hive.sh
M testdata/cluster/hive/README
A testdata/cluster/hive/patch1-HIVE-21569.diff
A testdata/cluster/hive/patch2-HIVE-20038.diff
M tests/util/test_file_parser.py
11 files changed, 381 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/18028/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 8
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8339/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 10
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Jul 2022 05:23:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 9: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 9
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 15 Jul 2022 12:46:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 7: Code-Review+1

lgtm, thanks for working on this!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 03 May 2022 11:43:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 7:

FWIW, Hive 3.1.3 was just released: https://www.mail-archive.com/dev@hive.apache.org/msg142674.html

Maybe we can bump our Apache Hive version to it.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 May 2022 11:21:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 10: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 10
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Jul 2022 10:12:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10954/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 8
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Jul 2022 16:28:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8318/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 9
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 15 Jul 2022 12:46:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 7:

(2 comments)

Thanks a lot for working on this!

http://gerrit.cloudera.org:8080/#/c/18028/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18028/7//COMMIT_MSG@15
PS7, Line 15: - Manually perform data loading steps.
Are you able to run a subset of the tests? I'm curious whether this is mature enough to add to our precommit job.


http://gerrit.cloudera.org:8080/#/c/18028/7/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/18028/7/testdata/bin/generate-schema-statements.py@479
PS7, Line 479:   # For Apache Hive, "hive.hbase.bulk does not exist" exception will be thrown and there
             :   # is only warning in cdp
I'm curious about the exception. This feature seems to have been in Hive for some years: https://cwiki.apache.org/confluence/display/hive/hbasebulkload

Is it because we recompile Hive?
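
For readers following the thread, the behaviour under discussion can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the function name, the `use_apache_hive` flag and the exact `SET` statement text are assumptions made for the example. Only the property name `hive.hbase.bulk` and the exception-versus-warning difference come from the quoted comment.

```python
# Hypothetical sketch: the names below are illustrative and are not taken
# from testdata/bin/generate-schema-statements.py.
SET_HBASE_BULK = "SET hive.hbase.bulk=true;\n"  # assumed statement text


def build_hbase_load_prefix(use_apache_hive):
    """Return the SET statements to prepend to an HBase load statement.

    Per the quoted comment, Apache Hive throws a
    "hive.hbase.bulk does not exist" exception while CDP Hive only warns,
    so this sketch emits the property only when Apache Hive is not in use.
    """
    statement = ""
    if not use_apache_hive:
        statement += SET_HBASE_BULK
    return statement


if __name__ == "__main__":
    print(repr(build_hbase_load_prefix(use_apache_hive=True)))
    print(repr(build_hbase_load_prefix(use_apache_hive=False)))
```

Passing the Hive flavour in as an argument keeps the sketch independent of however the real script detects it.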




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 03 May 2022 12:34:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18028/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18028/5//COMMIT_MSG@15
PS5, Line 15: - Manually perform data loading steps.
> We can skip tests that depend on this.
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 02 May 2022 12:49:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10520/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 02 May 2022 13:08:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................

IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

This patch fixes the data loading problems seen when integrating Apache
Hive 3 and switches to the Tez engine.

Add the HIVE-21569 and HIVE-20038 patches and recompile the hive-exec module.

Testing:
- Manually perform data loading steps.

Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Reviewed-on: http://gerrit.cloudera.org:8080/18028
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/generate-schema-statements.py
M testdata/bin/load_nested.py
M testdata/bin/patch_hive.sh
M testdata/cluster/hive/README
A testdata/cluster/hive/patch1-HIVE-21569.diff
A testdata/cluster/hive/patch2-HIVE-20038.diff
M tests/util/test_file_parser.py
11 files changed, 381 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 11
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................

IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

This patch fixes the data loading problems seen when integrating Apache
Hive 3 and switches to the Tez engine.

Add the HIVE-21569 and HIVE-20038 patches and recompile the hive-exec module.

Testing:
- Manually perform data loading steps.

Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/generate-schema-statements.py
M testdata/bin/load_nested.py
M testdata/bin/patch_hive.sh
M testdata/cluster/hive/README
A testdata/cluster/hive/patch1-HIVE-21569.diff
A testdata/cluster/hive/patch2-HIVE-20038.diff
M tests/util/test_file_parser.py
11 files changed, 381 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/18028/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 10: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 10
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Jul 2022 05:23:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18028/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18028/5//COMMIT_MSG@15
PS5, Line 15: - The number of tpch_nested_parquet.customer files is inconsistent with
We can skip tests that depend on this.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Fri, 11 Mar 2022 10:38:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18028 )

Change subject: WIP IMPALA-10871 (part 2): Apache Hive 3: fixes for dataset loading
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9822/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86a1fdffc70b8d9a3bc97a72b5b939021dc496f1
Gerrit-Change-Number: 18028
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 21 Nov 2021 12:06:49 +0000
Gerrit-HasComments: No