Posted to issues@flink.apache.org by "Jingsong Lee (Jira)" <ji...@apache.org> on 2020/01/07 08:16:00 UTC

[jira] [Created] (FLINK-15498) Using HiveCatalog in TPC-DS e2e

Jingsong Lee created FLINK-15498:
------------------------------------

             Summary: Using HiveCatalog in TPC-DS e2e
                 Key: FLINK-15498
                 URL: https://issues.apache.org/jira/browse/FLINK-15498
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / Planner, Tests
            Reporter: Jingsong Lee
             Fix For: 1.11.0


In 1.10, we made great progress in the performance and functionality of batch processing. In our internal tests, the performance is significantly ahead of Hive.

But these results are hard for users to reproduce: they need to research TPC-DS themselves before they can write the test code.

We can consider changing the TPC-DS E2E test to use HiveCatalog, roughly divided into two stages:
 # The first stage is the preparation in Hive: create the TPC-DS tables, insert the data, prepare the metastore, and analyze the tables.
 # The second stage is the analysis in Flink: only run the select queries and check the results.

Users can then experiment with it simply by changing the data scale in the first stage.
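
As a rough illustration of the two stages described above, the following Python sketch generates the Hive statements for the first stage from a single scale parameter. It is only a sketch under assumptions, not the actual test code: the function names, the /tmp/tpcds path layout, the one-.dat-file-per-table convention, and the queryN.sql file naming are all hypothetical, and only three of the TPC-DS tables are listed.

```python
# Sketch only: table subset, paths, and function names are illustrative assumptions.
TABLES = ["date_dim", "item", "store_sales"]  # a small subset of the TPC-DS tables


def stage_one_statements(scale_gb, data_root="/tmp/tpcds"):
    """Build Hive statements that load pre-generated data and collect statistics."""
    data_dir = f"{data_root}/sf{scale_gb}"
    statements = []
    for table in TABLES:
        # Assumes the table DDL was already executed and that the data
        # generator wrote one <table>.dat file per table under data_dir.
        statements.append(
            f"LOAD DATA LOCAL INPATH '{data_dir}/{table}.dat' "
            f"OVERWRITE INTO TABLE {table}"
        )
        # "Analyze the tables": collect table-level statistics in Hive.
        statements.append(f"ANALYZE TABLE {table} COMPUTE STATISTICS")
    return statements


def stage_two_queries(query_ids):
    """Names of the query files the Flink stage would run and validate."""
    return [f"query{qid}.sql" for qid in query_ids]


if __name__ == "__main__":
    for stmt in stage_one_statements(scale_gb=1):
        print(stmt + ";")
```

In this sketch the scale is the only input to the first stage, so switching from a small to a large data set would mean changing one argument, while the second stage stays the same.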



--
This message was sent by Atlassian Jira
(v8.3.4#803005)