You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/04/22 11:11:00 UTC
[jira] [Created] (HIVE-26168) EXPLAIN DDL command output is not deterministic
Stamatis Zampetakis created HIVE-26168:
------------------------------------------
Summary: EXPLAIN DDL command output is not deterministic
Key: HIVE-26168
URL: https://issues.apache.org/jira/browse/HIVE-26168
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Stamatis Zampetakis
The EXPLAIN DDL command (HIVE-24596) can be used to recreate the schema for a given query in order to debug planner issues. This is achieved by fetching information from the metastore and outputting series of DDL commands.
The output commands though may appear in different order among runs since there is no mechanism to enforce an explicit order.
Consider for instance the following scenario.
{code:sql}
CREATE TABLE customer
(
`c_custkey` bigint,
`c_name` string,
`c_address` string
);
INSERT INTO customer VALUES (1, 'Bob', '12 avenue Mansart'), (2, 'Alice', '24 avenue Mansart');
EXPLAIN DDL SELECT c_custkey FROM customer WHERE c_name = 'Bob';
{code}
+Result 1+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}
+Result 2+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}
The two results are equivalent but the statements appear in a different order. This is not a big issue cause the results remain correct but it may lead to test flakiness so it might be worth addressing.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)