Posted to dev@hive.apache.org by "Shubham Chaurasia (Jira)" <ji...@apache.org> on 2020/02/18 14:49:00 UTC
[jira] [Created] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause
Shubham Chaurasia created HIVE-22903:
----------------------------------------
Summary: Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause
Key: HIVE-22903
URL: https://issues.apache.org/jira/browse/HIVE-22903
Project: Hive
Issue Type: Bug
Components: UDF, Vectorization
Affects Versions: 4.0.0
Reporter: Shubham Chaurasia
Assignee: Shubham Chaurasia
The vectorized row_number() implementation resets the row number when a constant expression is passed in the partition clause.
Repro queries:
{code}
select row_number() over(partition by 1) r1, t from over10k_n8;
-- or
select row_number() over() r1, t from over10k_n8;
{code}
where the table over10k_n8 contains more than 1024 records.
This happens because VectorPTFOperator currently resets its evaluators whenever only a partition clause is present:
{code:java}
// If we are only processing a PARTITION BY, reset our evaluators.
if (!isPartitionOrderBy) {
  groupBatches.resetEvaluators();
}
{code}
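To illustrate the symptom, here is a toy model (not Hive code) of a row counter that is reset after every batch. It assumes that the 1024-row threshold mentioned above corresponds to one vectorized batch; the class and method names are hypothetical and exist only for this sketch.

```java
// Toy model only: none of these names come from Hive.
final class RowNumberBatchDemo {
    // Assumption: one vectorized batch holds 1024 rows.
    static final int BATCH_SIZE = 1024;

    /**
     * Assigns row numbers to totalRows rows that all belong to a single
     * partition, processing them batch by batch. When resetPerBatch is true,
     * the counter is reset after each batch, which models the reported bug.
     */
    static long[] rowNumbers(int totalRows, boolean resetPerBatch) {
        long[] out = new long[totalRows];
        long counter = 0;
        for (int start = 0; start < totalRows; start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, totalRows);
            for (int i = start; i < end; i++) {
                out[i] = ++counter;
            }
            if (resetPerBatch) {
                counter = 0; // stands in for the evaluator reset
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] expected = rowNumbers(1030, false);
        long[] observed = rowNumbers(1030, true);
        // Index 1024 is the 1025th row, i.e. the first row of the second batch.
        System.out.println("expected r1 for row 1025: " + expected[1024]);
        System.out.println("observed r1 for row 1025: " + observed[1024]);
    }
}
```

With a single logical partition, the counter should keep increasing across batch boundaries; in this model the per-batch reset makes it restart at 1 on the first row of the second batch.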
To resolve this, we should also check whether the entire partition clause is a constant expression; if so, we should not call {{groupBatches.resetEvaluators()}}.
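The proposed check could look roughly like the sketch below. This is not the actual patch: the flag isPartitionConstant and both helper methods are hypothetical, and how VectorPTFOperator would detect constant partition expressions is left open here.

```java
// Hypothetical sketch of the proposed guard; not taken from the Hive code base.
final class ResetGuardSketch {

    /**
     * Returns true when every partition expression is a constant.
     * An empty array (as in "over()") is treated as constant as well,
     * since all rows then fall into one partition.
     */
    static boolean allConstant(boolean[] partitionExprIsConstant) {
        for (boolean isConstant : partitionExprIsConstant) {
            if (!isConstant) {
                return false;
            }
        }
        return true;
    }

    /**
     * Reset only when processing a PARTITION BY alone AND the partition
     * clause is not entirely constant.
     */
    static boolean shouldResetEvaluators(boolean isPartitionOrderBy,
                                         boolean isPartitionConstant) {
        return !isPartitionOrderBy && !isPartitionConstant;
    }

    public static void main(String[] args) {
        // "partition by 1": a single constant expression, so no reset.
        boolean constantCase = allConstant(new boolean[] {true});
        System.out.println(shouldResetEvaluators(false, constantCase));

        // "partition by t": a column reference, so the existing reset still applies.
        boolean columnCase = allConstant(new boolean[] {false});
        System.out.println(shouldResetEvaluators(false, columnCase));
    }
}
```

Under these assumptions, both repro queries (partition by 1 and over()) would skip the reset, while a partition clause that references a column would behave as before.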