Posted to issues@hive.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/03/23 15:04:00 UTC
[jira] [Updated] (HIVE-24761) Vectorization: Support PTF - bounded start windows
[ https://issues.apache.org/jira/browse/HIVE-24761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-24761:
--------------------------------
Summary: Vectorization: Support PTF - bounded start windows (was: Support vectorization for bounded windows in PTF)
> Vectorization: Support PTF - bounded start windows
> --------------------------------------------------
>
> Key: HIVE-24761
> URL: https://issues.apache.org/jira/browse/HIVE-24761
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code}
> notVectorizedReason: PTF operator: *** only UNBOUNDED start frame is supported
> {code}
> Currently, bounded windows are not supported in VectorPTFOperator. If we simply remove the compile-time check:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L2911
> {code}
> if (!windowFrameDef.isStartUnbounded()) {
>   setOperatorIssue(functionName + " only UNBOUNDED start frame is supported");
>   return false;
> }
> {code}
> we get incorrect results, because the vectorized code path completely ignores frame boundaries and simply iterates through all the input batches in [VectorPTFGroupBatches|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java#L172]:
> {code}
> for (VectorPTFEvaluatorBase evaluator : evaluators) {
>   evaluator.evaluateGroupBatch(batch);
>   if (isLastGroupBatch) {
>     evaluator.doLastBatchWork();
>   }
> }
> {code}
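To illustrate why ignoring the frame start produces wrong results, here is a minimal standalone Java sketch. It is not Hive code: the class and method names are hypothetical, and it assumes a single partition of long values, a SUM aggregate, and a ROWS frame that ends at CURRENT ROW. An evaluator that only accumulates over everything it has seen behaves like unboundedStartSum, whereas a bounded start frame also requires dropping rows that have slid out of the window, as in boundedStartSum.

```java
// Hypothetical illustration only; none of these names exist in Hive.
final class BoundedWindowSketch {

    // SUM over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
    // a plain running total, which is all an accumulate-only evaluator computes.
    static long[] unboundedStartSum(long[] values) {
        long[] out = new long[values.length];
        long running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i];
            out[i] = running;
        }
        return out;
    }

    // SUM over ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW:
    // same running total, but the row that just left the frame is subtracted.
    static long[] boundedStartSum(long[] values, int preceding) {
        long[] out = new long[values.length];
        long running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i];
            int expired = i - preceding - 1; // index of the row that fell out of the frame
            if (expired >= 0) {
                running -= values[expired];
            }
            out[i] = running;
        }
        return out;
    }
}
```

For the input 1, 2, 3, 4 with a frame of 1 PRECEDING, the two methods agree only on the first two rows; from the third row onward the running total keeps rows that the bounded frame should have excluded.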
--
This message was sent by Atlassian Jira
(v8.3.4#803005)