Posted to reviews@impala.apache.org by "ttttttz (Code Review)" <ge...@cloudera.org> on 2022/06/10 03:21:29 UTC

[Impala-ASF-CR](3.x) IMPALA-11296: Fix infinite loop when reading orc files

ttttttz has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18571 )

Change subject: IMPALA-11296: Fix infinite loop when reading orc files
......................................................................

IMPALA-11296: Fix infinite loop when reading orc files

When querying an ORC table, a query that selects only fields missing from the ORC files runs indefinitely. On the corresponding execution node, some threads remain resident and consume CPU abnormally. The cause is that when OrcComplexColumnReader.children_.empty() is true, OrcComplexColumnReader.row_idx_ never advances, so the loop in HdfsOrcScanner::TransferTuples() never terminates. We should allow empty 'children_' for original files.
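
The failure mode described above can be illustrated with a simplified sketch. This is not the Impala code and not the patch itself: ToyComplexReader, AdvanceBuggy, AdvanceFixed, TransferRows and the max_iters guard are hypothetical names invented for illustration. It only shows how a row index that advances solely as a side effect of having children stalls a drain loop when the child list is empty, and how advancing it unconditionally restores progress.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a complex-column reader (hypothetical, not Impala's class).
struct ToyComplexReader {
  std::vector<int> children;  // empty when every selected field is missing
  int64_t row_idx = 0;        // next row to hand out from the current batch
  int64_t batch_rows = 0;     // number of rows in the current batch

  // Buggy variant: the row index moves only when there is at least one child.
  void AdvanceBuggy() {
    if (!children.empty()) ++row_idx;
  }

  // Progress-guaranteeing variant (an assumption, not the actual fix):
  // the row index moves whether or not there are children.
  void AdvanceFixed() {
    ++row_idx;
  }
};

// Drains the current batch. Returns the number of iterations used, or -1 if
// max_iters is reached first, which stands in for the infinite loop.
int64_t TransferRows(ToyComplexReader& reader, bool use_fixed, int64_t max_iters) {
  int64_t iters = 0;
  while (reader.row_idx < reader.batch_rows) {
    if (iters == max_iters) return -1;  // guard so the sketch always terminates
    if (use_fixed) {
      reader.AdvanceFixed();
    } else {
      reader.AdvanceBuggy();
    }
    ++iters;
  }
  return iters;
}
```

With an empty child list and a three-row batch, TransferRows with use_fixed = false hits the guard and returns -1, while use_fixed = true returns after three iterations. A reader with at least one child terminates in both variants.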

Testing:
- Added a test to test_scanners.py that verifies a query selecting only the missing fields of ORC files completes successfully.

Change-Id: Ic7ecf5e9c94ffcc02d3ca6c2ec8d55a685ec3968
---
M be/src/exec/orc-column-readers.cc
M tests/query_test/test_scanners.py
2 files changed, 26 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/18571/4
-- 
To view, visit http://gerrit.cloudera.org:8080/18571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 3.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7ecf5e9c94ffcc02d3ca6c2ec8d55a685ec3968
Gerrit-Change-Number: 18571
Gerrit-PatchSet: 4
Gerrit-Owner: ttttttz <24...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: ttttttz <24...@qq.com>