Posted to dev@hive.apache.org by "Mustafa İman (Jira)" <ji...@apache.org> on 2020/12/09 01:41:00 UTC
[jira] [Created] (HIVE-24510) Vectorize compute_bit_vector
Mustafa İman created HIVE-24510:
-----------------------------------
Summary: Vectorize compute_bit_vector
Key: HIVE-24510
URL: https://issues.apache.org/jira/browse/HIVE-24510
Project: Hive
Issue Type: Improvement
Reporter: Mustafa İman
Assignee: Mustafa İman
After https://issues.apache.org/jira/browse/HIVE-23530, almost all compute stats functions are vectorizable. The only function that is not vectorizable is "compute_bit_vector", which is used for NDV (number of distinct values) statistics computation. This causes "create table as select" and "insert overwrite select" queries to run in non-vectorized mode.
Even a very naive implementation of vectorized compute_bit_vector gives about a 50% performance improvement on simple "insert overwrite select" queries, because the entire mapper or reducer can then run in vectorized mode.
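As a rough illustration of the difference between row mode and vectorized mode, here is a minimal Java sketch. It is not Hive's code: ToyBitVector, addValue, addBatch and the linear-counting estimate are hypothetical stand-ins for whatever sketch structure compute_bit_vector actually maintains. It only shows that a batch-oriented entry point can produce the same bit vector as per-row calls while taking a whole column vector at once.

```java
import java.util.BitSet;

/** Toy linear-counting sketch. Hypothetical; NOT Hive's actual NDV bit vector. */
final class ToyBitVector {
    private final BitSet bits;
    private final int numBits;

    ToyBitVector(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // Scramble the value so that nearby longs land in unrelated slots.
    private static long mix(long v) {
        v ^= v >>> 33;
        v *= 0xff51afd7ed558ccdL;
        v ^= v >>> 33;
        return v;
    }

    private int slot(long value) {
        return (int) Math.floorMod(mix(value), (long) numBits);
    }

    /** Row mode: the aggregation is invoked once per row. */
    void addValue(long value) {
        bits.set(slot(value));
    }

    /**
     * Vectorized mode: the aggregation is invoked once per batch and loops
     * over a primitive column array. A null isNull array means "no nulls".
     */
    void addBatch(long[] vector, int size, boolean[] isNull) {
        for (int i = 0; i < size; i++) {
            if (isNull == null || !isNull[i]) {
                bits.set(slot(vector[i]));
            }
        }
    }

    /** Linear-counting estimate: -m * ln(fraction of bits still zero). */
    double estimate() {
        int zeros = numBits - bits.cardinality();
        if (zeros == 0) {
            return Double.POSITIVE_INFINITY; // sketch is saturated
        }
        return -numBits * Math.log((double) zeros / numBits);
    }

    /** Copy of the underlying bits, useful for comparing two sketches. */
    BitSet snapshot() {
        return (BitSet) bits.clone();
    }
}
```

Feeding the same values through addValue one at a time or through a single addBatch call leaves the two sketches with identical bits, so only the calling convention changes, not the resulting statistics.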
--
This message was sent by Atlassian Jira
(v8.3.4#803005)