Posted to dev@hive.apache.org by "Peter Varga (Jira)" <ji...@apache.org> on 2020/09/14 16:45:00 UTC
[jira] [Created] (HIVE-24162) Query based compaction loses bloom filter
Peter Varga created HIVE-24162:
----------------------------------
Summary: Query based compaction loses bloom filter
Key: HIVE-24162
URL: https://issues.apache.org/jira/browse/HIVE-24162
Project: Hive
Issue Type: Bug
Reporter: Peter Varga
Assignee: Peter Varga
*Steps to reproduce:*
{noformat}
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE TABLE `bloomTest`( |
| `msisdn` string, |
| `imsi` varchar(20), |
| `imei` bigint, |
| `cell_id` bigint) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION |
| 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest' |
| TBLPROPERTIES ( |
| 'bucketing_version'='2', |
| 'orc.bloom.filter.columns'='msisdn,cell_id,imsi', |
| 'orc.bloom.filter.fpp'='0.02', |
| 'transactional'='true', |
| 'transactional_properties'='default', |
| 'transient_lastDdlTime'='1597222946') |
+----------------------------------------------------+
insert into bloomTest values ("a", "b", 10, 20);
insert into bloomTest values ("aa", "bb", 100, 200);
insert into bloomTest values ("aaa", "bbb", 1000, 2000);
select * from bloomTest;
+-------------------+-----------------+-----------------+--------------------+
| bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
+-------------------+-----------------+-----------------+--------------------+
| a                 | b               | 10              | 20                 |
| aa                | bb              | 100             | 200                |
| aaa               | bbb             | 1000            | 2000               |
+-------------------+-----------------+-----------------+--------------------+
{noformat}
- Compact the table
{code:java}
alter table bloomTest compact 'MAJOR';
{code}
- Wait for the compaction to finish, then check the data files for bloom filters.
- The delta files contain the bloom filters, but the base file written by the compaction does not.
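One possible way to carry out the check in the last step is sketched here. SHOW COMPACTIONS is standard HiveQL. The ORC file dump runs from a shell, so it appears only as a comment. The file paths are placeholders, not values from this report, and the column ids 1, 2 and 4 are an assumption based on the column order in the table definition (msisdn, imsi, cell_id).
{code:sql}
-- List compaction requests and their state. Re-run until the entry for
-- bloomtest is no longer 'initiated' or 'working'.
SHOW COMPACTIONS;

-- Then, from a shell (not HiveQL), dump the ORC metadata of one file from a
-- delta directory and one from the new base directory, and compare the output.
-- <delta_file> and <base_file> are placeholders for files under the table location.
--   hive --orcfiledump --rowindex 1,2,4 <delta_file>
--   hive --orcfiledump --rowindex 1,2,4 <base_file>
{code}
If the behaviour matches this report, the dump of the delta file should list bloom filter entries for the selected columns, and the dump of the base file should not.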