Posted to user@hive.apache.org by Igor Kuzmenko <f1...@gmail.com> on 2017/08/21 15:00:47 UTC

Unexpected query result

Running a simple '*select count(*) from test_table*' query returned
500_000.
But when I run '*select count(distinct field) from test_table*', the
result is 500_001.

How can a table with 500_000 records have 500_001 unique
field values?

I'm using Hive from HDP 2.5.0 platform.
Table stored as ORC.

RE: Unexpected query result

Posted by Frank Luo <jl...@merkleinc.com>.
One possibility is that count(*) returns a cached statistic, while count(distinct field) actually reads the data and performs the computation.

Try setting the following and test again:

set hive.compute.query.using.stats=false;
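
A minimal sketch of the full check, assuming the table and column names from the original question (test_table, field). The ANALYZE TABLE step is an extra suggestion, not something confirmed in this thread: if the two counts agree once stats-based answers are disabled, the stored statistics are probably stale, and recomputing them may bring count(*) back in line.

```sql
-- Force Hive to scan the data instead of answering from metastore statistics.
set hive.compute.query.using.stats=false;

-- Re-run both queries; with stats disabled, both read the ORC files.
select count(*) from test_table;
select count(distinct field) from test_table;

-- Assumption: if count(*) now differs from the earlier 500_000, the stored
-- row count was stale. Recompute table statistics before re-enabling the setting.
analyze table test_table compute statistics;
set hive.compute.query.using.stats=true;
```

If count(distinct field) is still larger than count(*) with statistics disabled, stale stats are not the cause and the problem lies elsewhere.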



