Posted to user@hive.apache.org by Yue Liu <ai...@gmail.com> on 2015/07/11 05:42:47 UTC

How to use the index in Parquet to improve the query

Hi, All,

I am using Hive 1.2.1 and store my tables as Parquet. I have the following
query:

select count(1)
from lineitem
where l_quantity=1.0;

I read in the Parquet documentation that Parquet keeps min and max
statistics, similar to ORC, which can be used to skip unrelated data.

But I notice that the record count reported by the RECORDS_IN counter is the
same as the row count of the whole table.

That is, the index in Parquet does not seem to take effect.

What could be the reasons?

Thanks!
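
A minimal sketch of session settings that may be worth checking, assuming the filter is simply not being handed down to the Parquet reader. Both property names are existing Hive configuration properties, but nothing in this thread confirms that they are the cause here or that they change the RECORDS_IN counter. Note also that min/max statistics can only skip a block of rows when the searched value falls outside that block's range, so if l_quantity values are spread evenly through the files there may be nothing to skip.

```sql
-- Assumption: run in the same Hive session, before the query.
-- Predicate pushdown in the optimizer.
SET hive.optimize.ppd=true;
-- Assumed here to let the table scan pass the filter on to the file reader.
SET hive.optimize.index.filter=true;

-- The query from the message above, unchanged.
select count(1)
from lineitem
where l_quantity=1.0;
```

Comparing the RECORDS_IN counter before and after changing these settings would show whether they make any difference on this table.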

Re: Hive install with PostgreSQL

Posted by Bharath Kumar <bh...@gmail.com>.
No, you need not. Install it on only one node.

On Sun, Jul 12, 2015 at 9:52 AM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

> I want to have Hive store its metadata in PostgreSQL. Do I need to
> have an instance of PostgreSQL on every node?
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics, LLC
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>



-- 
Warm Regards,
Bharath Kumar

Hive install with PostgreSQL

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
I want to have Hive store its metadata in PostgreSQL. Do I need to have an instance of PostgreSQL on every node?

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData