Posted to user@hive.apache.org by "Natarajan, Prabakaran 1. (NSN - IN/Bangalore)" <pr...@nsn.com> on 2014/07/31 14:50:29 UTC

Hadoop and Hive Performance Tuning

Hi

I am running Hive queries on structured data stored as RCFile.

Can you please let me know the key performance parameters that I have to tune for better query performance (for Hadoop 2.3/YARN and Hive 0.13)?

Thanks and Regards
Prabakaran.N  aka NP
nsn, Bangalore
When "I" is replaced by "We" - even Illness becomes "Wellness"

RE: Hadoop and Hive Performance Tuning

Posted by Juan Martin Pampliega <jp...@gmail.com>.
Using ORC is a very good idea. Also, since you are using Hive 0.13, you
might want to try Tez as the query engine and enable vectorization. The
following link is a good guide for that:
http://hortonworks.com/hadoop-tutorial/supercharging-interactive-queries-hive-tez/
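
For reference, the two suggestions above correspond roughly to the session-level
settings below. This is only a sketch, assuming Tez is already installed and
configured on the cluster. Note that in Hive 0.13 vectorized execution applies
only to ORC tables, so it will not help while the data is still in RCFile.

```sql
-- Run this session's queries on Tez instead of MapReduce
-- (assumes Tez is already set up on the cluster).
SET hive.execution.engine=tez;

-- Process rows in batches rather than one at a time.
-- In Hive 0.13 this only takes effect for ORC-backed tables.
SET hive.vectorized.execution.enabled=true;
```

Both settings can be tried per session first and compared against the same
query on the default engine before changing anything in hive-site.xml.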

RE: Hadoop and Hive Performance Tuning

Posted by Hussain Jamali <hu...@amdocs.com>.
I think the first thing you should consider is changing the file format from RC to ORC.
With newer Hadoop versions, ORC is a much more optimized format than RC. It will give you far better compression and performance than RC files.

Go for Zlib compression + ORC file format.
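
One way to do the conversion is a CREATE TABLE ... AS SELECT from the existing
table. This is a minimal sketch: the table names events_rc and events_orc are
hypothetical, and it assumes an unpartitioned table. For a partitioned table
you would instead create the ORC table explicitly and load it with INSERT.

```sql
-- events_rc (hypothetical name) is the existing RCFile table.
-- events_orc (hypothetical name) is the new copy stored as ORC with Zlib compression.
CREATE TABLE events_orc
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB")
AS
SELECT * FROM events_rc;
```

ZLIB is already the default codec for ORC, so the table property mainly makes
the choice explicit.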

Regards,

Hussain Jamali
