Posted to user@hive.apache.org by Tim Harsch <th...@yarcdata.com> on 2014/07/17 01:09:07 UTC
Turning on Tez for Hive
Hi all,
Is there a wiki page somewhere that shows how to turn on Tez for Hive?
I found "hive.execution.engine" in hive-default.xml.template. But I'm sure there must be more. Do I have to install Tez separately?
Thanks,
Tim
Re: Turning on Tez for Hive
Posted by Lefty Leverenz <le...@gmail.com>.
Also, Tez configuration parameters are listed here:
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez
-- Lefty
Re: Turning on Tez for Hive
Posted by Lefty Leverenz <le...@gmail.com>.
The "Hive on Tez" design doc has a couple of links in the Installation and
Configuration
<https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez#HiveonTez-InstallationandConfiguration>
section.
-- Lefty
Re: Turning on Tez for Hive
Posted by Tim Harsch <th...@yarcdata.com>.
Hi Lefty,
I came across those documents as well. They gave me some good hints, but
were in some places too specific to Hortonworks. For the record, I did
get Hive 0.13.1 and Tez 0.4.1 working together (and I immediately saw a
200% speedup on my corpus of queries). For future travelers, here are my
(somewhat raw) notes:
Get the source distro at:
http://www.apache.org/dyn/closer.cgi/incubator/tez/tez-0.4.1-incubating/
Unpack it and run 'mvn package -DskipTests -Dtar'.
Then unpack the tar from tez-dist/target/tez-0.4.1-incubating.tar.gz to a
location which will become TEZ_INSTALL_DIR.
Tez needs an HDFS user directory that matches your Unix user, or you will get
java.io.FileNotFoundException: File does not exist: hdfs:/user/tharsch:
% hdfs dfs -mkdir /user/tharsch; hdfs dfs -chmod g+w /user/tharsch
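For anyone whose user name is not tharsch, the step above could be generalized roughly as follows. This is only a sketch: the function name ensure_hdfs_home and the HDFS_CMD override are made up for illustration (HDFS_CMD=echo gives a dry run), and it assumes the hdfs client is on the PATH.

```shell
# ensure_hdfs_home [USER]: create /user/USER in HDFS and make it group-writable.
# USER defaults to the current Unix user (id -un).
# HDFS_CMD is a hypothetical override for dry runs; by default the real
# hdfs client is called.
ensure_hdfs_home() {
  user=${1:-$(id -un)}
  home="/user/$user"
  ${HDFS_CMD:-hdfs} dfs -mkdir -p "$home" &&
    ${HDFS_CMD:-hdfs} dfs -chmod g+w "$home"
}

# Dry run, printing the commands instead of executing them:
#   HDFS_CMD=echo; ensure_hdfs_home tharsch
```
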
# NOTE: only append to HIVE_AUX_JARS_PATH when it is already set (use an
# 'if [ -z ... ]' test, as below), or you will get
# java.lang.IllegalArgumentException: Can not create a Path from an empty
# string
# NOTE: TEZ_JARS cannot include "*" or you will get the error
# "java.io.FileNotFoundException: File
# file:/home/users/tharsch/apps/tez/tez-0.4.1-incubating/* does not exist"
export TEZ_INSTALL_DIR=/home/users/tharsch/apps/tez/tez-0.4.1-incubating
export TEZ_CONF_DIR=$TEZ_INSTALL_DIR/conf
export TEZ_JARS=$(echo "$TEZ_INSTALL_DIR"/*.jar | tr ' ' ':'):$(echo "$TEZ_INSTALL_DIR"/lib/*.jar | tr ' ' ':')
if [ -z "$HIVE_AUX_JARS_PATH" ]; then
export HIVE_AUX_JARS_PATH="$TEZ_JARS"
else
export HIVE_AUX_JARS_PATH="$HIVE_AUX_JARS_PATH:$TEZ_JARS"
fi
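The echo-and-tr approach above splits on spaces, so it would break on an install path that contains a space, and it leaves a literal "*.jar" in the list if a directory holds no jars. A possible alternative, as a sketch only (join_jars is a hypothetical helper name, not part of Tez or Hive):

```shell
# join_jars DIR: print every *.jar directly under DIR as one
# colon-separated list. Prints an empty line if DIR holds no jars.
join_jars() {
  dir=$1
  out=
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue    # glob did not match: skip the literal pattern
    out="${out:+$out:}$jar"      # add ':' only between entries
  done
  printf '%s\n' "$out"
}

# Possible use with the variables from these notes:
#   TEZ_JARS="$(join_jars "$TEZ_INSTALL_DIR"):$(join_jars "$TEZ_INSTALL_DIR/lib")"
```
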
NOTE: Be sure to copy TEZ jars to HDFS and set HADOOP_CLASSPATH or you
will get:
org.apache.tez.dag.api.TezUncheckedException: Invalid configuration of tez
jars, tez.lib.uris is not defined in the configurartion
export HADOOP_CLASSPATH="${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*"
% hdfs dfs -mkdir -p /apps/tez-0.4.1; hdfs dfs -chmod g+w /apps/tez-0.4.1
% hdfs dfs -copyFromLocal $TEZ_INSTALL_DIR/* /apps/tez-0.4.1
% hdfs dfs -copyFromLocal $TEZ_INSTALL_DIR/lib/* /apps/tez-0.4.1
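One thing worth checking: with TEZ_JARS built as a colon-separated list of jar files (as in the exports earlier in these notes), "${TEZ_JARS}/*" expands to that list with "/*" stuck onto the last jar, which probably does not point at anything. A small sanity check for a local classpath string might look like the sketch below (check_classpath is a hypothetical helper; it only looks at the local filesystem, not HDFS):

```shell
# check_classpath CP: print each entry of the colon-separated classpath CP
# that does not exist locally. An entry ending in "/*" must name a directory.
# Exit status is non-zero if anything is missing.
# The body runs in a subshell so the IFS and globbing changes do not leak.
check_classpath() (
  IFS=:
  set -f                       # do not expand "*" while splitting
  rc=0
  for entry in $1; do
    case $entry in
      '')   continue ;;
      */\*) [ -d "${entry%/\*}" ] && continue ;;
      *)    [ -e "$entry" ] && continue ;;
    esac
    printf 'missing: %s\n' "$entry"
    rc=1
  done
  exit "$rc"
)

# Possible use:
#   check_classpath "$HADOOP_CLASSPATH"
```
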
NOTE: Set mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn-tez</value>
</property>
NOTE: create tez-site.xml in $TEZ_CONF_DIR
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.default.name}/apps/tez-0.4.1/</value>
</property>
</configuration>
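If you would rather script the creation of tez-site.xml, something like the following should produce the same file. It is a sketch under assumptions: write_tez_site is a made-up helper name, and the HDFS path is whatever you copied the jars to. Note the escaped \${fs.default.name}, so that the shell leaves the placeholder for Hadoop to resolve.

```shell
# write_tez_site CONF_DIR HDFS_PATH: write a minimal tez-site.xml into
# CONF_DIR, pointing tez.lib.uris at HDFS_PATH on the default filesystem.
write_tez_site() {
  conf_dir=$1
  hdfs_path=$2
  mkdir -p "$conf_dir" || return 1
  # \$ keeps ${fs.default.name} literal; ${hdfs_path} is filled in by the shell.
  cat > "$conf_dir/tez-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>\${fs.default.name}${hdfs_path}</value>
  </property>
</configuration>
EOF
}

# Possible use with the paths from these notes:
#   write_tez_site "$TEZ_CONF_DIR" /apps/tez-0.4.1/
```
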
NOTE: HIVE Settings:
set hive.execution.engine=tez;
set hive.use.tez.natively=true;
set hive.enable.mrr=true;
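To avoid typing these settings in every session, they could be kept in an init file and passed to the Hive CLI with its -i option. A sketch, assuming the hive CLI is on the PATH (write_tez_init and the file name tez-init.sql are made up for illustration):

```shell
# write_tez_init FILE: write the three Hive settings from these notes
# into FILE, one "set" statement per line.
write_tez_init() {
  cat > "$1" <<'EOF'
set hive.execution.engine=tez;
set hive.use.tez.natively=true;
set hive.enable.mrr=true;
EOF
}

# Possible use:
#   write_tez_init "$HOME/tez-init.sql"
#   hive -i "$HOME/tez-init.sql"
```
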
Re: Turning on Tez for Hive
Posted by Lefty Leverenz <le...@gmail.com>.
Actually those links don't quite match your component versions. (Close,
but they're for Hive 0.13.0 instead of 0.13.1.) The HDP-2.1.3 docs
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_releasenotes_hdp_2.1/content/ch_relnotes-hdp-2.1.3-product.html>
cover Hadoop 2.4.0, Hive 0.13.1, and Tez 0.4.0. Here are the Tez setup links:
- Set Up the Hive/HCatalog Configuration Files
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_installing_manually_book/content/rpm-chap6-3.html>
(see 3.1 Configure Hive and HiveServer2 for Tez, which includes the
configuration information that the previous links found in the Tez chapter)
- Installing and Configuring Apache Tez
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_installing_manually_book/content/rpm-chap-tez.html>
(see 10.4 Enable Tez for Hive Queries
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_installing_manually_book/content/rpm-chap-tez-enable_tez_for_hive_queries.html>)
-- Lefty
Re: Turning on Tez for Hive
Posted by Lefty Leverenz <le...@gmail.com>.
You might also find useful information in HDP's Hive and HCatalog
installation instructions (section 3.1 "Configure Hive and HiveServer2 for
Tez") here:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap6-3.html
-- Lefty
Re: Turning on Tez for Hive
Posted by Tim Harsch <th...@yarcdata.com>.
Hi Alex,
Thanks for the reply. I should state I am using vanilla Apache Hadoop 2.4.0 and Hive 0.13.1 (not Hortonworks). And I don't have root access on my pseudo-distributed Hadoop cluster (single node).
I tried your suggestion for a query:
hive> explain select count(*) from web_sales;
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/tez/dag/api/client/StatusGetOpts
I'm thinking I must install Tez separately from source. And then set the env vars and HADOOP_CLASSPATH as per:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez_configure_tez.html
Tim
Re: Turning on Tez for Hive
Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Just set the execution engine; this can be done per Hive query too:
hive > set hive.execution.engine=tez;
hive > select count (*) from what_ever;
If you want to use HS2 with Tez, follow the documentation here:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html
- Alex