Posted to user@tez.apache.org by Juho Autio <ju...@rovio.com> on 2015/12/08 09:25:11 UTC

Hive on Tez with Java 8 JRE poor performance

Are people running Hive on Tez with Java 8?

I tried it, and the Hive job took about twice as long as with Java 7.

Java 8 changed some memory management, for example by removing the
permanent generation (so MaxPermSize no longer applies).

Does anyone have suggestions for tuning Tez to run efficiently on Java
8?

RE: Hive on Tez with Java 8 JRE poor performance

Posted by Bikas Saha <bi...@apache.org>.
From an offline conversation with folks at Yahoo: some issues have been observed when running with JDK 8 and 64-bit VMs. There seem to be longer GC cycles in 64-bit VMs, so if the tasks were already on the boundary of GC activity, moving to a 64-bit VM could push them over and slow them down, resulting in an overall slowdown. I'm not sure whether this is related to JDK 8 or not. Some other OOM issues were observed with JDK 8 (but not with JDK 7). Jonathan/Jason from Yahoo could provide more details.

 

Juho,

Do you have the Tez UI enabled on EMR? If yes, you could compare the GC times of the slow and fast jobs to see whether GC caused the slowdown. If you don't have the Tez UI, you could grep the Application Master logs for VERTEX_FINISHED and compare the GC counters dumped in those log entries.
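
The log comparison suggested above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes each VERTEX_FINISHED entry is a single line that contains the vertex as "vertexName=<name>," and the GC counter as "GC_TIME_MILLIS=<integer>". The exact log layout may differ between Tez versions, so adjust the patterns to what your logs actually contain. The function names and the log file arguments are hypothetical.

```python
import re
import sys
from typing import Dict, Iterable

# Assumed layout of a VERTEX_FINISHED log entry (verify against your own logs):
#   ... VERTEX_FINISHED ... vertexName=Map 1, ... GC_TIME_MILLIS=1234, ...
VERTEX_NAME = re.compile(r"vertexName=([^,]+)")
GC_TIME = re.compile(r"GC_TIME_MILLIS=(\d+)")


def gc_millis_by_vertex(lines: Iterable[str]) -> Dict[str, int]:
    """Sum GC_TIME_MILLIS per vertex over all VERTEX_FINISHED lines."""
    totals: Dict[str, int] = {}
    for line in lines:
        if "VERTEX_FINISHED" not in line:
            continue
        gc = GC_TIME.search(line)
        if gc is None:
            continue  # entry without a GC counter, nothing to add
        name = VERTEX_NAME.search(line)
        key = name.group(1).strip() if name else "unknown"
        totals[key] = totals.get(key, 0) + int(gc.group(1))
    return totals


def gc_delta(slow: Dict[str, int], fast: Dict[str, int]) -> Dict[str, int]:
    """Per-vertex difference in GC milliseconds (slow run minus fast run)."""
    return {v: slow.get(v, 0) - fast.get(v, 0) for v in set(slow) | set(fast)}


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python gc_compare.py <slow-run AM log> <fast-run AM log>
    with open(sys.argv[1]) as slow_log, open(sys.argv[2]) as fast_log:
        delta = gc_delta(gc_millis_by_vertex(slow_log), gc_millis_by_vertex(fast_log))
    for vertex, millis in sorted(delta.items()):
        print(f"{vertex}: {millis:+d} ms GC")
```

A large positive difference for most vertices would point towards GC as the cause; differences near zero would suggest looking elsewhere.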

 

Bikas

 



Re: Hive on Tez with Java 8 JRE poor performance

Posted by Juho Autio <ju...@rovio.com>.
Tez version is 0.7.0.

Installing it with the following AWS EMR step:
s3://support.elasticmapreduce/tez/bigtop/install-tez-with-ui.rb

Using EMR release label emr-4.0.0.

Installing Java 8 with https://gist.github.com/pstorch/c217d8324c4133a003c4
(as a bootstrap step).

Currently it installs this version:

$ java -version
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)

Sorry, but I don't have logs or other specifics to provide at this point.
I have already switched back to the mr engine. But it's a pretty basic
Hive script with the tez engine enabled. With Java 8 it takes twice as
long to run as with Java 7.

My cluster has 40 m1.large nodes. What I haven't done is configure the tez
container size in any way. Maybe I would need to, especially for Java 8?
I'm looking for a kind of "best practices" guide for running Tez with
Java 8 instead of 7, if anyone has experience with this migration.
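
For reference, a minimal sketch of what setting the container size explicitly could look like. It is not a recommendation from this thread: the 2048 MB container size and the 0.8 heap fraction are placeholder assumptions, and whether such settings help on Java 8 is untested here. hive.tez.container.size and hive.tez.java.opts are Hive settings; the helper names are hypothetical.

```python
from typing import Dict, List


def tez_memory_settings(container_mb: int, heap_fraction: float = 0.8) -> Dict[str, str]:
    """Derive a JVM heap size from a Tez container size.

    heap_fraction is an assumed rule of thumb that leaves headroom for
    non-heap memory inside the container; tune it for your workload.
    """
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_mb = int(container_mb * heap_fraction)
    return {
        "hive.tez.container.size": str(container_mb),
        "hive.tez.java.opts": f"-Xmx{heap_mb}m",
    }


def as_hive_set_statements(settings: Dict[str, str]) -> List[str]:
    """Render the settings as SET statements for the top of a Hive script."""
    return [f"SET {key}={value};" for key, value in settings.items()]


if __name__ == "__main__":
    # 2048 MB is a placeholder value, not a measured optimum for m1.large.
    for statement in as_hive_set_statements(tez_memory_settings(2048)):
        print(statement)
```

Keeping the heap below the container size matters because the container also has to hold non-heap JVM memory.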

Thanks for helping!


Re: Hive on Tez with Java 8 JRE poor performance

Posted by Tsuyoshi Ozawa <oz...@apache.org>.
Hi,

Could you tell us the version of Tez you have been using?

- Tsuyoshi


Re: Hive on Tez with Java 8 JRE poor performance

Posted by Rajesh Balamohan <rb...@apache.org>.
Can you provide details on the kind of regression you are seeing with 1.8?
Is it regressing for a specific task or for the overall job? If possible,
sharing the application logs with 1.7 and 1.8 would be helpful for
understanding it better.

~Rajesh.B


Re: Hive on Tez with Java 8 JRE poor performance

Posted by Jean-Baptiste Note <jb...@gmail.com>.
Hi Juho,

Which version of Java 8 are you using? Some revisions have performance
problems with Hadoop, on top of issues with Kerberos; you should be using
a revision strictly greater than u45.
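
A quick way to check a node against that threshold is sketched below. It assumes the `java -version` output contains a line of the form `version "1.8.0_NN"`, as in the 1.8.0_65 output shown in this thread; the function names are hypothetical.

```python
import re
from typing import Optional

# Matches e.g.: openjdk version "1.8.0_65"  or  java version "1.8.0"
JAVA8_VERSION = re.compile(r'version "1\.8\.0(?:_(\d+))?')


def java8_update(version_output: str) -> Optional[int]:
    """Return the Java 8 update number, or None if this is not a Java 8 version string."""
    match = JAVA8_VERSION.search(version_output)
    if match is None:
        return None
    return int(match.group(1) or 0)  # "1.8.0" without a suffix counts as update 0


def is_newer_than(version_output: str, update_exclusive: int = 45) -> bool:
    """True only for Java 8 revisions strictly greater than the given update."""
    update = java8_update(version_output)
    return update is not None and update > update_exclusive


if __name__ == "__main__":
    print(is_newer_than('openjdk version "1.8.0_65"'))
```

Note that `java -version` writes to stderr, so capture that stream when feeding its output to the function.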

Kind regards,
JB