Posted to user@hive.apache.org by Bejoy Ks <be...@yahoo.com> on 2011/08/18 14:51:58 UTC

Hive crashing after an upgrade - issue with existing larger tables

Hi Experts

I was working with large data volumes on Hive 0.7. Recently my Hive installation was upgraded to 0.7.1. Since the upgrade I'm having a lot of issues with queries that previously worked fine on larger data. Queries that took seconds to return results now take hours, and for most of the larger tables the MapReduce jobs are not even getting triggered. Queries like SELECT * and DESCRIBE work fine since they don't involve any MapReduce jobs. For the jobs that didn't get triggered, I got the following error from the JobTracker:

Job initialization failed: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201106061630_6993
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:807)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:701)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4013)
at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

This looks like some metadata issue. My cluster is on CDH3-u0. Has anyone faced similar issues before? Please share your thoughts on what the probable cause of the error could be.
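
One rough way to reason about this error is that the split metadata written for a job grows with the number of input splits, so a very small effective split size on a large table can push it past the 10000000-byte limit in the message. The sketch below only illustrates that arithmetic. The per-split metadata size, the 1 TB table size, and the 1 MB split size are assumptions made up for illustration, not values taken from this cluster, and the function names are hypothetical.

```python
# Limit reported in the error message ("Split metadata size exceeded 10000000").
SPLIT_METADATA_LIMIT_BYTES = 10_000_000

# ASSUMPTION: rough average size of one split's metadata entry (host names,
# offset, length). The real figure depends on the cluster and is not known here.
ASSUMED_BYTES_PER_SPLIT = 100


def estimate_split_count(total_input_bytes: int, max_split_bytes: int) -> int:
    """Rough split count, assuming input is cut into chunks of at most max_split_bytes."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_input_bytes // max_split_bytes)


def estimate_metadata_bytes(split_count: int,
                            bytes_per_split: int = ASSUMED_BYTES_PER_SPLIT) -> int:
    """Rough total split metadata size under the assumed per-split size."""
    return split_count * bytes_per_split


if __name__ == "__main__":
    table_bytes = 10**12  # ASSUMPTION: a hypothetical 1 TB table
    # Compare a hypothetical 1 MB split size with a 256000000-byte split size.
    for max_split in (1_000_000, 256_000_000):
        splits = estimate_split_count(table_bytes, max_split)
        meta = estimate_metadata_bytes(splits)
        status = "exceeds" if meta > SPLIT_METADATA_LIMIT_BYTES else "is within"
        print(f"max split {max_split}: ~{splits} splits, "
              f"~{meta} bytes of metadata, {status} the limit")
```

Under these assumed numbers the small split size produces far more metadata than the limit allows, while the larger split size stays well below it.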

Thank You

Re: Hive crashing after an upgrade - issue with existing larger tables

Posted by Carl Steinbach <ca...@cloudera.com>.

Hi,

The original CDH3U1 release of Hive contained a configuration bug which we
recently fixed in an update. You can get the update by refreshing your Hive
packages. Afterwards please verify that you are using the following Hive
package: hive-0.7.1+42.9

You can also fix the problem by modifying your hive-site.xml file to include
the following setting:

mapred.max.split.size=256000000
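
For reference, a minimal sketch of how that setting would typically look in hive-site.xml. The property name and value are the ones given above; the element layout is the usual Hadoop-style configuration format, and the entry is assumed to go inside the file's existing <configuration> element.

```xml
<!-- Add inside the existing <configuration> element of hive-site.xml -->
<property>
  <name>mapred.max.split.size</name>
  <value>256000000</value>
</property>
```

The same value can presumably also be tried for a single session from the Hive CLI with `SET mapred.max.split.size=256000000;` before running the query, which may be a quicker way to check whether the setting resolves the error.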

Thanks.

Carl

On Thu, Aug 18, 2011 at 8:48 AM, <be...@yahoo.com> wrote:

> A small correction to my previous post. The CDH version is CDH u1 not u0
> Sorry for the confusion
>
> Regards
> Bejoy K S

Re: Hive crashing after an upgrade - issue with existing larger tables

Posted by be...@yahoo.com.

A small correction to my previous post: the CDH version is CDH3 u1, not u0.
Sorry for the confusion.

Regards
Bejoy K S
