Posted to user@hive.apache.org by 김영우 <wa...@gmail.com> on 2010/08/31 04:13:50 UTC

'hive.merge.mapfiles' is broken in trunk

Hi folks,

'hive.merge.mapfiles=true' is the default in trunk, but I get the error
below:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.lib.CombineFileInputFormat.createPool(Lorg/apache/hadoop/mapred/JobConf;[Lorg/apache/hadoop/fs/PathFilter;)V
        at org.apache.hadoop.hive.shims.Hadoop20Shims$CombineFileInputFormatShim.createPool(Hadoop20Shims.java:322)
        at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:303)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:851)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:822)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:610)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:120)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:895)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:764)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:640)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:199)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:353)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

However, after 'SET hive.merge.mapfiles=false', my query works fine. It is a
simple INSERT ... SELECT ... query.
Has anyone experienced this before?

I'm using CDH3 and Hive 0.7 (trunk).

Thanks,

Youngwoo
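
A note on reading the error above: the text that follows createPool in the NoSuchMethodError message is a JVM method descriptor. The sketch below decodes such a descriptor into a readable Java signature. DescriptorDecoder is a hypothetical helper written for this note, not part of Hive or Hadoop, and it assumes a well-formed descriptor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hive or Hadoop): turns a JVM method
// descriptor, as printed in a NoSuchMethodError, into a readable signature.
class DescriptorDecoder {

    // Decodes one type starting at pos[0] and advances pos[0] past it.
    static String decodeType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') {   // each '[' is one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'L': {                     // object type: Lsome/pkg/Name;
                int end = d.indexOf(';', pos[0]);
                base = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            case 'V': base = "void"; break;
            case 'Z': base = "boolean"; break;
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'S': base = "short"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'F': base = "float"; break;
            case 'D': base = "double"; break;
            default:
                throw new IllegalArgumentException("bad descriptor: " + d);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    // Decodes "(params)return" into "return name(param, param, ...)".
    static String decode(String methodName, String descriptor) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        int[] pos = {1};                    // skip the opening '('
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(decodeType(descriptor, pos));
        }
        pos[0]++;                           // skip the closing ')'
        String returnType = decodeType(descriptor, pos);
        return returnType + " " + methodName
                + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        // Descriptor copied from the error message in this thread.
        System.out.println(decode("createPool",
                "(Lorg/apache/hadoop/mapred/JobConf;[Lorg/apache/hadoop/fs/PathFilter;)V"));
    }
}
```

For the descriptor in this thread, the decoded form is void createPool(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.fs.PathFilter[]). In other words, the JVM could not find a createPool method with exactly that parameter list and return type in the CombineFileInputFormat class it loaded at run time.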

Re: 'hive.merge.mapfiles' is broken in trunk

Posted by 김영우 <wa...@gmail.com>.
Ning,

I've just found a similar issue: http://bit.ly/d5zc8G
CDH3's CombineFileInputFormat is incompatible with Hadoop 0.20.2 :-(

Thanks for your quick reply.

Youngwoo

2010/8/31 Ning Zhang <nz...@facebook.com>

> I think it is because CDH does not support CombineFileInputFormat (or it is
> incompatible with Hadoop 0.20.2). If you want to merge, you can set
> hive.mergejob.maponly=false, then it will not use CombineFileInputFormat.

Re: 'hive.merge.mapfiles' is broken in trunk

Posted by Ning Zhang <nz...@facebook.com>.
I think this is because CDH does not support CombineFileInputFormat (or its version is incompatible with Hadoop 0.20.2). If you still want to merge the files, you can set hive.mergejob.maponly=false; the merge job will then not use CombineFileInputFormat.
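
One way to check this kind of incompatibility is to ask the JVM which createPool overloads the class on the classpath actually declares. The following is only a sketch: MethodProbe is a hypothetical helper, not part of Hive or Hadoop, and it assumes it is run with the same Hadoop jars on the classpath that Hive uses. Without those jars it simply reports that the class is missing.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hive or Hadoop.
class MethodProbe {

    // Lists every method named methodName that className itself declares,
    // using whatever jars are on the current classpath.
    static List<String> findOverloads(String className, String methodName) {
        Class<?> cls;
        try {
            // initialize=false: load the class without running static initializers
            cls = Class.forName(className, false,
                    MethodProbe.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                    className + " is not on the classpath", e);
        }
        List<String> found = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());   // full signature, including modifiers
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Defaults are the class and method named in the stack trace.
        String className = args.length > 0 ? args[0]
                : "org.apache.hadoop.mapred.lib.CombineFileInputFormat";
        String methodName = args.length > 1 ? args[1] : "createPool";
        try {
            List<String> overloads = findOverloads(className, methodName);
            if (overloads.isEmpty()) {
                System.out.println(className + " declares no method named "
                        + methodName);
            }
            for (String signature : overloads) {
                System.out.println(signature);
            }
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If none of the printed signatures takes (JobConf, PathFilter[]), the Hive shim was compiled against a different CombineFileInputFormat than the one being loaded, which is consistent with the NoSuchMethodError reported in this thread.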


