Posted to user@hive.apache.org by Roberto Congiu <ro...@openx.com> on 2012/01/20 02:01:09 UTC

error on left/right join, hive 0.8.0

Hey guys,
we found an issue that looks like a bug (Hive 0.8, Cloudera's distribution).

SELECT count(1)
   FROM table1 a LEFT OUTER JOIN table2 b
   ON ( a.key1 = b.key1 AND a.key2 = b.KEY2)

fails with java.lang.IllegalArgumentException: Can not create a Path
from an empty string (full stack trace at the bottom of this email).

The exception stems from this code
(org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)):

public static String getHiveJobID(Configuration job) {
  String planPath = HiveConf.getVar(job, HiveConf.ConfVars.PLAN);
  if (planPath != null) {
    return (new Path(planPath)).getName();
  }
  return null;
}
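
For anyone reading along, here is a rough sketch of why the null check above would not help if the plan path comes back as an empty string, which is what the exception message suggests (the root cause is not confirmed here). It does not use Hadoop: PlanPathSketch, checkPathArg, jobIdNullCheckOnly, jobIdGuarded and lastComponent are hypothetical stand-ins written for illustration, and the guarded variant is only one possible defensive shape, not a patch from the Hive project.

```java
// Hypothetical, Hadoop-free stand-in for the behaviour seen in the stack trace.
final class PlanPathSketch {

    // Simplified imitation of the argument check behind
    // "Can not create a Path from an empty string" (assumption: the real
    // org.apache.hadoop.fs.Path does far more than this).
    static void checkPathArg(String path) {
        if (path == null) {
            throw new IllegalArgumentException("Can not create a Path from a null string");
        }
        if (path.isEmpty()) {
            throw new IllegalArgumentException("Can not create a Path from an empty string");
        }
    }

    // Same shape as the quoted getHiveJobID: only null is guarded,
    // so an empty string still reaches the path check and throws.
    static String jobIdNullCheckOnly(String planPath) {
        if (planPath != null) {
            checkPathArg(planPath);
            return lastComponent(planPath);
        }
        return null;
    }

    // Hypothetical defensive variant: treat an empty plan path like a missing one.
    static String jobIdGuarded(String planPath) {
        if (planPath == null || planPath.isEmpty()) {
            return null;
        }
        checkPathArg(planPath);
        return lastComponent(planPath);
    }

    // Returns the text after the last '/', or the whole string if there is none.
    static String lastComponent(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(jobIdGuarded("/tmp/hive/plan.1234"));
        System.out.println(jobIdGuarded(""));
        try {
            jobIdNullCheckOnly("");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that a guard like this would only hide the second exception; it says nothing about why the plan path was empty in the first place.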


Querying the individual tables works fine, and so does a plain
inner join (the same query without 'LEFT OUTER').
RIGHT JOIN and FULL OUTER JOIN fail as well.

Has anybody else had this issue?

Thanks,
Roberto


Full stack trace:
java.lang.InstantiationException: org.apache.hadoop.hive.ql.io.HiveOutputFormat
	at java.lang.Class.newInstance0(Class.java:340)
	at java.lang.Class.newInstance(Class.java:308)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPath(ExecDriver.java:859)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPaths(ExecDriver.java:903)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:426)
	at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:338)
	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:436)
	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:446)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception
'java.lang.InstantiationException(org.apache.hadoop.hive.ql.io.HiveOutputFormat)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
	at org.apache.hadoop.fs.Path.<init>(Path.java:90)
	at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
	at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
	at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:338)
	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:436)
	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:446)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
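
A side note on the first exception in the trace: it is an InstantiationException thrown from Class.newInstance. One common way to get that exception is to reflectively instantiate a type that cannot be instantiated, such as an interface or an abstract class. Whether that is what happened in this job is not established in this thread; the sketch below only shows the mechanism, with java.lang.Runnable as a stand-in interface and InstantiationSketch as a hypothetical class name.

```java
// Hypothetical illustration of how Class.newInstance can raise InstantiationException.
final class InstantiationSketch {

    // Returns true if reflective no-arg instantiation of the given type
    // throws InstantiationException, false otherwise.
    @SuppressWarnings("deprecation") // Class.newInstance is the call that appears in the trace
    static boolean throwsInstantiationException(Class<?> type) {
        try {
            type.newInstance();
            return false;
        } catch (InstantiationException e) {
            return true;
        } catch (IllegalAccessException e) {
            // A different failure mode (inaccessible constructor), not the one of interest.
            return false;
        }
    }

    public static void main(String[] args) {
        // Runnable is an interface, so it cannot be instantiated directly.
        System.out.println(throwsInstantiationException(Runnable.class));
        // Object has a public no-arg constructor and instantiates fine.
        System.out.println(throwsInstantiationException(Object.class));
    }
}
```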

Re: error on left/right join, hive 0.8.0

Posted by Roberto Congiu <ro...@openx.com>.
Whoops... you're right, and I apologize.

We were using the stock Cloudera 0.7 version but needed some patches
in 0.8, so I took the Cloudera Hive SRPM, changed the spec file,
sources and patches to what we needed, and built a 0.8 distribution
RPM based on the Cloudera file paths etc.

So the code is in an RPM built using the Cloudera spec file and build
scripts from 0.7, but using Hive code from the 0.8 release (plus one
patch to the Thrift server startup script).

I apologize for the confusion.
R.

On Thu, Jan 19, 2012 at 6:47 PM, Carl Steinbach <ca...@cloudera.com> wrote:
> Hi Roberto,
>
> I'm not sure which version of Hive you're using. If you're talking about the
> version of Hive that comes with Cloudera's distribution then it can't be
> version 0.8.0 because we have not yet included that version in CDH.

Re: error on left/right join, hive 0.8.0

Posted by Carl Steinbach <ca...@cloudera.com>.
Hi Roberto,

I'm not sure which version of Hive you're using. If you're talking about
the version of Hive that comes with Cloudera's distribution then it can't
be version 0.8.0 because we have not yet included that version in CDH.
