Posted to user@hive.apache.org by Jim Green <op...@gmail.com> on 2015/07/21 20:12:06 UTC

Hive on Tez query failed with "wrong key class"

Hi Team,

Env: Hive 1.0 on Tez 0.5.3
The query is a simple group-by on top of a sequence file table.

It fails with the error below in Tez mode:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
java.io.IOException: java.io.IOException: wrong key class:
org.apache.hadoop.io.BytesWritable is not class
org.apache.hadoop.io.NullWritable

It works fine in MR mode.
Has anyone met this issue before?

-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)

RE: Hive on Tez query failed with "wrong key class"

Posted by Bikas Saha <bi...@hortonworks.com>.

Also, I believe you are comparing the Tez code for IFile (which is intermediate data) vs code for SequenceFile (which is the final output or initial input from stable storage like HDFS). So they may not be related.

-----Original Message-----
From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of Gopal Vijayaraghavan
Sent: Monday, July 27, 2015 9:20 PM
To: user@tez.apache.org; user@hive.apache.org
Cc: Jim Green <op...@gmail.com>
Subject: Re: Hive on Tez query failed with "wrong key class"




> From the java code which creates the sequence file, it has set the key 
>class to NullWritable.class:
> job.setOutputKeyClass(org.apache.hadoop.io.NullWritable.class);
...
> I think that caused the mismatch:
> wrong key class: org.apache.hadoop.io.BytesWritable is not class 
>org.apache.hadoop.io.NullWritable

In all likelihood, the exception you're hitting originates from here:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java#L2328


> Anyone knows why Tez will check the key and value class when doing 
>sort stuff?

As I said in my earlier mail, if you can check the SequenceFile headers and they look like my pasted pair, then we know it's the same as the known issue.

Cheers,
Gopal

Re: Hive on Tez query failed with "wrong key class"

Posted by Gopal Vijayaraghavan <go...@apache.org>.



> From the java code which creates the sequence file, it has set the key
>class to NullWritable.class:
> job.setOutputKeyClass(org.apache.hadoop.io.NullWritable.class);
...
> I think that caused the mismatch:
> wrong key class: org.apache.hadoop.io.BytesWritable is not class
>org.apache.hadoop.io.NullWritable

In all likelihood, the exception you're hitting originates from here:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java#L2328
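
For readers without the Hadoop source at hand, the kind of check behind this
message can be sketched in plain Java. The classes below (BytesKey, NullKey,
HeaderCheckedReader) are hypothetical stand-ins, not Hadoop types, and the
message format is an assumption inferred from the error text quoted in this
thread: the first class is the key object the caller passed in, the second is
the class recorded in the file header.

```java
import java.io.IOException;

// Hypothetical stand-ins for Writable key types; illustration only.
class BytesKey {}
class NullKey {}

// Hypothetical miniature of a reader whose file header declares a key class.
class HeaderCheckedReader {
    // The key class named in the (imaginary) file header.
    private final Class<?> declaredKeyClass;

    HeaderCheckedReader(Class<?> declaredKeyClass) {
        this.declaredKeyClass = declaredKeyClass;
    }

    // The caller-supplied key object must be exactly the class named in the
    // header; otherwise the read is rejected before any data is decoded.
    boolean next(Object key) throws IOException {
        if (key.getClass() != declaredKeyClass) {
            // Class.toString() prints "class <name>", which would explain the
            // "is not class ..." wording seen in the error.
            throw new IOException("wrong key class: " + key.getClass().getName()
                    + " is not " + declaredKeyClass);
        }
        return true; // a real reader would deserialize the next record here
    }
}
```

Under these assumptions, a key object created for one file (say a BytesKey)
and then reused against a second file whose header declares NullKey is
rejected by next() with a message of the same shape as the one in the trace.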


> Anyone knows why Tez will check the key and value class when doing sort
>stuff?

As I said in my earlier mail, if you can check the SequenceFile headers
and they look like my pasted pair, then we know it's the same as the known
issue.

Cheers,
Gopal

Re: Hive on Tez query failed with "wrong key class"

Posted by Jim Green <op...@gmail.com>.

Hi Team,

Some clues:
The Java code that creates the sequence file sets the key class to
NullWritable:
job.setOutputKeyClass(org.apache.hadoop.io.NullWritable.class);

However, per the Hive source code, the key class for the sequence file
writer should be BytesWritable.
HiveSequenceFileOutputFormat.java:
final SequenceFile.Writer outStream = Utilities.createSequenceWriter(jc,
    fs, finalOutPath, BytesWritable.class, valueClass, isCompressed);

I think that caused the mismatch:
wrong key class: org.apache.hadoop.io.BytesWritable is not class
org.apache.hadoop.io.NullWritable

Then I looked into the Tez source code and found what I think is the reason in:
tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/IFile.java
/**
 * Send key/value to be appended to IFile. To represent same key as previous
 * one, send IFile.REPEAT_KEY as key parameter. Should not call this method with
 * IFile.REPEAT_KEY as the first key.
 *
 * @param key
 * @param value
 * @throws IOException
 */
public void append(Object key, Object value) throws IOException {
  checkArgument((key == REPEAT_KEY || key.getClass() == keyClass),
      WRONG_KEY_CLASS,
      key.getClass(), keyClass);

The IFile code above should be specific to Tez. Hive does not have code that
checks the key class and value class.
Does anyone know why Tez checks the key and value class when sorting?

Thanks.



On Tue, Jul 21, 2015 at 5:26 PM, Jim Green <op...@gmail.com> wrote:

> Sample stacktrace is :
> [Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>         ... 13 more
> Caused by: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:363)
>         at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>         at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:126)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
>         ... 15 more
> Caused by: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2495)
>         at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:358)
>         ... 21 more
> ],
>
> On Tue, Jul 21, 2015 at 11:26 AM, Bikas Saha <bi...@hortonworks.com> wrote:
>
>> A full stack trace would help determine is this is a Tez issue or hive
>> issue.
>>
>> From: Jim Green [mailto:openkbinfo@gmail.com]
>> Sent: Tuesday, July 21, 2015 11:12 AM
>> To: user@tez.apache.org; user@hive.apache.org
>> Subject: Hive on Tez query failed with "wrong key class"
>>
>> Hi Team,
>>
>> Env: Hive 1.0 on Tez 0.5.3
>> Query is a simple group-by on top of sequence table.
>>
>> It fails with below error on tez mode:
>> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.io.IOException: java.io.IOException: wrong key class:
>> org.apache.hadoop.io.BytesWritable is not class
>> org.apache.hadoop.io.NullWritable
>>
>> And it works fine in MR mode.
>> Anyone met this issue before?
>>
>> --
>> Thanks,
>> www.openkb.info
>> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
>
> --
> Thanks,
> www.openkb.info
> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)



-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)

Re: Hive on Tez query failed with "wrong key class"

Posted by Jim Green <op...@gmail.com>.

Sample stack trace:
[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
        ... 13 more
Caused by: java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:363)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
        at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:126)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
        ... 15 more
Caused by: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2495)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:358)
        ... 21 more
],



On Tue, Jul 21, 2015 at 11:26 AM, Bikas Saha <bi...@hortonworks.com> wrote:

> A full stack trace would help determine is this is a Tez issue or hive
> issue.
>
> From: Jim Green [mailto:openkbinfo@gmail.com]
> Sent: Tuesday, July 21, 2015 11:12 AM
> To: user@tez.apache.org; user@hive.apache.org
> Subject: Hive on Tez query failed with "wrong key class"
>
> Hi Team,
>
> Env: Hive 1.0 on Tez 0.5.3
> Query is a simple group-by on top of sequence table.
>
> It fails with below error on tez mode:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.io.IOException: java.io.IOException: wrong key class:
> org.apache.hadoop.io.BytesWritable is not class
> org.apache.hadoop.io.NullWritable
>
> And it works fine in MR mode.
> Anyone met this issue before?
>
> --
> Thanks,
> www.openkb.info
> (Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)



-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)

RE: Hive on Tez query failed with "wrong key class"

Posted by Bikas Saha <bi...@hortonworks.com>.

A full stack trace would help determine if this is a Tez issue or a Hive issue.

From: Jim Green [mailto:openkbinfo@gmail.com]
Sent: Tuesday, July 21, 2015 11:12 AM
To: user@tez.apache.org; user@hive.apache.org
Subject: Hive on Tez query failed with "wrong key class"

Hi Team,

Env: Hive 1.0 on Tez 0.5.3
Query is a simple group-by on top of sequence table.

It fails with below error on tez mode:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
java.io.IOException: java.io.IOException: wrong key class: org.apache.hadoop.io.BytesWritable is not class org.apache.hadoop.io.NullWritable

And it works fine in MR mode.
Anyone met this issue before?

--
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)

Re: Hive on Tez query failed with "wrong key class"

Posted by Gopal Vijayaraghavan <go...@apache.org>.

> Query is a simple group-by on top of sequence table.
...
> java.io.IOException: java.io.IOException: wrong key class:
>org.apache.hadoop.io.BytesWritable is not class
>org.apache.hadoop.io.NullWritable

I have seen this issue when mixing sequence files written by Pig with
sequence files written by Hive, primarily because the data ingestion
wasn't done properly via HCatalog writers.

In the last report, the first sequence file had this as its header:

M?.io.LongWritable"org.apache.hadoop.io.BytesWritable)org.apache.hadoop.io.
compress.SnappyCodec??


and the second one had

SEQ!org.apache.hadoop.io.LongWritableorg.apache.hadoop.io.Text)org.apache.h
adoop.io.compress.SnappyCodec?
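
The two header dumps above follow the SequenceFile layout: the magic bytes
SEQ, a version byte, then the key and value class names, each preceded by its
length (the !, " and ) characters in the dumps are those length bytes: 33, 34
and 41). A minimal sketch of pulling the two class names out of a raw stream,
in plain Java without Hadoop on the classpath, might look like the following.
It assumes format version 6 and class names shorter than 128 bytes, so that
each length fits in one byte; SeqHeaderPeek and its methods are names made up
for this example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: peeks at a SequenceFile header.
class SeqHeaderPeek {

    // Reads one length-prefixed UTF-8 string. Assumption: the length fits in
    // a single non-negative byte (0..127). The real format uses a
    // variable-length integer, which this sketch does not decode.
    static String readShortString(DataInputStream in) throws IOException {
        int len = in.readByte();
        if (len < 0) {
            throw new IOException("multi-byte length not handled in this sketch");
        }
        byte[] buf = new byte[len];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Returns {keyClassName, valueClassName} read from the start of the stream.
    static String[] readKeyValueClasses(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("not a SequenceFile: missing SEQ magic");
        }
        int version = in.readByte();
        if (version != 6) {
            // Other versions are deliberately rejected rather than guessed at.
            throw new IOException("version not handled in this sketch: " + version);
        }
        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        return new String[] { keyClass, valueClass };
    }
}
```

Running this over the first bytes of each file under the table directory and
comparing the results should show whether the key or value classes differ
between files. With Hadoop on the classpath, SequenceFile.Reader's
getKeyClassName() and getValueClassName() report the same two names.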


You can cross-check the exception trace and make sure that the exception
is coming from the RecordReader as the k-v pairs change types between
files.

This mostly doesn't show up in Hive on MR at small scale, but it can
happen with both MR and Tez.

To hit this via CombineInputFormat, you need a file which has been split
up between machines and two such files to generate a combined split of
mismatched schema.

Tez is more aggressive at splitting, since it relies on the file format
splits, not HDFS locations.

If you confirm that this is indeed the cause of the issue, I might have an
idea how to fix it.

Cheers,
Gopal
