Posted to user@hive.apache.org by Roberto Congiu <ro...@openx.org> on 2010/02/12 22:05:08 UTC
SerDe issue
Hey guys,
I wrote a SerDe to support lwes (http://lwes.org) using BinarySortableSerDe
as a model.
The code is very similar, and I serialize an lwes event to a BytesWritable,
and deserialize from it.
Serialization is fine. However, when I run an insert into ... select, the
deserialize method is passed a Text object instead of the expected
BytesWritable object.
Hive generates two jobs, and the query fails in the mapper of the second one.
getSerializedClass() returns the correct class:
    public Class<? extends Writable> getSerializedClass() {
        LOG.debug("JournalSerDe::getSerializedClass()");
        return BytesWritable.class;
    }
And I don't see any relevant difference between BinarySortableSerDe and my
code.
Does anybody have a hint about what may be happening?
Thanks,
Roberto
Re: SerDe issue
Posted by Roberto Congiu <ro...@openx.org>.
Thanks Zheng,
I didn't think of that.
That worked!
R.
On Fri, Feb 12, 2010 at 1:19 PM, Zheng Shao <zs...@gmail.com> wrote:
> Hi Roberto,
>
> The reason that Text is passed in is because the table is defined as
> TextFile format (the default).
>
> There are some examples (*.q files) of using SequenceFile format (
> CREATE TABLE xxx .... STORED AS SEQUENCEFILE).
> SEQUENCEFILE will return BytesWritable by default.
>
> Please have a try.
>
> Zheng
Re: SerDe issue
Posted by Zheng Shao <zs...@gmail.com>.
Hi Roberto,
Text is passed in because the table is stored in TextFile format (the
default).
There are some examples (*.q files) that use SequenceFile format
(CREATE TABLE xxx .... STORED AS SEQUENCEFILE).
A SEQUENCEFILE table will return BytesWritable by default.
Please give it a try.
Zheng
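
A mismatch like this is easier to spot if the SerDe checks the type of the object it is handed and fails with a message that names the actual class. The following is only a minimal sketch of that idea, not code from the thread: DeserializeGuard and payloadOf are hypothetical names, and the nested Writable, Text and BytesWritable types are stand-ins for the org.apache.hadoop.io classes so that the example compiles without Hadoop on the classpath. In a real SerDe the check would sit at the top of deserialize(Writable) and would most likely throw a SerDeException instead.

```java
import java.nio.charset.StandardCharsets;

public class DeserializeGuard {
    // Hypothetical stand-ins for org.apache.hadoop.io.Writable, Text and
    // BytesWritable; a real SerDe would import the Hadoop classes instead.
    interface Writable {}

    static final class Text implements Writable {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class BytesWritable implements Writable {
        final byte[] bytes;
        BytesWritable(byte[] bytes) { this.bytes = bytes; }
    }

    // Returns the raw payload, or fails fast with a message that names the
    // class actually received (e.g. Text when the table is a TextFile table).
    static byte[] payloadOf(Writable blob) {
        if (blob instanceof BytesWritable) {
            return ((BytesWritable) blob).bytes;
        }
        String actual = (blob == null) ? "null" : blob.getClass().getSimpleName();
        throw new IllegalArgumentException(
            "Expected BytesWritable but got " + actual
            + "; check the table's STORED AS clause");
    }

    public static void main(String[] args) {
        byte[] event = "lwes-event".getBytes(StandardCharsets.UTF_8);

        // The type a SEQUENCEFILE table is described as handing over.
        System.out.println(payloadOf(new BytesWritable(event)).length);

        // The type a TextFile table hands over: rejected with a clear message.
        try {
            payloadOf(new Text("lwes-event"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, the failing mapper would report the unexpected Text object directly rather than failing later in the parsing code.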
--
Yours,
Zheng