Posted to user@hive.apache.org by Roberto Congiu <ro...@openx.org> on 2010/02/12 22:05:08 UTC

SerDe issue

Hey guys,
I wrote a SerDe to support lwes (http://lwes.org) using BinarySortableSerDe
as a model.

The code is very similar, and I serialize an lwes event to a BytesWritable,
and deserialize from it.

Serialization is fine. However, when I run an insert into... select, the
deserialize method is passed a Text object instead of the expected
BytesWritable object.

Hive generates two jobs, and the failure occurs in the mapper of the second one.

getSerializedClass() is set correctly:

    public Class<? extends Writable> getSerializedClass() {
        LOG.debug("JournalSerDe::getSerializedClass()");
        return BytesWritable.class;
    }

I don't see any relevant difference between BinarySortableSerDe and my
code.
Does anybody have a hint about what may be happening?

Thanks,
Roberto

Re: SerDe issue

Posted by Roberto Congiu <ro...@openx.org>.
Thanks Zheng,
I didn't think of that.
That worked!
R.

On Fri, Feb 12, 2010 at 1:19 PM, Zheng Shao <zs...@gmail.com> wrote:

> Hi Roberto,
>
> Text is passed in because the table is defined with the TextFile
> format (the default).
>
> Some of the examples (*.q files) use the SequenceFile format
> (CREATE TABLE xxx .... STORED AS SEQUENCEFILE).
> A SEQUENCEFILE table will return BytesWritable by default.
>
> Please give it a try.
>
> Zheng
>
> --
> Yours,
> Zheng
>

Re: SerDe issue

Posted by Zheng Shao <zs...@gmail.com>.
Hi Roberto,

Text is passed in because the table is defined with the TextFile
format (the default).

Some of the examples (*.q files) use the SequenceFile format
(CREATE TABLE xxx .... STORED AS SEQUENCEFILE).
A SEQUENCEFILE table will return BytesWritable by default.
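
A type check at the top of deserialize can make this kind of mismatch easier
to spot. The following is only a minimal sketch: the nested Writable, Text and
BytesWritable classes are hypothetical stand-ins for the Hadoop classes of the
same names, so that it runs without Hadoop on the classpath, and
DeserializeGuardSketch and its error message are illustrative names that are
not part of Hive or of the SerDe discussed in this thread.

```java
import java.nio.charset.StandardCharsets;

public class DeserializeGuardSketch {

    // Stand-ins (assumptions) for org.apache.hadoop.io.Writable, Text and
    // BytesWritable; a real SerDe would import the Hadoop classes instead.
    interface Writable {}

    static final class Text implements Writable {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class BytesWritable implements Writable {
        final byte[] bytes;
        BytesWritable(byte[] bytes) { this.bytes = bytes; }
    }

    // Hypothetical first step of deserialize: return the raw event bytes, or
    // fail with a message that points at the table's storage format.
    // A real SerDe would more likely throw SerDeException here.
    static byte[] deserialize(Writable blob) {
        if (!(blob instanceof BytesWritable)) {
            String actual = (blob == null) ? "null" : blob.getClass().getSimpleName();
            throw new IllegalArgumentException(
                "Expected BytesWritable but got " + actual
                + "; check the table's STORED AS clause");
        }
        return ((BytesWritable) blob).bytes;
    }

    public static void main(String[] args) {
        byte[] event = "lwes-event".getBytes(StandardCharsets.UTF_8);

        // What a SEQUENCEFILE table would hand to the SerDe: accepted.
        System.out.println(deserialize(new BytesWritable(event)).length);

        // What a TextFile table would hand to the SerDe: rejected with a hint.
        try {
            deserialize(new Text("lwes-event"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check does not fix anything by itself; it only turns a confusing failure
in the mapper into a message that names the storage format as the likely cause.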

Please give it a try.

Zheng

-- 
Yours,
Zheng