Posted to user@hive.apache.org by Hans Uhlig <hu...@uhlisys.com> on 2012/01/22 19:23:23 UTC

Example using Binary SerDe

I am attempting to use LazyBinarySerDe to read SequenceFiles output by a
MapReduce job. Is there an example of how the data needs to be packed by
the final reduce, and of how the tables are set up so they can read the output?

Re: Example using Binary SerDe

Posted by Aniket Mokashi <an...@gmail.com>.
Does that mean you would like to read the POJO objects using Hive? Is your
POJO a custom Writable?
As I understand it, LazyBinarySerDe is a SerDe that converts a BytesWritable
into columns: your RecordReader returns a BytesWritable, and the SerDe,
together with an ObjectInspector, converts it into typed columns. So directly
converting these POJOs into columns would not be straightforward.

In my opinion, writing a SerDe for this case would also be quite tough (but
doable). You would need your own record reader (InputFormat) and then a
SerDe of your own to inspect the objects.

If you control the way you store your POJO, you may want to pass it through
the SerDe to create a BytesWritable before storing it. That would make the
problem much simpler.
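
For example, the job that produces the files could pack each row like this
(same made-up schema as above; a sketch, not tested code):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.serde2.SerDeException;
import org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.BytesWritable;

public class LazyBinaryWriteSketch {

    private final LazyBinarySerDe serde = new LazyBinarySerDe();

    // Standard struct ObjectInspector describing one output row
    // (hypothetical schema: name string, cnt bigint).
    private final ObjectInspector rowOI =
        ObjectInspectorFactory.getStandardStructObjectInspector(
            Arrays.asList("name", "cnt"),
            Arrays.<ObjectInspector>asList(
                PrimitiveObjectInspectorFactory.javaStringObjectInspector,
                PrimitiveObjectInspectorFactory.javaLongObjectInspector));

    public LazyBinaryWriteSketch() throws SerDeException {
        Properties tbl = new Properties();
        tbl.setProperty("columns", "name,cnt");
        tbl.setProperty("columns.types", "string:bigint");
        serde.initialize(new Configuration(), tbl);
    }

    // Copy the POJO's fields into a plain list and let the SerDe pack
    // them; the reducer writes the returned BytesWritable as the value.
    public BytesWritable pack(String name, long cnt) throws SerDeException {
        List<Object> row = new ArrayList<Object>();
        row.add(name);
        row.add(cnt);
        return (BytesWritable) serde.serialize(row, rowOI);
    }
}

The key you emit alongside it should not matter, as Hive only looks at the
SequenceFile values.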

Thanks,
Aniket


-- 
"...:::Aniket:::... Quetzalco@tl"

Re: Example using Binary SerDe

Posted by Hans Uhlig <hu...@uhlisys.com>.
Hi Aniket,

I am looking to run some data through a MapReduce job, and I want the output
SequenceFiles to be compatible with block-compressed, partitioned
LazyBinarySerDe tables, so I can map external tables onto the output. The
current job uses a POJO that implements Writable to serialize to disk. That
is easy to read back in for MapReduce, but I am not sure how to read it with
Hive. Do I need to define it as a struct, or just as normal fields with the
row format set to LazyBinarySerDe?
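
For concreteness, this is roughly the driver setup I have in mind, with my
guess at the matching table definition in a comment (paths, names, and the
codec are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class PackForHiveDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "pack-for-hive");

        // Hive reads only the value side of a SequenceFile record, so the
        // reducer emits a throwaway key and the SerDe-packed row as value.
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(BytesWritable.class);

        // Block-compressed SequenceFile output.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
        SequenceFileOutputFormat.setOutputCompressionType(job,
            SequenceFile.CompressionType.BLOCK);

        // Write into a directory laid out like a partition, e.g.
        // /warehouse/mytable/dt=2012-01-22, and point the table at it:
        //
        //   CREATE EXTERNAL TABLE mytable (name STRING, cnt BIGINT)
        //   PARTITIONED BY (dt STRING)
        //   ROW FORMAT SERDE
        //     'org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe'
        //   STORED AS SEQUENCEFILE
        //   LOCATION '/warehouse/mytable';
        //   ALTER TABLE mytable ADD PARTITION (dt='2012-01-22');
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}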


Re: Example using Binary SerDe

Posted by Aniket Mokashi <an...@gmail.com>.
Hi Hans,

Can you please elaborate on the use case? Is your data already in a binary
format readable by LazyBinarySerDe (i.e., could you mount a table with that
SerDe in Hive and query it as-is)?
OR
are you trying to write data using MapReduce (Java) into a location that can
then be read by a table declared to use LazyBinarySerDe?

Please elaborate more.

Thanks,
Aniket

-- 
"...:::Aniket:::... Quetzalco@tl"