Posted to user@hive.apache.org by Rui Martins <ru...@ruibm.com> on 2013/10/14 14:16:54 UTC
Custom SerDe: Initialize() passes a null configuration to my Custom SerDe
Hi hive users,
I am writing a custom SerDe that can load any protocol-buffer-generated class.
For flexibility, this class can live in a jar external to the SerDe's jar;
I then use the Hive Configuration object passed to initialize()
to load it dynamically and set the schema for the Hive table.
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
When I use my custom SerDe as a Deserializer, it all works well: I get a
Configuration and correctly load the protocol buffer class from the external
jar.
However, when I use the SerDe as a Serializer, the Configuration is always
null, so I have no way of loading the external class from the jar.
My questions are:
1) Is the initialize(..) method in Serializer supposed to always pass a
null Configuration?
2) Is there a way of creating or retrieving the current Hadoop/Hive
Configuration when this parameter is passed as null?
Thank you,
rui
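On question 2, one possible fallback is sketched below. It is only a sketch under an assumption this thread does not confirm: that the external jar is already visible to the thread context class loader (for example because it was added to the session class path). LoaderFallback, pick and load are hypothetical names. In a real SerDe the fromConf argument would come from the Configuration's getClassLoader() when the Configuration is non-null, and would be null otherwise.

```java
public class LoaderFallback {
    /** Returns the given loader, or a fallback when it is null. */
    public static ClassLoader pick(ClassLoader fromConf) {
        if (fromConf != null) {
            return fromConf;
        }
        // Fallback 1: the thread context class loader.
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        // Fallback 2: the loader that loaded this class.
        return ctx != null ? ctx : LoaderFallback.class.getClassLoader();
    }

    /** Loads className with the chosen loader; wraps the checked exception. */
    public static Class<?> load(String className, ClassLoader fromConf) {
        try {
            return Class.forName(className, true, pick(fromConf));
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Simulates the Serializer case: no Configuration, so no loader from it.
        Class<?> c = load("java.util.ArrayList", null);
        System.out.println(c.getName());
    }
}
```

The point of the design is that the class lookup never dereferences the Configuration directly, so the same code path serves both the Deserializer case (loader supplied) and the Serializer case (loader missing).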
Re: Custom SerDe: Initialize() passes a null configuration to my
Custom SerDe
Posted by Hari Subramaniyan <hs...@hortonworks.com>.
Please see
https://svn.apache.org/repos/asf/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java
https://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/
to see how a non-null conf is passed to initialize()
Thanks
Hari
On Mon, Oct 14, 2013 at 6:29 PM, Yin Huai <hu...@gmail.com> wrote:
> Can you try to set serde properties?
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties
>
> I have not tried it, but seems it is the right way to pass configurations
> to serde class.
>
> Thanks,
>
> Yin
>
>
> On Mon, Oct 14, 2013 at 8:20 AM, Rui Martins <ru...@ruibm.com> wrote:
>
> > +dev hive mailing list that I should've mailed in the first place.
> >
> > (apologies for the spam)
> >
> >
> > On Mon, Oct 14, 2013 at 1:16 PM, Rui Martins <ru...@ruibm.com> wrote:
> >
> >> Hi hive users,
> >>
> >> I am writing a custom SerDe that loads any protocol buffer generated
> >> class.
> >> For flexibility this class can live in a jar external to the SerDe's jar
> >> and then I just use the Hive Configuration class passed in the initialize
> >> to dynamically load it and set the schema for the Hive table.
> >>
> >> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
> >>
> >> When I use my custom SerDe as a Deserializer it all works well, I get a
> >> Configuration and I correctly load the ProtoBuffer class from the external
> >> Jar.
> >>
> >> However, when I use the SerDe as a Serializer, the Configuration is
> >> always set to null so I have no way of loading the external class from the
> >> Jar.
> >>
> >> My questions are:
> >>
> >> 1) Is the initialize(..) method in Serializer supposed to always pass
> >> a null Configuration?
> >>
> >> 2) Is there a way of creating or retrieving the current Hadoop/Hive
> >> Configuration when this parameter is passed as null?
> >>
> >> Thank you,
> >> rui
> >>
> >
> >
>
Re: Custom SerDe: Initialize() passes a null configuration to my
Custom SerDe
Posted by Yin Huai <hu...@gmail.com>.
Can you try to set serde properties?
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties
I have not tried it, but it seems to be the right way to pass configuration
to a SerDe class.
Thanks,
Yin
On Mon, Oct 14, 2013 at 8:20 AM, Rui Martins <ru...@ruibm.com> wrote:
> +dev hive mailing list that I should've mailed in the first place.
>
> (apologies for the spam)
>
>
> On Mon, Oct 14, 2013 at 1:16 PM, Rui Martins <ru...@ruibm.com> wrote:
>
>> Hi hive users,
>>
>> I am writing a custom SerDe that loads any protocol buffer generated
>> class.
>> For flexibility this class can live in a jar external to the SerDe's jar
>> and then I just use the Hive Configuration class passed in the initialize
>> to dynamically load it and set the schema for the Hive table.
>>
>>
>> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
>>
>> When I use my custom SerDe as a Deserializer it all works well, I get a
>> Configuration and I correctly load the ProtoBuffer class from the external
>> Jar.
>>
>> However, when I use the SerDe as a Serializer, the Configuration is
>> always set to null so I have no way of loading the external class from the
>> Jar.
>>
>> My questions are:
>>
>> 1) Is the initialize(..) method in Serializer supposed to always pass
>> a null Configuration?
>>
>> 2) Is there a way of creating or retrieving the current Hadoop/Hive
>> Configuration when this parameter is passed as null?
>>
>> Thank you,
>> rui
>>
>
>
Re: Custom SerDe: Initialize() passes a null configuration to my
Custom SerDe
Posted by Rui Martins <ru...@ruibm.com>.
+dev hive mailing list that I should've mailed in the first place.
(apologies for the spam)
On Mon, Oct 14, 2013 at 1:16 PM, Rui Martins <ru...@ruibm.com> wrote:
> Hi hive users,
>
> I am writing a custom SerDe that loads any protocol buffer generated
> class.
> For flexibility this class can live in a jar external to the SerDe's jar
> and then I just use the Hive Configuration class passed in the initialize
> to dynamically load it and set the schema for the Hive table.
>
>
> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
>
> When I use my custom SerDe as a Deserializer it all works well, I get a
> Configuration and I correctly load the ProtoBuffer class from the external
> Jar.
>
> However, when I use the SerDe as a Serializer, the Configuration is always
> set to null so I have no way of loading the external class from the Jar.
>
> My questions are:
>
> 1) Is the initialize(..) method in Serializer supposed to always pass
> a null Configuration?
>
> 2) Is there a way of creating or retrieving the current Hadoop/Hive
> Configuration when this parameter is passed as null?
>
> Thank you,
> rui
>
Re: Custom SerDe: Initialize() passes a null configuration to my
Custom SerDe
Posted by Rui Martins <ru...@ruibm.com>.
Thank you all for the lightning-fast replies.
@Yin: I am already passing SerDe properties and they work fine. In fact,
that is how I pass the fully qualified name of the class I want to use.
I receive the Properties object fine; it's the Configuration one that
is null. :(
@Hari: Perfect, I think this is exactly what I need.
@Edward: Thanks for the suggestion. I have seen that, and I think Twitter
also has one called Elephant Bird, but neither supports all the
functionality I need. I've implemented support for both Deserializer and
Serializer for Protocol Buffers (the former works flawlessly; the latter is
the one with the problem). My implementation also supports other data
formats (apart from Protocol Buffers), and I have also implemented new
input and output file formats. Hopefully I'll be able to open-source this
soon.
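The SerDe-properties route mentioned in the reply to Yin can be sketched as below. This is a minimal illustration, not the code from this thread: the property key protobuf.class, the class SerDeProps and the example class name are all made up, since the message does not say which key is actually used. In a table definition such a key would typically be set through WITH SERDEPROPERTIES.

```java
import java.util.Properties;

public class SerDeProps {
    // Hypothetical property key; the thread does not say which key is used.
    public static final String CLASS_KEY = "protobuf.class";

    /** Returns the trimmed class name, or throws if the property is absent. */
    public static String requireClassName(Properties tbl) {
        String name = tbl.getProperty(CLASS_KEY);
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing SerDe property: " + CLASS_KEY);
        }
        return name.trim();
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        // Made-up class name, standing in for a protobuf-generated class.
        tbl.setProperty(CLASS_KEY, "com.example.proto.Person");
        System.out.println(requireClassName(tbl));
    }
}
```

Failing fast on a missing key keeps the error close to the table definition rather than surfacing later as a class-loading failure.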
On Mon, Oct 14, 2013 at 6:06 PM, Edward Capriolo <ed...@gmail.com> wrote:
> Have you seen?
>
> https://github.com/edwardcapriolo/hive-protobuf/
>
>
> On Mon, Oct 14, 2013 at 8:16 AM, Rui Martins <ru...@ruibm.com> wrote:
>
>> Hi hive users,
>>
>> I am writing a custom SerDe that loads any protocol buffer generated
>> class.
>> For flexibility this class can live in a jar external to the SerDe's jar
>> and then I just use the Hive Configuration class passed in the initialize
>> to dynamically load it and set the schema for the Hive table.
>>
>>
>> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
>>
>> When I use my custom SerDe as a Deserializer it all works well, I get a
>> Configuration and I correctly load the ProtoBuffer class from the external
>> Jar.
>>
>> However, when I use the SerDe as a Serializer, the Configuration is
>> always set to null so I have no way of loading the external class from the
>> Jar.
>>
>> My questions are:
>>
>> 1) Is the initialize(..) method in Serializer supposed to always pass
>> a null Configuration?
>>
>> 2) Is there a way of creating or retrieving the current Hadoop/Hive
>> Configuration when this parameter is passed as null?
>>
>> Thank you,
>> rui
>>
>
>
Re: Custom SerDe: Initialize() passes a null configuration to my
Custom SerDe
Posted by Edward Capriolo <ed...@gmail.com>.
Have you seen?
https://github.com/edwardcapriolo/hive-protobuf/
On Mon, Oct 14, 2013 at 8:16 AM, Rui Martins <ru...@ruibm.com> wrote:
> Hi hive users,
>
> I am writing a custom SerDe that loads any protocol buffer generated
> class.
> For flexibility this class can live in a jar external to the SerDe's jar
> and then I just use the Hive Configuration class passed in the initialize
> to dynamically load it and set the schema for the Hive table.
>
>
> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop.hive/hive-serde/0.7.0-cdh3u0/org/apache/hadoop/hive/serde2/Serializer.java#Serializer
>
> When I use my custom SerDe as a Deserializer it all works well, I get a
> Configuration and I correctly load the ProtoBuffer class from the external
> Jar.
>
> However, when I use the SerDe as a Serializer, the Configuration is always
> set to null so I have no way of loading the external class from the Jar.
>
> My questions are:
>
> 1) Is the initialize(..) method in Serializer supposed to always pass
> a null Configuration?
>
> 2) Is there a way of creating or retrieving the current Hadoop/Hive
> Configuration when this parameter is passed as null?
>
> Thank you,
> rui
>