Posted to dev@hive.apache.org by Eugene Koifman <ek...@hortonworks.com> on 2014/05/06 03:31:46 UTC

Make DESCRIBE return info from SerDe?
Hi,
is it possible to write a SerDe (or something else) so that DESCRIBE
<table> (as well as the query parser) gets the column definitions from the SerDe
rather than the Metastore?

The idea is to support self-describing data formats such as JSON, so that
when the per-partition schema changes, there is no need to run an ALTER
TABLE to access new fields.

For example, partition 1 of table T has JSON docs with fields 'a', 'b', 'c'
and the table was created with schema that has 3 columns.
Let's say later a partition is added where the docs have fields 'a', 'b',
'c', 'd'.

Is it possible to then run a query 'SELECT T.d FROM T' without ALTER TABLE
(assuming the SerDe is smart enough to produce NULL for rows that don't have
this field)?
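
The SerDe-side assumption in that question (NULL for rows that lack the
field) can be sketched in plain Java. This is only an illustration under
stated assumptions: it does not use Hive's SerDe interfaces, the names
NullFillingProjection and project are hypothetical, and the JSON document is
assumed to be already parsed into a Map. It says nothing about whether the
query parser would accept T.d without the Metastore knowing about the column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive: projects an already-parsed JSON
// document onto a fixed list of column names.
final class NullFillingProjection {

    // Returns one value per requested column, in column order. A column that
    // the document does not contain yields null.
    static List<Object> project(List<String> columns, Map<String, Object> doc) {
        List<Object> row = new ArrayList<>(columns.size());
        for (String column : columns) {
            row.add(doc.get(column)); // Map.get returns null for an absent key
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("a", "b", "c", "d");

        // Document shaped like partition 1: fields a, b, c only.
        Map<String, Object> oldDoc = new HashMap<>();
        oldDoc.put("a", 1);
        oldDoc.put("b", 2);
        oldDoc.put("c", 3);

        // Document shaped like the later partition: field d added.
        Map<String, Object> newDoc = new HashMap<>(oldDoc);
        newDoc.put("d", 4);

        System.out.println(project(columns, oldDoc));
        System.out.println(project(columns, newDoc));
    }
}
```

With this shape, rows from the older partition simply carry null in the
position of 'd', so readers of both partitions see the same column list.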

Is there a mechanism for that?

Thank you,
Eugene

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Make DESCRIBE return info from SerDe?
Posted by Eugene Koifman <ek...@hortonworks.com>.
thanks


On Mon, May 5, 2014 at 8:02 PM, Edward Capriolo <ed...@gmail.com> wrote:

> My protobuf serde
> https://github.com/edwardcapriolo/hive-protobuf/ extracts the schema
> from a class name.
>
> The avro serde does similar things.


Re: Make DESCRIBE return info from SerDe?
Posted by Edward Capriolo <ed...@gmail.com>.
My protobuf serde
https://github.com/edwardcapriolo/hive-protobuf/ extracts the schema
from a class name.

The avro serde does similar things.
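
As a rough illustration of "extracts the schema from a class name", plain
Java reflection can list a class's fields as column names and types. This is
a sketch under assumptions, not the code of hive-protobuf or of the Avro
SerDe: ClassNameSchema, columnsOf and the Event class are hypothetical, and a
real SerDe would map Java types to Hive types rather than print simple names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: derive a column list from a class name via reflection.
final class ClassNameSchema {

    // Hypothetical record type standing in for a generated message class.
    static final class Event {
        String a;
        int b;
        double c;
    }

    // Loads the named class and returns columnName -> Java type name for its
    // instance fields. The JDK does not guarantee the order in which
    // getDeclaredFields() returns fields, so callers should not rely on it.
    static Map<String, String> columnsOf(String className) {
        Class<?> type;
        try {
            type = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown class: " + className, e);
        }
        Map<String, String> columns = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip static and compiler-generated fields; they are not row data.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            columns.put(field.getName(), field.getType().getSimpleName());
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(columnsOf(Event.class.getName()));
    }
}
```

The point of the design is that the table definition only needs to carry the
class name; adding a field to the class changes the derived column list
without editing the column definitions by hand.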




On Mon, May 5, 2014 at 9:31 PM, Eugene Koifman <ek...@hortonworks.com> wrote:

> Hi,
> is it possible to write a SerDe (or something else) so that DESCRIBE
> <table> (as well as the query parser) gets the column definitions from the SerDe
> rather than the Metastore?
>
> The idea is to support self-describing data formats such as JSON, so that
> when the per-partition schema changes, there is no need to run an ALTER
> TABLE to access new fields.
>
> For example, partition 1 of table T has JSON docs with fields 'a', 'b', 'c'
> and the table was created with schema that has 3 columns.
> Let's say later a partition is added where the docs have fields 'a', 'b',
> 'c', 'd'.
>
> Is it possible to then run a query 'SELECT T.d FROM T' without ALTER TABLE
> (assuming the SerDe is smart enough to produce NULL for rows that don't have
> this field)?
>
> Is there a mechanism for that?
>
> Thank you,
> Eugene