Posted to dev@hive.apache.org by Stephen Scaffid <ss...@tripadvisor.com> on 2012/07/27 01:17:31 UTC

Custom SerDe and getting column comments

I have a custom SerDe we are using and it works well. However, we have one issue with it: an application that lets users maintain table and column descriptions does not work with it.

We store the descriptions as "comments" on the tables and columns. A Python script executes the HQL needed to set each comment, then uses the Hive class to read it back from the metadata.

The problem is that any comment we try to set ends up as "from deserializer" on every table that uses our custom SerDe. This doesn't happen with tables that use LazySimpleSerDe.

Is there a way to make our SerDe work with comments?

Re: Custom SerDe and getting column comments

Posted by Stephen Scaffid <ss...@tripadvisor.com>.
On Jul 26, 2012, at 9:05 PM, Travis Crawford wrote:

> Currently you can't use comments with custom SerDes. Look into
> MetaStoreUtils.getFieldsFromDeserializer and you'll see "from
> deserializer" is hard-coded as the comment.

I found that. I also found that org.apache.hadoop.hive.ql.metadata.Table.getCols() decides whether to take the column info from the table SD or from the SerDe, based on the result of SerDeUtils.shouldGetColsFromSerDe(), but I don't understand why.
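
For illustration only, the branch described above can be modelled in a few lines of Python. This is a simplified sketch, not Hive source code: NATIVE_SERDES, should_get_cols_from_serde and get_cols are hypothetical stand-ins, and the rule "built-in SerDes read columns from the metastore, everything else asks the SerDe" is an assumption that the thread does not confirm.

```python
# Simplified model of the branch in Table.getCols(); not Hive source code.
# NATIVE_SERDES and both functions are hypothetical stand-ins.
NATIVE_SERDES = {"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}


def should_get_cols_from_serde(serde_name):
    # Assumption: columns come from the SerDe whenever it is not built in.
    return serde_name is not None and serde_name not in NATIVE_SERDES


def get_cols(serde_name, metastore_cols, serde_cols):
    """Return (name, type, comment) tuples from whichever source the branch picks."""
    if should_get_cols_from_serde(serde_name):
        return serde_cols      # comments are whatever the SerDe path reports
    return metastore_cols      # comments are the ones stored in the metastore


stored = [("id", "int", "primary key")]
reported = [("id", "int", "from deserializer")]

custom = get_cols("com.example.CustomSerDe", stored, reported)
builtin = get_cols("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", stored, reported)
print(custom)
print(builtin)
```

Under that assumption, a table with a custom SerDe never consults the comments kept in the table SD, which would match the behaviour reported in this thread.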

Is there some way I can get the comment info from the metadata into the ObjectInspectors used by my SerDe? Would that be a bad idea?

> This is definitely an area for improvement.
> 
> Is your SerDe reporting columns, or do you have them stored in the
> metastore? If you "describe extended" your table, do you see the
> comments stored in the metastore?

Indeed, the comments show up there, but not in the Python script's output. I'll look more closely at that code.



Re: Custom SerDe and getting column comments

Posted by Travis Crawford <tr...@gmail.com>.
Currently you can't use comments with custom SerDes. Look into
MetaStoreUtils.getFieldsFromDeserializer and you'll see "from
deserializer" is hard-coded as the comment.
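
As a rough Python model of that behaviour (a sketch for illustration, not the Java code in MetaStoreUtils): the FieldSchema tuple, the PLACEHOLDER constant and fields_from_deserializer below are hypothetical names, and the input is assumed to be the (name, type) pairs a SerDe reports.

```python
from collections import namedtuple

# Hypothetical stand-in for the metastore's field record.
FieldSchema = namedtuple("FieldSchema", ["name", "type", "comment"])

# The literal string quoted in this thread.
PLACEHOLDER = "from deserializer"


def fields_from_deserializer(struct_fields):
    """Build field records from (name, type_name) pairs reported by a SerDe.

    The SerDe supplies names and types only, so every record gets the same
    fixed comment, regardless of what the user stored in the metastore.
    """
    return [FieldSchema(name, type_name, PLACEHOLDER)
            for name, type_name in struct_fields]


fields = fields_from_deserializer([("id", "int"), ("city", "string")])
for field in fields:
    print(field.name, field.type, field.comment)
```

Since no comment is passed in, nothing on this path can return a user-supplied comment.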

This is definitely an area for improvement.

Is your SerDe reporting columns, or do you have them stored in the
metastore? If you "describe extended" your table, do you see the
comments stored in the metastore?

--travis


