Posted to user@hive.apache.org by ravi teja <ra...@gmail.com> on 2015/08/25 17:32:58 UTC

Repair table doesn't update the transient_lastDdlTime of updated partitions.

Hi,

I am working towards an incremental solution on Hive based on the
transient_lastDdlTime of the partitions.
If the we in

Thanks,
Ravi

Re: Repair table doesn't update the transient_lastDdlTime of updated partitions.

Posted by ravi teja <ra...@gmail.com>.
Thanks a lot Noam, you are a saviour!

Ravi


Re: Repair table doesn't update the transient_lastDdlTime of updated partitions.

Posted by Noam Hasson <no...@kenshoo.com>.
Hi,

Check if this helps you:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionTouch

Noam.
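
For readers following the link: it points to the ALTER TABLE ... TOUCH
PARTITION statement, which reads a partition's metadata and writes it back.
Since the ingestion job described in this thread already knows which
partitions it changed, one possible approach is to generate a TOUCH
statement per partition. The sketch below is only an illustration; the
table name, partition keys, and the helper touch_statements are
hypothetical and not part of Hive.

```python
def touch_statements(table, partitions):
    """Build one ALTER TABLE ... TOUCH PARTITION statement per partition spec.

    `table` is a table name and `partitions` is a list of dicts mapping
    partition column names to values (both are assumptions for this sketch).
    """
    statements = []
    for spec in partitions:
        # Render the spec as key='value' pairs, escaping single quotes.
        rendered = ", ".join(
            "{}='{}'".format(key, str(value).replace("'", "\\'"))
            for key, value in spec.items()
        )
        statements.append(
            "ALTER TABLE {} TOUCH PARTITION ({});".format(table, rendered)
        )
    return statements


# Hypothetical partition list produced by a file-based ingestion run.
changed = [
    {"dt": "2015-08-24", "region": "us"},
    {"dt": "2015-08-25", "region": "eu"},
]
for statement in touch_statements("sales_db.events", changed):
    print(statement)
```

The generated statements could then be run through whatever Hive client the
ingestion job already uses; it is worth confirming on your Hive version
that transient_lastDdlTime actually changes after the TOUCH.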


-- 
This e-mail, as well as any attached document, may contain material which 
is confidential and privileged and may include trademark, copyright and 
other intellectual property rights that are proprietary to Kenshoo Ltd, 
 its subsidiaries or affiliates ("Kenshoo"). This e-mail and its 
attachments may be read, copied and used only by the addressee for the 
purpose(s) for which it was disclosed herein. If you have received it in 
error, please destroy the message and any attachment, and contact us 
immediately. If you are not the intended recipient, be aware that any 
review, reliance, disclosure, copying, distribution or use of the contents 
of this message without Kenshoo's express permission is strictly prohibited.

Re: Repair table doesn't update the transient_lastDdlTime of updated partitions.

Posted by ravi teja <ra...@gmail.com>.
Sorry for the incomplete mail; it was sent by mistake.

I am working towards an incremental solution on Hive based on the
transient_lastDdlTime of the partitions.
We mostly deal with Hive external tables.

The transient_lastDdlTime of a partition gets updated when data is inserted
into the table through an INSERT query, so we are good there.

The issue is that when a file-level update happens in the partition
folder, Hive doesn't update transient_lastDdlTime for that partition,
so we are not able to get the list of changed partitions.


Unfortunately we can't change the way the Hive table is being updated; it
relies on file-based updates to the underlying location.
When we do a file-based ingestion, we have the complete list of
partitions updated.
But this list cannot be passed to the incremental system, hence our source
of truth is the Hive metastore and its transient_lastDdlTime.

Is there a way I can update the transient_lastDdlTime in the metastore for
the partitions changed by adding files?
I have tried to re-add the changed partitions to the table so that the
transient_lastDdlTime will change, but that is not possible, as it throws
an already-exists exception.

Is there any other way?
Thanks in advance.

Thanks,
Ravi
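
To make the incremental idea above concrete, here is a minimal sketch of
how a downstream job might pick changed partitions once it has each
partition's parameters in hand. It assumes transient_lastDdlTime is stored
as a string of epoch seconds; the function changed_since, the sample
partition names, and the checkpoint value are hypothetical, and how the
parameters are fetched from the metastore is left out.

```python
def changed_since(partition_params, checkpoint):
    """Return names of partitions whose transient_lastDdlTime is newer
    than `checkpoint` (epoch seconds), sorted by name.

    `partition_params` maps a partition name to its parameter dict, as it
    might be read from the metastore (an assumption for this sketch).
    """
    changed = []
    for name, params in partition_params.items():
        raw = params.get("transient_lastDdlTime")
        if raw is None:
            # No timestamp recorded; skip rather than guess.
            continue
        if int(raw) > checkpoint:
            changed.append(name)
    return sorted(changed)


# Hypothetical sample data: two partitions, one modified after the checkpoint.
sample = {
    "dt=2015-08-24": {"transient_lastDdlTime": "1440400000"},
    "dt=2015-08-25": {"transient_lastDdlTime": "1440520000"},
}
last_run = 1440500000
print(changed_since(sample, last_run))
```

This only works if every change path refreshes the timestamp, which is
exactly the gap the file-based updates in this question expose.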
