Posted to issues@hive.apache.org by "Mithun Radhakrishnan (JIRA)" <ji...@apache.org> on 2019/06/14 20:34:00 UTC
[jira] [Commented] (HIVE-21877) Change HCatTableInfo to not be transient in PartInfo
[ https://issues.apache.org/jira/browse/HIVE-21877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864420#comment-16864420 ]
Mithun Radhakrishnan commented on HIVE-21877:
---------------------------------------------
Pasting your question from the PR here:
{quote}
While using Hcatalog with Apache Beam, we ran into an issue with HCatTableInfo being null during serialization. I don't see a reason why it should be transient. However, there might be use-cases that I may not be aware of and might require it to be transient. Would love to hear some feedback regardless.
{quote}
This has to do with HIVE-9845. It would not be a good idea to make {{HCatTableInfo}} non-transient. Doing so would make Pig/HCatLoader, as well as {{HCatInputFormat}}, inefficient for large partition sets.
{{HCatTableInfo}} contains table information that is static for all partitions within a partition set for a given table. {{PartInfo}} is the variable part. Serializing the table information once per partition increases the split meta-info for a Hadoop job to unreasonable lengths.
I would advise perusing the HCat code to see how {{HCatTableInfo}} is restored after deserialization.
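For illustration only, here is a minimal sketch of the general pattern: a per-partition object keeps its table-level field {{transient}}, and an enclosing container re-attaches it after deserialization. This is not the HCat code. The class names ({{TableInfo}}, {{PartitionInfo}}, {{JobInfo}}), the fields, and the example path are hypothetical stand-ins, and the actual restore logic in HCat may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientRestoreSketch {

    // Hypothetical stand-in for table-level metadata shared by every partition.
    static class TableInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String tableName;
        TableInfo(String tableName) { this.tableName = tableName; }
    }

    // Hypothetical stand-in for per-partition metadata.
    static class PartitionInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String location;
        // transient: Java serialization skips this field, so it is null
        // after deserialization unless something re-attaches it.
        transient TableInfo tableInfo;
        PartitionInfo(String location, TableInfo tableInfo) {
            this.location = location;
            this.tableInfo = tableInfo;
        }
    }

    // Hypothetical container that owns the single serialized copy of the table info.
    static class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final TableInfo tableInfo;
        final List<PartitionInfo> partitions = new ArrayList<>();
        JobInfo(TableInfo tableInfo) { this.tableInfo = tableInfo; }

        // Custom deserialization hook: read the default fields, then
        // re-attach the shared table info to every partition.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            for (PartitionInfo p : partitions) {
                p.tableInfo = tableInfo;
            }
        }
    }

    // Serialize an object to bytes and read it back.
    static Object roundTrip(Serializable obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TableInfo table = new TableInfo("sales");
        PartitionInfo part = new PartitionInfo("/warehouse/sales/dt=2019-06-14", table);

        // Serialized on its own, the partition loses its transient field.
        PartitionInfo alone = (PartitionInfo) roundTrip(part);
        System.out.println("standalone tableInfo is null: " + (alone.tableInfo == null));

        // Serialized through the container, the field is restored in readObject.
        JobInfo job = new JobInfo(table);
        job.partitions.add(part);
        JobInfo restored = (JobInfo) roundTrip(job);
        System.out.println("restored table: " + restored.partitions.get(0).tableInfo.tableName);
    }
}
```

In this sketch, a partition object that is serialized outside its container comes back with a null table field, which matches the symptom described in the question. A caller in that position would need to re-attach the table information itself rather than rely on the field surviving serialization.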
> Change HCatTableInfo to not be transient in PartInfo
> ----------------------------------------------------
>
> Key: HIVE-21877
> URL: https://issues.apache.org/jira/browse/HIVE-21877
> Project: Hive
> Issue Type: New Feature
> Reporter: Ankit Jhalaria
> Assignee: Ankit Jhalaria
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Since HCatTableInfo is serializable, this change removes the transient modifier from it. We were running into an NPE during serialization while using HCatalogIO with Beam.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)