Posted to user@beam.apache.org by ut...@gmail.com on 2019/02/15 12:48:39 UTC

Writing to Hive Partitions with HCatalogIO

Hi,

I have a pipeline that reads data from a partitioned Hive table, performs
some simple filtering, and writes back to a partitioned Hive table.
Reading from partitions works fine when I specify the "withFilter"
option.

However, I am unclear about how to write the PCollection back to a Hive
partition.
I'm trying to understand how ".withPartition" works.
Should the partition column value be passed in the map, and if so, how
does writing a PCollection<HCatRecord> to multiple partitions work?
Specifically, I would like to know whether I should handle the withPartition
map differently for a static Hive partition and a dynamic Hive partition.
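
To make the question concrete, here is a minimal plain-Java sketch (no Beam
calls at all) of the two cases. It assumes the map passed to withPartition is
a partition-column-to-value Map<String, String>, and it uses ordinary maps as
stand-ins for HCatRecord rows. The class name, method names, and the
"load_date" column are hypothetical, and whether HCatalogIO actually needs
records to be split like this for multiple partitions is exactly the open
question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch only: every name here is hypothetical and nothing calls Beam.
class PartitionSpecSketch {

    // Static partition: a single fixed column -> value map, which is the
    // shape assumed here for the argument of withPartition.
    static Map<String, String> staticSpec(String column, String value) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put(column, value);
        return spec;
    }

    // Multiple partitions: bucket rows by the value of the partition column,
    // so each bucket could be written with its own static spec.
    // Rows are plain maps standing in for HCatRecord.
    static Map<String, List<Map<String, String>>> groupByPartition(
            List<Map<String, String>> rows, String column) {
        Map<String, List<Map<String, String>>> buckets = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            buckets.computeIfAbsent(row.get(column), k -> new ArrayList<>()).add(row);
        }
        return buckets;
    }
}
```

With this shape, a static write would use one spec such as
staticSpec("load_date", "2019-02-15"), while the multi-partition case would
need one write per bucket unless the sink can route records to partitions
on its own.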

A code sample or example test would be very useful, since I have not
found many places where this is used. The test folder in git doesn't
provide much help either.

Beam version used - 2.9.0

Regards,
Utkarsh