Posted to user@hive.apache.org by PengHui Li <co...@gmail.com> on 2019/04/04 08:22:38 UTC
How to implement partitioned external table.
Hi guys,
I am integrating Hive and Pulsar (http://pulsar.apache.org) using
HiveStorageHandler and HiveMetaHook. I want to add a feature that divides the
data into several parts (Pulsar topics) when a table uses Hive's `PARTITIONED BY`,
but I don't know how to implement it on top of HiveStorageHandler and HiveMetaHook.
- How do I get the Hive table's partition definition?
- When a user inserts data into the Hive table, how do I determine which
partition the data should be placed in?
- When a user selects data from the Hive table, how do I determine which
partition the data is in?
Looking forward to your reply.
- Penghui Li
Re: How to implement partitioned external table.
Posted by PengHui Li <co...@gmail.com>.
@Zoltan
Thank you for your reply. I will open a new topic on the developer list.
Best regards
Penghui
Zoltan Haindrich <ki...@rxd.hu> wrote on Thu, Apr 11, 2019 at 4:21 PM:
>
>
> On 4/4/19 10:22 AM, PengHui Li wrote:
> > Hi guys,
> >
> > I am integrating hive and pulsar(http://pulsar.apache.org) by HiveStorageHandler and HiveMetaHook, I want to add a feature can divide the data
> > into several parts(pulsar topics) when use hive `PARTITIONED BY`. But don't know how to implement it based on HiveStorageHandler and HiveMetaHook.
>
> I think you should be able to access the table's properties from the
> StorageHandler (and get access to the pulsar server address/etc from there).
>
> About supporting topics: I think instead of adding some features to
> support "partitioned by"
> the storage handler could get into predicate push down...by making the
> topic a column.
> To get some ideas how to do that I would first take a look at the jdbc
> storage handler(or hbase).
>
> note: I think this topic might better fit the developer list.
>
> cheers,
> Zoltan
>
Re: How to implement partitioned external table.
Posted by Zoltan Haindrich <ki...@rxd.hu>.
On 4/4/19 10:22 AM, PengHui Li wrote:
> Hi guys,
>
> I am integrating hive and pulsar(http://pulsar.apache.org) by HiveStorageHandler and HiveMetaHook, I want to add a feature can divide the data
> into several parts(pulsar topics) when use hive `PARTITIONED BY`. But don't know how to implement it based on HiveStorageHandler and HiveMetaHook.
I think you should be able to access the table's properties from the StorageHandler (and get access to the pulsar server address/etc from there).
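To illustrate the point about table properties, here is a minimal sketch. It assumes the handler does this work inside HiveStorageHandler's configureInputJobProperties(TableDesc, Map<String, String>), where tableDesc.getProperties() returns a java.util.Properties. The sketch itself uses only the JDK so it can be run on its own; the class name and the `pulsar.` property prefix are hypothetical and not defined by Hive or Pulsar.

```java
import java.util.Map;
import java.util.Properties;

// Sketch only: stands in for logic a storage handler could run inside
// configureInputJobProperties(TableDesc, Map<String, String>), using the
// Properties object that tableDesc.getProperties() would supply.
final class PulsarTableProps {
    // Hypothetical prefix for connection settings stored as table properties.
    static final String PREFIX = "pulsar.";

    // Copies every table property whose name starts with PREFIX into the
    // job properties, leaving all other table properties untouched.
    static void copyPulsarProperties(Properties tableProperties,
                                     Map<String, String> jobProperties) {
        for (String name : tableProperties.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                jobProperties.put(name, tableProperties.getProperty(name));
            }
        }
    }
}
```

With this approach, a table created with a property such as `pulsar.service.url` (again a made-up name) would expose the server address to the input and output formats through the job configuration.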
About supporting topics: instead of adding features to support "partitioned by",
I think the storage handler could use predicate pushdown, by making the topic a column.
For ideas on how to do that, I would first take a look at the JDBC storage handler (or the HBase one).
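A rough sketch of the "topic as a column" idea, under stated assumptions: the table has a string column named `topic` (a hypothetical name), and the handler has already extracted a simple `topic IN (...)` condition from the query. In Hive the hook for receiving such a condition is, as far as I know, the decomposePredicate method of HiveStoragePredicateHandler, which the HBase handler implements; the sketch below does not use Hive classes and replaces the expression tree with a simplified stand-in type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: shows how a pushed-down condition on a "topic" column could
// limit which Pulsar topics are read. All names here are hypothetical.
final class TopicPruning {
    // Simplified stand-in for a pushed-down predicate: column IN (values).
    static final class InPredicate {
        final String column;
        final Set<String> values;

        InPredicate(String column, Set<String> values) {
            this.column = column;
            this.values = values;
        }
    }

    // Returns the topics that must be scanned. If nothing usable was pushed
    // down, every topic is scanned and Hive filters the rows afterwards.
    static List<String> topicsToScan(List<String> allTopics, InPredicate pushed) {
        if (pushed == null || !"topic".equals(pushed.column)) {
            return new ArrayList<>(allTopics);
        }
        List<String> selected = new ArrayList<>();
        for (String topic : allTopics) {
            if (pushed.values.contains(topic)) {
                selected.add(topic);
            }
        }
        return selected;
    }
}
```

The effect is similar to partition pruning: a query such as `WHERE topic = 'clicks'` would read only that topic, without the table needing `PARTITIONED BY` at all.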
Note: I think this topic might be a better fit for the developer list.
cheers,
Zoltan