Posted to user@hive.apache.org by Yang <te...@gmail.com> on 2014/07/25 01:06:15 UTC

doing upsert possible?

If we have a huge table and only 1% of it gets updated every hour,
it would be a huge waste to read the whole table through an MR job and
write out a new table.

Instead, if we store this table in HBase and use the current HBase+Hive
integration, then as long as we can do an upsert, we only need to touch
that 1% of entries, and the update can be very fast.
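
To make the idea concrete, here is a toy sketch of keyed upsert semantics in
plain Python. It does not use the HBase or Hive APIs; the dict stands in for a
table keyed by row key, and the names (upsert, table, changes, the "status"
column) are made up for illustration. The assumption it models is that writing
to an existing row key overwrites the supplied columns and writing to a new
row key creates the row, so the work is proportional to the changed rows
rather than to the table size.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, str]


def upsert(table: Dict[str, Row], changes: Iterable[Tuple[str, Row]]) -> int:
    """Apply keyed changes in place and return the number of rows touched."""
    touched = 0
    for key, columns in changes:
        # A new key gets an empty row first; then only the supplied
        # columns are overwritten, leaving other columns untouched.
        table.setdefault(key, {}).update(columns)
        touched += 1
    return touched


# Hypothetical data: 1000 existing rows, of which 10 change, plus 1 new row.
table = {f"user{i}": {"status": "old"} for i in range(1000)}
changes = [(f"user{i}", {"status": "new"}) for i in range(10)]
changes.append(("user1000", {"status": "new"}))

touched = upsert(table, changes)
print(touched, len(table))
```

Only the rows named in changes are visited; the other rows are never read or
rewritten, which is the saving being asked about.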

Re: doing upsert possible?

Posted by Juan Martin Pampliega <jp...@gmail.com>.
Hi Yang. That's correct. You should check out the HBase UDFs in Klout's
Brickhouse library:
https://github.com/klout/brickhouse/tree/master/src/main/java/brickhouse/hbase