Posted to user@pig.apache.org by Norbert Burger <no...@gmail.com> on 2011/09/01 03:55:43 UTC

manipulating HBaseStorage map outside of a UDF?

I'm using HBaseStorage to load a large column family (many columns)
into a relation, which produces a map[] for each row.  The maps are
wide and sparse (only a few keys exist on each row), and I'd ideally
like to GROUP all maps together by similar columns before passing them
off to a UDF for further processing.

Is this possible?  I'd be fine with converting to bags first, but it
seems TOBAG() just adds an extra bag layer on top of the map.

Failing that, is there any way to manipulate these kinds of relations
in Pig without explicitly specifying each map key?
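
To make the intent concrete, here is a rough sketch in plain Python (not
Pig) of one possible reading of the transformation described above:
flatten each sparse map into (row key, column, value) tuples, then group
by column name without naming any key in advance.  The sample rows and
the function names flatten_maps and group_by_column are hypothetical,
chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (row key, sparse map of column -> value),
# standing in for what a column-family load yields per row.
rows = [
    ("row1", {"a": "1", "c": "3"}),
    ("row2", {"a": "5"}),
    ("row3", {"b": "7", "c": "9"}),
]


def flatten_maps(rows):
    """Yield one (row key, column, value) tuple per map entry."""
    for rowkey, columns in rows:
        for column, value in columns.items():
            yield rowkey, column, value


def group_by_column(rows):
    """Group the flattened tuples by column name; no keys are hard-coded."""
    grouped = defaultdict(list)
    for rowkey, column, value in flatten_maps(rows):
        grouped[column].append((rowkey, value))
    return dict(grouped)


if __name__ == "__main__":
    for column, entries in sorted(group_by_column(rows).items()):
        print(column, entries)
```

The point is that the map keys are discovered at run time, which is the
behavior being asked about for Pig.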

Norbert