Posted to user@hive.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/05/22 23:58:38 UTC

RCFile and UDF

I am new to Hive. I am currently trying out a use case in which we
write XML files into a sequence file. We then read the sequence file and
convert it into a more structured row/column format using a Pig UDF. The
output is currently stored with Snappy compression.

Now I want to use Hive to query the data and do a self join. My
problem is that the file I need to query is in Snappy format, and Hive
deserializes the entire row, which I am trying to avoid. Is there a way to
store the file in RCFile format when I write it from Pig?