Posted to user@hive.apache.org by Philip Tromans <ph...@gmail.com> on 2012/01/23 20:05:49 UTC

RCFile and LazyBinarySerDe

Hi all,

I'm having a problem when trying to insert into a table that has ROW FORMAT
SERDE 'org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe' and is
STORED AS RCFILE. The exception:

java.lang.UnsupportedOperationException: Currently the writer can only
accept BytesRefArrayWritable
        at org.apache.hadoop.hive.ql.io.RCFile$Writer.append(RCFile.java:863)
        at org.apache.hadoop.hive.ql.io.RCFileOutputFormat$2.write(RCFileOutputFormat.java:140)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:589)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)

is being thrown. The column types in the table are TIMESTAMP, BINARY,
INT, BOOLEAN, DOUBLE, STRING and TINYINT. As far as I can tell, RCFile
is storing the data in a text format, whereas I'd like it stored as
binary so that it's faster to read and more compact. What is causing
this to break, and how can I work around the problem?
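
For reference, the table definition that triggers this looks roughly
like the following (the table and column names here are made up; only
the SERDE and STORED AS clauses match my real table):

```sql
-- Hypothetical reproduction of the failing setup: a binary SerDe
-- combined with RCFile storage.
CREATE TABLE example_events (
  event_time  TIMESTAMP,
  payload     BINARY,
  user_id     INT,
  is_active   BOOLEAN,
  score       DOUBLE,
  label       STRING,
  flags       TINYINT
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe'
STORED AS RCFILE;

-- Any INSERT into the table then fails with the
-- UnsupportedOperationException shown above.
```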

Cheers,

Phil.