Posted to user@hive.apache.org by Bryan Jeffrey <br...@gmail.com> on 2013/12/17 01:30:23 UTC

Hive - Issue Converting Text to Orc

Hello.

Running the following version of Hadoop: hadoop-2.2.0
Running the following version of Hive: hive-0.12.0

I have a simple test system set up with two datanode/node manager hosts and
one namenode/resource manager host.  Hive is running on the namenode and
uses a MySQL database for the metastore.

I have created a small table 'from_text' as follows:

[server:10001] hive> describe from_text;
foo                     int                     None
bar                     int                     None
boo                     string                  None


[server:10001] hive> select * from from_text;
1       2       Hello
2       3       World

I then insert the data into my Orc table, 'orc_test':

[server:10001] hive> describe orc_test;
foo                     int                     from deserializer
bar                     int                     from deserializer
boo                     string                  from deserializer

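For reference, DDL along these lines would produce the two tables above (a
sketch reconstructed from the describe output; the text table's field
delimiter is an assumption):

-- plain-text source table (delimiter assumed; adjust to match your data)
create table from_text (foo int, bar int, boo string)
row format delimited fields terminated by '\t'
stored as textfile;

-- 'stored as orc' is what routes writes through the Orc writer
create table orc_test (foo int, bar int, boo string)
stored as orc;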

The job runs, but fails to complete with the following errors (see below).
This seems to be exactly the example covered here:

http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/

I took a few minutes to recompile the protobuf library, as several other
reports mentioned that Hive 0.12 did not ship an updated protobuf library.
That did not remedy the problem.  Any ideas?
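
For reference, the versions in play can be checked roughly like this (a
sketch; the install paths below are assumptions):

# version of the protoc compiler on the PATH
protoc --version
# protobuf jars shipped with Hive and Hadoop (paths assumed)
ls /opt/hive-0.12.0/lib | grep protobuf
ls /opt/hadoop-2.2.0/share/hadoop/common/lib | grep protobuf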


[server:10001] hive> insert into table orc_test select * from from_text;
[Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing
operators
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be
overridden by subclasses.
        at
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
        at
com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
        at
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
        ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
AFAIK, Hive 0.12 uses protobuf 2.4 to compile orcproto.proto. Since you are using protobuf 2.5, there may be an incompatibility between the protobuf versions. Can you try downloading protobuf 2.4.1 from https://code.google.com/p/protobuf/downloads/list and repeating your steps?
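
A sketch of that downgrade, assuming a standard build environment (the
download URL follows the Google Code pattern and may differ):

# fetch and build protobuf 2.4.1
wget https://protobuf.googlecode.com/files/protobuf-2.4.1.tar.gz
tar xzf protobuf-2.4.1.tar.gz && cd protobuf-2.4.1
./configure && make && sudo make install   # installs protoc 2.4.1
cd java && mvn package                     # builds protobuf-java-2.4.1.jar under target/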

Thanks
Prasanth Jayachandran

On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
> 
> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code site.  I compiled it via the following steps:
> (1) ./configure && make (to compile the C code)
> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
> (3) Compiled the org/apache/... directory via javac
> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
> (6) Restarted hive
> 
> Same results before/after protobuf modification.
> 
> Bryan
> 
> 
> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
> What version of protobuf are you using? Are you compiling hive from source?
> 
> Thanks
> Prasanth Jayachandran
> 
> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:
> 
>> Hello.
>> 
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>> 
>> I have a simple test system set up with two datanode/node manager hosts and one namenode/resource manager host.  Hive is running on the namenode and uses a MySQL database for the metastore.
>> 
>> I have created a small table 'from_text' as follows:
>> 
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>> 
>> 
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>> 
>> I then insert the data into my Orc table, 'orc_test':
>> 
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>> 
>> 
>> The job runs, but fails to complete with the following errors (see below).  This seems to be exactly the example covered here:
>> 
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>> 
>> I took a few minutes to recompile the protobuf library, as several other reports mentioned that Hive 0.12 did not ship an updated protobuf library. That did not remedy the problem.  Any ideas?
>> 
>> 
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> 
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
> 
> 



Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

That worked well.  The procedure turned out to be a little more complicated
than described: installing protobuf 2.5.0 breaks the compilation of Hive
(Ivy pulls down the protobuf 2.4.1 jar, which causes compile errors).  Once
I convinced Ivy to provide the 2.5.0 package, Hive compiled correctly, and
queries that create Orc data worked well once it was installed.
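
For anyone hitting the same compile break, the change amounts to bumping the
protobuf version Ivy resolves before rebuilding. Roughly (a sketch; verify
the property name in ivy/libraries.properties of the Hive 0.12 source tree):

# point Ivy at protobuf 2.5.0, then rebuild (requires protoc 2.5.0 installed)
sed -i 's/^protobuf.version=.*/protobuf.version=2.5.0/' ivy/libraries.properties
ant protobuf && ant clean package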

I really appreciate the assistance!


On Thu, Jan 9, 2014 at 8:42 AM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> Hi Bryan
>
> My apologies for the delayed response. I am still on vacation and couldn’t
> get much time to work on this issue. I was able to figure out the reason:
> the issue is an incompatibility between the protobuf version used to
> generate the code (OrcProto.java) and the runtime protobuf jar
> (protobuf-java-2.x.x.jar). This incompatibility is also discussed here:
> https://code.google.com/p/protobuf/issues/detail?id=493
>
> To reproduce your exception, I tried the following:
> 1) Installed protoc version 2.4.1.
> 2) Compiled the Hive source and generated OrcProto.java (with protoc on
> the PATH). I used the following command:
> mvn package -DskipTests -Phadoop-2,protobuf -Pdist
>      This command compiles the orcproto.proto file with 2.4.1, while the
> Maven package pulls in protobuf-2.5.0.jar as a dependency.
> 3) Ran the Hive CLI and followed the steps you had mentioned in this mail
> thread.
>
> Doing the above steps resulted in the same exception as you had posted.
>
> The likely reason for the exception in your case: the hive-0.12.0 binary
> download uses protobuf 2.4.1 to compile the proto file, while hadoop-2.2.0
> uses protobuf 2.5.0. When protobuf-java-2.5.0.jar is present on your
> classpath, it throws a runtime exception. To avoid this, try the solution
> below.
>
> Solution:
> 1) Install protoc 2.5.0 and compile Hive 0.12.0 (*ant protobuf && ant clean
> package*)
> 2) Remove the protobuf-java-2.4.1.jar pulled in by Ivy from the
> build/dist/lib directory
> 3) Copy protobuf-java-2.5.0.jar from hadoop-2.2.0/share/hadoop/common/lib
> to hive/build/dist/lib
> 4) Run the Hive CLI and rerun your queries.
>
> Let me know if this works.
>
> Thanks
> Prasanth Jayachandran
>
> On Dec 31, 2013, at 2:54 AM, Bryan Jeffrey <br...@gmail.com>
> wrote:
>
> Prasanth,
>
> Any luck?
>
>
> On Tue, Dec 24, 2013 at 4:31 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>
>> Prasanth,
>>
>> I am also traveling this week.  Your assistance would be appreciated, but
>> not at the expense of your holiday!
>>
>> Bryan
>> On Dec 24, 2013 2:23 PM, "Prasanth Jayachandran" <
>> pjayachandran@hortonworks.com> wrote:
>>
>>> Bryan
>>>
>>> I have a similar setup. I will try to reproduce this issue and get back
>>> to you asap.
>>> Since I am traveling, expect some delay.
>>>
>>> Thanks
>>> Prasanth
>>>
>>> Sent from my iPhone
>>>
>>> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <br...@gmail.com>
>>> wrote:
>>>
>>> Hello.
>>>
>>> I posted this a few weeks ago, but was unable to get a response that
>>> solved the issue.  I have made no headway in the meantime.  I was hoping
>>> that if I re-summarized the issue, someone might have some advice
>>> regarding this problem.
>>> Running the following version of Hadoop: hadoop-2.2.0
>>> Running the following version of Hive: hive-0.12.0
>>>
>>> I have a simple test system set up with two datanode/node manager hosts
>>> and one namenode/resource manager host.  Hive is running on the namenode
>>> and uses a MySQL database for the metastore.
>>>
>>> I have created a small table 'from_text' as follows:
>>>
>>> [server:10001] hive> describe from_text;
>>> foo                     int                     None
>>> bar                     int                     None
>>> boo                     string                  None
>>>
>>>
>>> [server:10001] hive> select * from from_text;
>>> 1       2       Hello
>>> 2       3       World
>>>
>>> I then insert the data into my Orc table, 'orc_test':
>>>
>>> [server:10001] hive> describe orc_test;
>>> foo                     int                     from deserializer
>>> bar                     int                     from deserializer
>>> boo                     string                  from deserializer
>>>
>>>
>>> The job runs, but fails to complete with the following errors (see
>>> below).  This seems to be exactly the example covered here:
>>>
>>>
>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>
>>> The error output is below.  I have tried several things to solve the
>>> issue, including re-installing Hive 0.12.0 from the binary distribution.
>>>
>>> Help?
>>>
>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>
>>>
>>> Diagnostic Messages for this Task:
>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>> operators
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>         at
>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>>> be overridden by subclasses.
>>>         at
>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>         ... 8 more
>>>
>>>
>>> On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <bryan.jeffrey@gmail.com> wrote:
>>>
>>>> Prasanth,
>>>>
>>>> I downloaded the binary Hive version from the URL you specified.  I
>>>> untarred the Hive tar, copied in configuration files, and started Hive.  I
>>>> continue to see the same error:
>>>>
>>>> [server:10001] hive> describe orc_test;
>>>> foo                     int                     from deserializer
>>>> bar                     int                     from deserializer
>>>> boo                     string                  from deserializer
>>>>
>>>>
>>>> [server:10001] hive> describe from_text;
>>>> foo                     int                     None
>>>> bar                     int                     None
>>>> boo                     string                  None
>>>>
>>>> [server:10001] hive> select * from from_text;
>>>> 1       2       Hello
>>>> 2       3       World
>>>>
>>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>
>>>> From the Hive Log:
>>>> Diagnostic Messages for this Task:
>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>>> operators
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>         at
>>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>         at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>>>> be overridden by subclasses.
>>>>         at
>>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>          at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>         at
>>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>         ... 8 more
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
>>>> pjayachandran@hortonworks.com> wrote:
>>>>
>>>>> Bryan
>>>>>
>>>>> In either case (source download or binary download) you do not need to
>>>>> compile the ORC protobuf component. The Java source generated from the
>>>>> .proto files is already included when you download the Hive 0.12
>>>>> release. I would recommend re-downloading the Hive 0.12 binary release
>>>>> from http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and
>>>>> running Hive directly. After extracting hive-0.12.0-bin.tar.gz, set
>>>>> HIVE_HOME to the extracted directory and run hive. Let me know if you
>>>>> face any issues.
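>>>>>
>>>>> In shell terms, roughly (a sketch; the extracted directory name is an
>>>>> assumption):
>>>>>
>>>>> tar xzf hive-0.12.0-bin.tar.gz
>>>>> export HIVE_HOME=$PWD/hive-0.12.0-bin
>>>>> export PATH=$HIVE_HOME/bin:$PATH
>>>>> hive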
>>>>>
>>>>> Thanks
>>>>> Prasanth Jayachandran
>>>>>
>>>>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Prasanth,
>>>>>
>>>>> I simply compiled the protobuf library, and then compiled the orc
>>>>> protobuf component.  I did not recompile either Hive or custom UDFs/etc.
>>>>>
>>>>> Is a protobuf recompile the solution for this issue, or a dead end?
>>>>>  Has this been seen before?  I looked for more feedback, but most of the
>>>>> Orc issues were associated with Hive 0.11.0.
>>>>>
>>>>> I will try recompiling the 2.4 protobuf version shortly!
>>>>>
>>>>> Bryan
>>>>>
>>>>>
>>>>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
>>>>> pjayachandran@hortonworks.com> wrote:
>>>>>
>>>>>> Also what are you doing with steps 2 through 5? Compiling hive or
>>>>>> your custom code?
>>>>>>
>>>>>> Thanks
>>>>>> Prasanth Jayachandran
>>>>>>
>>>>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Prasanth,
>>>>>>
>>>>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did
>>>>>> not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google
>>>>>> Code site.  I compiled it via the following steps:
>>>>>> (1) ./configure && make (to compile the C code)
>>>>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>>>>>> ../src/google/protobuf/orc.proto
>>>>>> (3) Compiled the org/apache/... directory via javac
>>>>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>>>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>>>>> (6) Restarted hive
>>>>>>
>>>>>> Same results before/after protobuf modification.
>>>>>>
>>>>>> Bryan
>>>>>>
>>>>>>
>>>>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>>>>>> pjayachandran@hortonworks.com> wrote:
>>>>>>
>>>>>>> What version of protobuf are you using? Are you compiling hive from
>>>>>>> source?
>>>>>>>
>>>>>>>  Thanks
>>>>>>> Prasanth Jayachandran
>>>>>>>
>>>>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>   Hello.
>>>>>>>
>>>>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>>>>> Running the following version of Hive: hive-0.12.0
>>>>>>>
>>>>>>> I have a simple test system set up with two datanode/node manager
>>>>>>> hosts and one namenode/resource manager host.  Hive is running on the
>>>>>>> namenode and uses a MySQL database for the metastore.
>>>>>>>
>>>>>>> I have created a small table 'from_text' as follows:
>>>>>>>
>>>>>>> [server:10001] hive> describe from_text;
>>>>>>> foo                     int                     None
>>>>>>> bar                     int                     None
>>>>>>> boo                     string                  None
>>>>>>>
>>>>>>>
>>>>>>> [server:10001] hive> select * from from_text;
>>>>>>> 1       2       Hello
>>>>>>> 2       3       World
>>>>>>>
>>>>>>> I then insert the data into my Orc table, 'orc_test':
>>>>>>>
>>>>>>> [server:10001] hive> describe orc_test;
>>>>>>> foo                     int                     from deserializer
>>>>>>> bar                     int                     from deserializer
>>>>>>> boo                     string                  from deserializer
>>>>>>>
>>>>>>>
>>>>>>> The job runs, but fails to complete with the following errors (see
>>>>>>> below).  This seems to be exactly the example covered here:
>>>>>>>
>>>>>>>
>>>>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>>>>
>>>>>>> I took a few minutes to recompile the protobuf library, as several
>>>>>>> other reports mentioned that Hive 0.12 did not ship an updated
>>>>>>> protobuf library. That did not remedy the problem.  Any ideas?
>>>>>>>
>>>>>>>
>>>>>>> [server:10001] hive> insert into table orc_test select * from
>>>>>>> from_text;
>>>>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED:
>>>>>>> Execution Error, return code 2 from
>>>>>>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>
>>>>>>>
>>>>>>> Diagnostic Messages for this Task:
>>>>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>>>>>> operators
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>>>>         at
>>>>>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>>>         at
>>>>>>> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>>         at
>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>>>         at
>>>>>>> org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed
>>>>>>> to be overridden by subclasses.
>>>>>>>         at
>>>>>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>>>>          at
>>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>>         at
>>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>>>>         at
>>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>>         at
>>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>>>>         at
>>>>>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>>>>         ... 8 more
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>
>

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Hi Bryan

My apologies for the delayed response. I am still on vacation and couldn’t get much time to work on this issue. I was able to figure out the reason: the issue is an incompatibility between the protobuf version used to generate the code (OrcProto.java) and the runtime protobuf jar (protobuf-java-2.x.x.jar). This incompatibility is also discussed here: https://code.google.com/p/protobuf/issues/detail?id=493

To reproduce your exception, I tried the following:
1) Installed protoc version 2.4.1.
2) Compiled the Hive source and generated OrcProto.java (with protoc on the PATH). I used the following command:
	mvn package -DskipTests -Phadoop-2,protobuf -Pdist
     This command compiles the orcproto.proto file with 2.4.1, while the Maven package pulls in protobuf-2.5.0.jar as a dependency.
3) Ran the Hive CLI and followed the steps you had mentioned in this mail thread.

Doing the above steps resulted in the same exception as you had posted.

The likely reason for the exception in your case: the hive-0.12.0 binary download uses protobuf 2.4.1 to compile the proto file, while hadoop-2.2.0 uses protobuf 2.5.0. When protobuf-java-2.5.0.jar is present on your classpath, it throws a runtime exception. To avoid this, try the solution below.

Solution:
1) Install protoc 2.5.0 and compile Hive 0.12.0 (ant protobuf && ant clean package)
2) Remove the protobuf-java-2.4.1.jar pulled in by Ivy from the build/dist/lib directory
3) Copy protobuf-java-2.5.0.jar from hadoop-2.2.0/share/hadoop/common/lib to hive/build/dist/lib
4) Run the Hive CLI and rerun your queries.
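
In shell terms, the above is roughly (HIVE_SRC and HADOOP_HOME are
placeholders for your Hive source tree and Hadoop install):

cd $HIVE_SRC
ant protobuf && ant clean package
# drop the 2.4.1 jar Ivy pulled in; use Hadoop's 2.5.0 jar instead
rm build/dist/lib/protobuf-java-2.4.1.jar
cp $HADOOP_HOME/share/hadoop/common/lib/protobuf-java-2.5.0.jar build/dist/lib/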

Let me know if this works.

Thanks
Prasanth Jayachandran

On Dec 31, 2013, at 2:54 AM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
> 
> Any luck?
> 
> 
> On Tue, Dec 24, 2013 at 4:31 PM, Bryan Jeffrey <br...@gmail.com> wrote:
> Prasanth,
> 
> I am also traveling this week.  Your assistance would be appreciated, but not at the expense of your holiday!
> 
> Bryan
> 
> On Dec 24, 2013 2:23 PM, "Prasanth Jayachandran" <pj...@hortonworks.com> wrote:
> Bryan
> 
> I have a similar setup. I will try to reproduce this issue and get back to you asap. 
> Since I am traveling, expect some delay.
> 
> Thanks 
> Prasanth
> 
> Sent from my iPhone
> 
> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <br...@gmail.com> wrote:
> 
>> Hello.  
>> 
>> I posted this a few weeks ago, but was unable to get a response that solved the issue.  I have made no headway in the meantime.  I was hoping that if I re-summarized the issue, someone might have some advice regarding this problem.
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>> 
>> I have a simple test system set up with two datanode/node manager hosts and one namenode/resource manager host.  Hive is running on the namenode and uses a MySQL database for the metastore.
>> 
>> I have created a small table 'from_text' as follows:
>> 
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>> 
>> 
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>> 
>> I then insert the data into my Orc table, 'orc_test':
>> 
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>> 
>> 
>> The job runs, but fails to complete with the following errors (see below).  This seems to be exactly the example covered here:
>> 
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>> 
>> The error output is below.  I have tried several things to solve the issue, including re-installing Hive 0.12.0 from the binary distribution.
>> 
>> Help?
>> 
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> 
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
>> 
>> 
>> On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <br...@gmail.com> wrote:
>> Prasanth,
>> 
>> I downloaded the binary Hive version from the URL you specified.  I untarred the Hive tar, copied in configuration files, and started Hive.  I continue to see the same error:
>> 
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>> 
>> 
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>> 
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>> 
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> From the Hive Log:
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
>> 
>> 
>> 
>> 
>> 
>> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>> Bryan
>> 
>> In either case (source download or binary download) you do not need to compile the ORC protobuf component. The Java source generated from the .proto files is already included when you download the Hive 0.12 release. I would recommend re-downloading the Hive 0.12 binary release from http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running Hive directly. After extracting hive-0.12.0-bin.tar.gz, set HIVE_HOME to the extracted directory and run hive. Let me know if you face any issues.
>> 
>> Thanks
>> Prasanth Jayachandran
>> 
>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>> 
>>> Prasanth,
>>> 
>>> I simply compiled the protobuf library, and then compiled the orc protobuf component.  I did not recompile either Hive or custom UDFs/etc.  
>>> 
>>> Is a protobuf recompile the solution for this issue, or a dead end?  Has this been seen before?  I looked for more feedback, but most of the Orc issues were associated with Hive 0.11.0.
>>> 
>>> I will try recompiling the 2.4 protobuf version shortly!
>>> 
>>> Bryan
>>> 
>>> 
>>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>>> Also what are you doing with steps 2 through 5? Compiling hive or your custom code?
>>> 
>>> Thanks
>>> Prasanth Jayachandran
>>> 
>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>>> 
>>>> Prasanth,
>>>> 
>>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code site.  I compiled it via the following steps:
>>>> (1) ./configure && make (to compile the C code)
>>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
>>>> (3) Compiled the org/apache/... directory via javac
>>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>>> (6) Restarted hive
>>>> 
>>>> Same results before/after protobuf modification.
>>>> 
>>>> Bryan
>>>> 
>>>> 
>>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>>>> What version of protobuf are you using? Are you compiling hive from source?
>>>> 
>>>> Thanks
>>>> Prasanth Jayachandran
>>>> 
>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>>>> 
>>>>> Hello.
>>>>> 
>>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>>> Running the following version of Hive: hive-0.12.0
>>>>> 
>>>>> I have a simple test system set up with two datanode/node manager hosts and one namenode/resource manager host.  Hive is running on the namenode and uses a MySQL database for the metastore.
>>>>> 
>>>>> I have created a small table 'from_text' as follows:
>>>>> 
>>>>> [server:10001] hive> describe from_text;
>>>>> foo                     int                     None
>>>>> bar                     int                     None
>>>>> boo                     string                  None
>>>>> 
>>>>> 
>>>>> [server:10001] hive> select * from from_text;
>>>>> 1       2       Hello
>>>>> 2       3       World
>>>>> 
>>>>> I then insert the data into my Orc table, 'orc_test':
>>>>> 
>>>>> [server:10001] hive> describe orc_test;
>>>>> foo                     int                     from deserializer
>>>>> bar                     int                     from deserializer
>>>>> boo                     string                  from deserializer
>>>>> 
>>>>> 
>>>>> The job runs, but fails to complete with the following errors (see below).  This seems to be exactly the example covered here:
>>>>> 
>>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>> 
>>>>> I took a few minutes to recompile the protobuf library, as several other reports mentioned that Hive 0.12 did not ship an updated protobuf library. That did not remedy the problem.  Any ideas?
>>>>> 
>>>>> 
>>>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>> 
>>>>> 
>>>>> Diagnostic Messages for this Task:
>>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>>>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>>         ... 8 more
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 



Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

Any luck?


On Tue, Dec 24, 2013 at 4:31 PM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
>
> I am also traveling this week.  Your assistance would be appreciated, but
> not at the expense of your holiday!
>
> Bryan
> On Dec 24, 2013 2:23 PM, "Prasanth Jayachandran" <
> pjayachandran@hortonworks.com> wrote:
>
>> Bryan
>>
>> I have a similar setup. I will try to reproduce this issue and get back
>> to you asap.
>> Since I am traveling, expect some delay.
>>
>> Thanks
>> Prasanth
>>
>> Sent from my iPhone
>>
>> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <br...@gmail.com>
>> wrote:
>>
>> Hello.
>>
>> I posted this a few weeks ago, but was unable to get a response that
>> solved the issue.  I have made no headway in the mean time.  I was hoping
>> that if I re-summarized the issue that someone would have some advice
>> regarding this problem.
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>>
>> I have a simple test system setup with (2) datanodes/node manager and (1)
>> namenode/resource manager.  Hive is running on the namenode, and contacting
>> a MySQL database for metastore.
>>
>> I have created a small table 'from_text' as follows:
>>
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>>
>>
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>>
>> I go to insert the data into my Orc table, 'orc_test':
>>
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>>
>>
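The thread never shows the DDL for these two tables. A minimal pair of
statements that matches the describe output above, assuming a plain
tab-delimited text layout for from_text, would be:

[server:10001] hive> CREATE TABLE from_text (foo INT, bar INT, boo STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
[server:10001] hive> CREATE TABLE orc_test (foo INT, bar INT, boo STRING) STORED AS ORC;

The STORED AS ORC clause is what routes the failing insert below through the
ORC writer (the OrcOutputFormat that appears in the stack trace).
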
>> The job runs, but fails to complete with the following errors (see
>> below).  This seems to be exactly the example covered here:
>>
>>
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>
>> The error output is below.  I have tried several things to solve the
>> issue, including re-installing Hive 0.12.0 from binary install.
>>
>> Help?
>>
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>
>>
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>> operators
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>> be overridden by subclasses.
>>         at
>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at
>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
>>
>>
>> On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <br...@gmail.com> wrote:
>>
>>> Prasanth,
>>>
>>> I downloaded the binary Hive version from the URL you specified.  I
>>> untarred the Hive tar, copied in configuration files, and started Hive.  I
>>> continue to see the same error:
>>>
>>> [server:10001] hive> describe orc_test;
>>> foo                     int                     from deserializer
>>> bar                     int                     from deserializer
>>> boo                     string                  from deserializer
>>>
>>>
>>> [server:10001] hive> describe from_text;
>>> foo                     int                     None
>>> bar                     int                     None
>>> boo                     string                  None
>>>
>>> [server:10001] hive> select * from from_text;
>>> 1       2       Hello
>>> 2       3       World
>>>
>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>
>>> From the Hive Log:
>>> Diagnostic Messages for this Task:
>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>> operators
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>         at
>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>>> be overridden by subclasses.
>>>         at
>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>          at
>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>         at
>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>         ... 8 more
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
>>> pjayachandran@hortonworks.com> wrote:
>>>
>>>> Bryan
>>>>
>>>> In either case (source download or binary download) you do not need to
>>>> compile the orc protobuf component. The Java source generated from the
>>>> .proto files is already available when you download the hive 0.12 release.
>>>> I would recommend re-downloading the hive 0.12 binary release from
>>>> http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running
>>>> hive directly. After extracting hive-0.12.0-bin.tar.gz
>>>> (http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz),
>>>> set HIVE_HOME to the extracted directory and run hive. Let me know if you
>>>> face any issues.
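Spelled out as shell commands, the reinstall suggested here looks roughly
like the following sketch, assuming a bash shell, an unpack location of
/opt, and that the tarball extracts to hive-0.12.0-bin:

$ wget http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz
$ tar -xzf hive-0.12.0-bin.tar.gz -C /opt     # unpack the binary release
$ export HIVE_HOME=/opt/hive-0.12.0-bin       # point HIVE_HOME at the fresh install
$ export PATH=$HIVE_HOME/bin:$PATH
$ hive                                        # run the bundled launcher

Site configuration (hive-site.xml and friends) then goes under
$HIVE_HOME/conf, which matches the "copied in configuration files" step
Bryan describes elsewhere in the thread.
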
>>>>
>>>> Thanks
>>>> Prasanth Jayachandran
>>>>
>>>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com>
>>>> wrote:
>>>>
>>>> Prasanth,
>>>>
>>>> I simply compiled the protobuf library, and then compiled the orc
>>>> protobuf component.  I did not recompile either Hive or custom UDFs/etc.
>>>>
>>>> Is a protobuf recompile the solution for this issue, or a dead end?
>>>>  Has this been seen before?  I looked for more feedback, but most of the
>>>> Orc issues were associated with Hive 0.11.0.
>>>>
>>>> I will try recompiling the 2.4 protobuf version shortly!
>>>>
>>>> Bryan
>>>>
>>>>
>>>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
>>>> pjayachandran@hortonworks.com> wrote:
>>>>
>>>>> Also what are you doing with steps 2 through 5? Compiling hive or your
>>>>> custom code?
>>>>>
>>>>> Thanks
>>>>> Prasanth Jayachandran
>>>>>
>>>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Prasanth,
>>>>>
>>>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did
>>>>> not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google
>>>>> Code site.  I compiled it via the following steps:
>>>>> (1) ./configure && make (to compile the C code)
>>>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>>>>> ../src/google/protobuf/orc.proto
>>>>> (3) Compiled the org/apache/... directory via javac
>>>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>>>> (6) Restarted hive
>>>>>
>>>>> Same results before/after protobuf modification.
>>>>>
>>>>> Bryan
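A note on the exception itself: "This is supposed to be overridden by
subclasses" thrown from GeneratedMessage.getUnknownFields is the classic
symptom of classes generated by protoc 2.4.x running against a
protobuf-java 2.5.0 runtime, or of two different protobuf jars shadowing
one another on the classpath. A quick check of which runtimes are in play,
sketched for a typical layout (the paths are assumptions):

$ ls $HIVE_HOME/lib | grep -i protobuf            # runtime jar shipped with Hive 0.12
$ find $HADOOP_HOME -name 'protobuf-java-*.jar'   # runtime jar(s) Hadoop 2.2 puts on the task classpath

If the two report different 2.x lines, the hand-built jar from step (4),
generated from 2.5.0 sources but named protobuf-java-2.4.1.jar, would only
deepen the mismatch.
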
>>>>>
>>>>>
>>>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>>>>> pjayachandran@hortonworks.com> wrote:
>>>>>
>>>>>> What version of protobuf are you using? Are you compiling hive from
>>>>>> source?
>>>>>>
>>>>>>  Thanks
>>>>>> Prasanth Jayachandran
>>>>>>
>>>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>>>> Running the following version of Hive: hive-0.12.0
>>>>>>
>>>>>> I have a simple test system setup with (2) datanodes/node manager and
>>>>>> (1) namenode/resource manager.  Hive is running on the namenode, and
>>>>>> contacting a MySQL database for metastore.
>>>>>>
>>>>>> I have created a small table 'from_text' as follows:
>>>>>>
>>>>>> [server:10001] hive> describe from_text;
>>>>>> foo                     int                     None
>>>>>> bar                     int                     None
>>>>>> boo                     string                  None
>>>>>>
>>>>>>
>>>>>> [server:10001] hive> select * from from_text;
>>>>>> 1       2       Hello
>>>>>> 2       3       World
>>>>>>
>>>>>> I go to insert the data into my Orc table, 'orc_test':
>>>>>>
>>>>>> [server:10001] hive> describe orc_test;
>>>>>> foo                     int                     from deserializer
>>>>>> bar                     int                     from deserializer
>>>>>> boo                     string                  from deserializer
>>>>>>
>>>>>>
>>>>>> The job runs, but fails to complete with the following errors (see
>>>>>> below).  This seems to be exactly the example covered here:
>>>>>>
>>>>>>
>>>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>>>
>>>>>> I took a few minutes to recompile the protobuf library, as several
>>>>>> other reports mentioned that Hive 0.12 did not have the protobuf library
>>>>>> updated. That did not remedy the problem.  Any ideas?
>>>>>>
>>>>>>
>>>>>> [server:10001] hive> insert into table orc_test select * from
>>>>>> from_text;
>>>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED:
>>>>>> Execution Error, return code 2 from
>>>>>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>
>>>>>>
>>>>>> Diagnostic Messages for this Task:
>>>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>>>>> operators
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>>>         at
>>>>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>>         at
>>>>>> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>         at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed
>>>>>> to be overridden by subclasses.
>>>>>>         at
>>>>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>>>          at
>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>         at
>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>>>         at
>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>         at
>>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>>>         at
>>>>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>         at
>>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>>>         ... 8 more
>>>>>>
>>>>>>
>>>>>>
>
>

Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

I am also traveling this week.  Your assistance would be appreciated, but
not at the expense of your holiday!

Bryan
On Dec 24, 2013 2:23 PM, "Prasanth Jayachandran" <
pjayachandran@hortonworks.com> wrote:

> Bryan
>
> I have a similar setup. I will try to reproduce this issue and get back to
> you asap.
> Since I am traveling, expect some delay.
>
> Thanks
> Prasanth
>
> Sent from my iPhone
>
> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <br...@gmail.com>
> wrote:
>
> Hello.
>
> I posted this a few weeks ago, but was unable to get a response that
> solved the issue.  I have made no headway in the mean time.  I was hoping
> that if I re-summarized the issue that someone would have some advice
> regarding this problem.
> Running the following version of Hadoop: hadoop-2.2.0
> Running the following version of Hive: hive-0.12.0
>
> I have a simple test system setup with (2) datanodes/node manager and (1)
> namenode/resource manager.  Hive is running on the namenode, and contacting
> a MySQL database for metastore.
>
> I have created a small table 'from_text' as follows:
>
> [server:10001] hive> describe from_text;
> foo                     int                     None
> bar                     int                     None
> boo                     string                  None
>
>
> [server:10001] hive> select * from from_text;
> 1       2       Hello
> 2       3       World
>
> I go to insert the data into my Orc table, 'orc_test':
>
> [server:10001] hive> describe orc_test;
> foo                     int                     from deserializer
> bar                     int                     from deserializer
> boo                     string                  from deserializer
>
>
> The job runs, but fails to complete with the following errors (see
> below).  This seems to be exactly the example covered here:
>
>
> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>
> The error output is below.  I have tried several things to solve the
> issue, including re-installing Hive 0.12.0 from binary install.
>
> Help?
>
> [server:10001] hive> insert into table orc_test select * from from_text;
> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
>
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing
> operators
>         at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.lang.UnsupportedOperationException: This is supposed to be
> overridden by subclasses.
>         at
> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>         at
> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>         at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>         at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>         ... 8 more
>
>
> On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <br...@gmail.com> wrote:
>
>> Prasanth,
>>
>> I downloaded the binary Hive version from the URL you specified.  I
>> untarred the Hive tar, copied in configuration files, and started Hive.  I
>> continue to see the same error:
>>
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>>
>>
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>>
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>>
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>
>> From the Hive Log:
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>> operators
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>> be overridden by subclasses.
>>         at
>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>          at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at
>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
>>
>>
>>
>>
>>
>> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
>> pjayachandran@hortonworks.com> wrote:
>>
>>> Bryan
>>>
>>> In either case (source download or binary download) you do not need to
>>> compile the orc protobuf component. The Java source generated from the
>>> .proto files is already available when you download the hive 0.12 release.
>>> I would recommend re-downloading the hive 0.12 binary release from
>>> http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running
>>> hive directly. After extracting hive-0.12.0-bin.tar.gz
>>> (http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz),
>>> set HIVE_HOME to the extracted directory and run hive. Let me know if you
>>> face any issues.
>>>
>>> Thanks
>>> Prasanth Jayachandran
>>>
>>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com>
>>> wrote:
>>>
>>> Prasanth,
>>>
>>> I simply compiled the protobuf library, and then compiled the orc
>>> protobuf component.  I did not recompile either Hive or custom UDFs/etc.
>>>
>>> Is a protobuf recompile the solution for this issue, or a dead end?  Has
>>> this been seen before?  I looked for more feedback, but most of the Orc
>>> issues were associated with Hive 0.11.0.
>>>
>>> I will try recompiling the 2.4 protobuf version shortly!
>>>
>>> Bryan
>>>
>>>
>>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
>>> pjayachandran@hortonworks.com> wrote:
>>>
>>>> Also what are you doing with steps 2 through 5? Compiling hive or your
>>>> custom code?
>>>>
>>>> Thanks
>>>> Prasanth Jayachandran
>>>>
>>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
>>>> wrote:
>>>>
>>>> Prasanth,
>>>>
>>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did
>>>> not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google
>>>> Code site.  I compiled it via the following steps:
>>>> (1) ./configure && make (to compile the C code)
>>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>>>> ../src/google/protobuf/orc.proto
>>>> (3) Compiled the org/apache/... directory via javac
>>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>>> (6) Restarted hive
>>>>
>>>> Same results before/after protobuf modification.
>>>>
>>>> Bryan
>>>>
>>>>
>>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>>>> pjayachandran@hortonworks.com> wrote:
>>>>
>>>>> What version of protobuf are you using? Are you compiling hive from
>>>>> source?
>>>>>
>>>>>  Thanks
>>>>> Prasanth Jayachandran
>>>>>
>>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello.
>>>>>
>>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>>> Running the following version of Hive: hive-0.12.0
>>>>>
>>>>> I have a simple test system setup with (2) datanodes/node manager and
>>>>> (1) namenode/resource manager.  Hive is running on the namenode, and
>>>>> contacting a MySQL database for metastore.
>>>>>
>>>>> I have created a small table 'from_text' as follows:
>>>>>
>>>>> [server:10001] hive> describe from_text;
>>>>> foo                     int                     None
>>>>> bar                     int                     None
>>>>> boo                     string                  None
>>>>>
>>>>>
>>>>> [server:10001] hive> select * from from_text;
>>>>> 1       2       Hello
>>>>> 2       3       World
>>>>>
>>>>> I go to insert the data into my Orc table, 'orc_test':
>>>>>
>>>>> [server:10001] hive> describe orc_test;
>>>>> foo                     int                     from deserializer
>>>>> bar                     int                     from deserializer
>>>>> boo                     string                  from deserializer
>>>>>
>>>>>
>>>>> The job runs, but fails to complete with the following errors (see
>>>>> below).  This seems to be exactly the example covered here:
>>>>>
>>>>>
>>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>>
>>>>> I took a few minutes to recompile the protobuf library, as several other
>>>>> reports mentioned that Hive 0.12 did not have the protobuf library
>>>>> updated. That did not remedy the problem.  Any ideas?
>>>>>
>>>>>
>>>>> [server:10001] hive> insert into table orc_test select * from
>>>>> from_text;
>>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED:
>>>>> Execution Error, return code 2 from
>>>>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>
>>>>>
>>>>> Diagnostic Messages for this Task:
>>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>>>> operators
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>>         at
>>>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>         at
>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed
>>>>> to be overridden by subclasses.
>>>>>         at
>>>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>>          at
>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>         at
>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>>         at
>>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>         at
>>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>>         at
>>>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>         at
>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>>         ... 8 more
>>>>>
>>>>>
>>>>>

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Bryan

I have a similar setup. I will try to reproduce this issue and get back to you asap. 
Since I am traveling, expect some delay.

Thanks 
Prasanth

Sent from my iPhone

> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <br...@gmail.com> wrote:
> 
> Hello.  
> 
> I posted this a few weeks ago, but was unable to get a response that solved the issue.  I have made no headway in the mean time.  I was hoping that if I re-summarized the issue that someone would have some advice regarding this problem.
> Running the following version of Hadoop: hadoop-2.2.0
> Running the following version of Hive: hive-0.12.0
> 
> I have a simple test system setup with (2) datanodes/node manager and (1) namenode/resource manager.  Hive is running on the namenode, and contacting a MySQL database for metastore.
> 
> I have created a small table 'from_text' as follows:
> 
> [server:10001] hive> describe from_text;
> foo                     int                     None
> bar                     int                     None
> boo                     string                  None
> 
> 
> [server:10001] hive> select * from from_text;
> 1       2       Hello
> 2       3       World
> 
> I go to insert the data into my Orc table, 'orc_test':
> 
> [server:10001] hive> describe orc_test;
> foo                     int                     from deserializer
> bar                     int                     from deserializer
> boo                     string                  from deserializer
> 
> 
> The job runs, but fails to complete with the following errors (see below).  This seems to be exactly the example covered here:
> 
> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
> 
> The error output is below.  I have tried several things to solve the issue, including re-installing Hive 0.12.0 from binary install.
> 
> Help?
> 
> [server:10001] hive> insert into table orc_test select * from from_text;
> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> 
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>         ... 8 more
> 
> 
>> On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <br...@gmail.com> wrote:
>> Prasanth,
>> 
>> I downloaded the binary Hive version from the URL you specified.  I untarred the Hive tar, copied in configuration files, and started Hive.  I continue to see the same error:
>> 
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>> 
>> 
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>> 
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>> 
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> From the Hive Log:
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more
>> 
>> 
>> 
>> 
>> 
>>> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>>> Bryan
>>> 
>>> In either case (source download or binary download) you do not need to compile the orc protobuf component. The Java source generated from the .proto files is already available when you download the hive 0.12 release. I would recommend re-downloading the hive 0.12 binary release from http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running hive directly. After extracting hive-0.12.0-bin.tar.gz, set HIVE_HOME to the extracted directory and run hive. Let me know if you face any issues.
>>> 
>>> Thanks
>>> Prasanth Jayachandran
>>> 
>>>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>>>> 
>>>> Prasanth,
>>>> 
>>>> I simply compiled the protobuf library, and then compiled the orc protobuf component.  I did not recompile either Hive or custom UDFs/etc.  
>>>> 
>>>> Is a protobuf recompile the solution for this issue, or a dead end?  Has this been seen before?  I looked for more feedback, but most of the Orc issues were associated with Hive 0.11.0.
>>>> 
>>>> I will try recompiling the 2.4 protobuf version shortly!
>>>> 
>>>> Bryan
>>>> 
>>>> 
>>>>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>>>>> Also what are you doing with steps 2 through 5? Compiling hive or your custom code?
>>>>> 
>>>>> Thanks
>>>>> Prasanth Jayachandran
>>>>> 
>>>>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>>>>>> 
>>>>>> Prasanth,
>>>>>> 
>>>>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code site.  I compiled it via the following steps:
>>>>>> (1) ./configure && make (to compile the C code)
>>>>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
>>>>>> (3) Compiled the org/apache/... directory via javac
>>>>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>>>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>>>>> (6) Restarted hive
>>>>>> 
>>>>>> Same results before/after protobuf modification.
>>>>>> 
>>>>>> Bryan
>>>>>> 
>>>>>> 
>>>>>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>>>>>>> What version of protobuf are you using? Are you compiling hive from source?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> Prasanth Jayachandran
>>>>>>> 
>>>>>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hello.
>>>>>>>> 
>>>>>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>>>>>> Running the following version of Hive: hive-0.12.0
>>>>>>>> 
>>>>>>>> I have a simple test system setup with (2) datanodes/node manager and (1) namenode/resource manager.  Hive is running on the namenode, and contacting a MySQL database for metastore.
>>>>>>>> 
>>>>>>>> I have created a small table 'from_text' as follows:
>>>>>>>> 
>>>>>>>> [server:10001] hive> describe from_text;
>>>>>>>> foo                     int                     None
>>>>>>>> bar                     int                     None
>>>>>>>> boo                     string                  None
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [server:10001] hive> select * from from_text;
>>>>>>>> 1       2       Hello
>>>>>>>> 2       3       World
>>>>>>>> 
>>>>>>>> I go to insert the data into my Orc table, 'orc_test':
>>>>>>>> 
>>>>>>>> [server:10001] hive> describe orc_test;
>>>>>>>> foo                     int                     from deserializer
>>>>>>>> bar                     int                     from deserializer
>>>>>>>> boo                     string                  from deserializer
>>>>>>>> 
>>>>>>>> 
>>>>>>>> The job runs, but fails to complete with the following errors (see below).  This seems to be exactly the example covered here:
>>>>>>>> 
>>>>>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>>>>> 
>>>>>>>> I took a few minutes to recompile the protobuf library, as several other reports mentioned that Hive 0.12 did not have the protobuf library updated. That did not remedy the problem.  Any ideas?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>>>>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Diagnostic Messages for this Task:
>>>>>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>>>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>>>>>>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>>>>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>>>>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Hello.

I posted this a few weeks ago but did not get a response that solved the
issue, and I have made no headway in the meantime. I am hoping that
re-summarizing the problem will prompt some advice.

Running the following version of Hadoop: hadoop-2.2.0
Running the following version of Hive: hive-0.12.0

I have a simple test system set up with two datanodes/node managers and
one namenode/resource manager.  Hive runs on the namenode and uses a MySQL
database for its metastore.

I have created a small table 'from_text' as follows:

[server:10001] hive> describe from_text;
foo                     int                     None
bar                     int                     None
boo                     string                  None


[server:10001] hive> select * from from_text;
1       2       Hello
2       3       World

I then insert the data into my Orc table, 'orc_test':

[server:10001] hive> describe orc_test;
foo                     int                     from deserializer
bar                     int                     from deserializer
boo                     string                  from deserializer
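
For reference, the two tables look as though they were created along these
lines. This is only a sketch reconstructed from the describe output above;
the original DDL (for instance the text table's delimiters or table
properties) may differ:

create table from_text (foo int, bar int, boo string);
create table orc_test (foo int, bar int, boo string) stored as orc;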


The job runs, but fails to complete with the following errors (see below).
This appears to be exactly the example covered here:

http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/

The error output is below.  I have tried several things to solve the issue,
including re-installing Hive 0.12.0 from the binary distribution.

Help?

[server:10001] hive> insert into table orc_test select * from from_text;
[Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing
operators
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be
overridden by subclasses.
        at
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
        at
com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
        at
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
        ... 8 more
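
A note for anyone who lands on this trace: "This is supposed to be
overridden by subclasses" thrown from GeneratedMessage.getUnknownFields is
the classic symptom of a protobuf version mismatch, where message classes
generated against one protobuf release run on a runtime jar from another
(generated code from an older protoc does not override getUnknownFields,
and the 2.5.0 runtime's base implementation throws). Hadoop 2.2.0 bundles
protobuf-java-2.5.0, so a stale or hand-modified protobuf jar anywhere on
the task classpath can produce exactly this. A quick check of which
protobuf jars the classpath exposes (a sketch; output varies by install):

hadoop classpath | tr ':' '\n' | grep -i protobuf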


On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
>
> I downloaded the binary Hive version from the URL you specified.  I
> untarred the Hive tar, copied in configuration files, and started Hive.  I
> continue to see the same error:
>
> [server:10001] hive> describe orc_test;
> foo                     int                     from deserializer
> bar                     int                     from deserializer
> boo                     string                  from deserializer
>
>
> [server:10001] hive> describe from_text;
> foo                     int                     None
> bar                     int                     None
> boo                     string                  None
>
> [server:10001] hive> select * from from_text;
> 1       2       Hello
> 2       3       World
>
> [server:10001] hive> insert into table orc_test select * from from_text;
> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
> From the Hive Log:
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing
> operators
>         at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.lang.UnsupportedOperationException: This is supposed to be
> overridden by subclasses.
>         at
> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>         at
> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at
> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at
> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at
> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>         at
> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>         at
> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>         at
> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>         at
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>         at
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>         at
> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>         at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>         at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>         ... 8 more
>
>
>
>
>
> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
> pjayachandran@hortonworks.com> wrote:
>
>> Bryan
>>
>> In either case (source download or binary download) you do not need to
>> compile the orc protobuf component. The Java source generated from the
>> .proto files should already be available when you download the Hive 0.12
>> release. I would recommend re-downloading the Hive 0.12 binary release from
>> http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running
>> hive directly. After extracting hive-0.12.0-bin.tar.gz, set
>> HIVE_HOME to the extracted directory and run hive. Let me know if you face
>> any issues.
>>
>> Thanks
>> Prasanth Jayachandran
>>
>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com>
>> wrote:
>>
>> Prasanth,
>>
>> I simply compiled the protobuf library, and then compiled the orc
>> protobuf component.  I did not recompile either Hive or custom UDFs/etc.
>>
>> Is a protobuf recompile the solution for this issue, or a dead end?  Has
>> this been seen before?  I looked for more feedback, but most of the Orc
>> issues were associated with Hive 0.11.0.
>>
>> I will try recompiling the 2.4 protobuf version shortly!
>>
>> Bryan
>>
>>
>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
>> pjayachandran@hortonworks.com> wrote:
>>
>>> Also what are you doing with steps 2 through 5? Compiling hive or your
>>> custom code?
>>>
>>> Thanks
>>> Prasanth Jayachandran
>>>
>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
>>> wrote:
>>>
>>> Prasanth,
>>>
>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did
>>> not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google
>>> Code site.  I compiled it via the following steps:
>>> (1) ./configure && make (to compile the C code)
>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>>> ../src/google/protobuf/orc.proto
>>> (3) Compiled the org/apache/... directory via javac
>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>> (6) Restarted hive
>>>
>>> Same results before/after protobuf modification.
>>>
>>> Bryan
>>>
>>>
>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>>> pjayachandran@hortonworks.com> wrote:
>>>
>>>> What version of protobuf are you using? Are you compiling hive from
>>>> source?
>>>>
>>>> Thanks
>>>> Prasanth Jayachandran
>>>>
>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>>>> wrote:
>>>>
>>>> Hello.
>>>>
>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>> Running the following version of Hive: hive-0.12.0
>>>>
>>>> I have a simple test system set up with two datanodes/node managers
>>>> and one namenode/resource manager.  Hive runs on the namenode and uses
>>>> a MySQL database for its metastore.
>>>>
>>>> I have created a small table 'from_text' as follows:
>>>>
>>>> [server:10001] hive> describe from_text;
>>>> foo                     int                     None
>>>> bar                     int                     None
>>>> boo                     string                  None
>>>>
>>>>
>>>> [server:10001] hive> select * from from_text;
>>>> 1       2       Hello
>>>> 2       3       World
>>>>
>>>> I then insert the data into my Orc table, 'orc_test':
>>>>
>>>> [server:10001] hive> describe orc_test;
>>>> foo                     int                     from deserializer
>>>> bar                     int                     from deserializer
>>>> boo                     string                  from deserializer
>>>>
>>>>
>>>> The job runs, but fails to complete with the following errors (see
>>>> below).  This appears to be exactly the example covered here:
>>>>
>>>>
>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>
>>>> I took a few minutes to recompile the protobuf library, as several
>>>> other reports mentioned that Hive 0.12 did not have the protobuf
>>>> library updated. That did not remedy the problem.  Any ideas?
>>>>
>>>>
>>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>
>>>>
>>>> Diagnostic Messages for this Task:
>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>>> operators
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>         at
>>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>         at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>>>> be overridden by subclasses.
>>>>         at
>>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at
>>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>         at
>>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>         at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

I downloaded the binary Hive version from the URL you specified.  I
untarred the Hive tar, copied in configuration files, and started Hive.  I
continue to see the same error:

[server:10001] hive> describe orc_test;
foo                     int                     from deserializer
bar                     int                     from deserializer
boo                     string                  from deserializer


[server:10001] hive> describe from_text;
foo                     int                     None
bar                     int                     None
boo                     string                  None

[server:10001] hive> select * from from_text;
1       2       Hello
2       3       World

[server:10001] hive> insert into table orc_test select * from from_text;
[Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

From the Hive Log:
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing
operators
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be
overridden by subclasses.
        at
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
        at
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
        at
com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
        at
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
        at
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
        at
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
        ... 8 more
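
Since a clean binary install reproduces the failure, whatever protobuf
runtime the tasks are loading is evidently not coming from the fresh
hive-0.12.0/lib. One way to inventory every protobuf runtime on both
installs (a sketch, assuming $HADOOP_HOME and $HIVE_HOME point at the
installations, and remembering the hand-built protobuf-java-2.4.1.jar
copied around earlier):

find $HADOOP_HOME $HIVE_HOME -name 'protobuf-java-*.jar'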





On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> Bryan
>
> In either case (source download or binary download) you do not need to
> compile the orc protobuf component. The Java source generated from the
> .proto files should already be available when you download the Hive 0.12
> release. I would recommend re-downloading the Hive 0.12 binary release from
> http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running
> hive directly. After extracting hive-0.12.0-bin.tar.gz, set
> HIVE_HOME to the extracted directory and run hive. Let me know if you face
> any issues.
>
> Thanks
> Prasanth Jayachandran
>
> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com>
> wrote:
>
> Prasanth,
>
> I simply compiled the protobuf library, and then compiled the orc protobuf
> component.  I did not recompile either Hive or custom UDFs/etc.
>
> Is a protobuf recompile the solution for this issue, or a dead end?  Has
> this been seen before?  I looked for more feedback, but most of the Orc
> issues were associated with Hive 0.11.0.
>
> I will try recompiling the 2.4 protobuf version shortly!
>
> Bryan
>
>
> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
> pjayachandran@hortonworks.com> wrote:
>
>> Also what are you doing with steps 2 through 5? Compiling hive or your
>> custom code?
>>
>> Thanks
>> Prasanth Jayachandran
>>
>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
>> wrote:
>>
>> Prasanth,
>>
>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not
>> compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code
>> site.  I compiled it via the following steps:
>> (1) ./configure && make (to compile the C code)
>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>> ../src/google/protobuf/orc.proto
>> (3) Compiled the org/apache/... directory via javac
>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>> (6) Restarted hive
>>
>> Same results before/after protobuf modification.
>>
>> Bryan
>>
>>
>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>> pjayachandran@hortonworks.com> wrote:
>>
>>> What version of protobuf are you using? Are you compiling hive from
>>> source?
>>>
>>> Thanks
>>> Prasanth Jayachandran
>>>
>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>>> wrote:
>>>
>>> Hello.
>>>
>>> Running the following version of Hadoop: hadoop-2.2.0
>>> Running the following version of Hive: hive-0.12.0
>>>
>>> I have a simple test system set up with two datanodes/node managers
>>> and one namenode/resource manager.  Hive runs on the namenode and uses
>>> a MySQL database for its metastore.
>>>
>>> I have created a small table 'from_text' as follows:
>>>
>>> [server:10001] hive> describe from_text;
>>> foo                     int                     None
>>> bar                     int                     None
>>> boo                     string                  None
>>>
>>>
>>> [server:10001] hive> select * from from_text;
>>> 1       2       Hello
>>> 2       3       World
>>>
>>> I then insert the data into my Orc table, 'orc_test':
>>>
>>> [server:10001] hive> describe orc_test;
>>> foo                     int                     from deserializer
>>> bar                     int                     from deserializer
>>> boo                     string                  from deserializer
>>>
>>>
>>> The job runs, but fails to complete with the following errors (see
>>> below).  This appears to be exactly the example covered here:
>>>
>>>
>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>
>>> I took a few minutes to recompile the protobuf library, as several
>>> other reports mentioned that Hive 0.12 did not have the protobuf
>>> library updated. That did not remedy the problem.  Any ideas?
>>>
>>>
>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>
>>>
>>> Diagnostic Messages for this Task:
>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>>> operators
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>         at
>>> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>>> be overridden by subclasses.
>>>         at
>>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at
>>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>         at
>>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>         at
>>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at
>>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Bryan

In either case (source download or binary download) you do not need to compile the orc protobuf component. The Java source generated from the .proto files should already be available when you download the Hive 0.12 release. I would recommend re-downloading the Hive 0.12 binary release from http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running hive directly. After extracting hive-0.12.0-bin.tar.gz, set HIVE_HOME to the extracted directory and run hive. Let me know if you face any issues.

Thanks
Prasanth Jayachandran
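
In concrete terms that is roughly the following (a sketch; the mirror URL
is the one above, and the extracted directory name is assumed to be
hive-0.12.0-bin):

wget http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz
tar -xzf hive-0.12.0-bin.tar.gz
export HIVE_HOME=$PWD/hive-0.12.0-bin
export PATH=$HIVE_HOME/bin:$PATH
hive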

On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
> 
> I simply compiled the protobuf library, and then compiled the orc protobuf component.  I did not recompile either Hive or custom UDFs/etc.  
> 
> Is a protobuf recompile the solution for this issue, or a dead end?  Has this been seen before?  I looked for more feedback, but most of the Orc issues were associated with Hive 0.11.0.
> 
> I will try recompiling the 2.4 protobuf version shortly!
> 
> Bryan
> 
> 
> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
> Also what are you doing with steps 2 through 5? Compiling hive or your custom code?
> 
> Thanks
> Prasanth Jayachandran
> 
> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com> wrote:
> 
>> Prasanth,
>> 
>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code site.  I compiled it via the following steps:
>> (1) ./configure && make (to compile the C code)
>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
>> (3) Compiled the org/apache/... directory via javac
>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>> (6) Restarted hive
>> 
>> Same results before/after protobuf modification.
>> 
>> Bryan
>> 
>> 
>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
>> What version of protobuf are you using? Are you compiling hive from source?
>> 
>> Thanks
>> Prasanth Jayachandran
>> 
>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:
>> 
>>> Hello.
>>> 
>>> Running the following version of Hadoop: hadoop-2.2.0
>>> Running the following version of Hive: hive-0.12.0
>>> 
>>> I have a simple test system set up with two datanodes/node managers and one namenode/resource manager.  Hive runs on the namenode and uses a MySQL database for its metastore.
>>> 
>>> I have created a small table 'from_text' as follows:
>>> 
>>> [server:10001] hive> describe from_text;
>>> foo                     int                     None
>>> bar                     int                     None
>>> boo                     string                  None
>>> 
>>> 
>>> [server:10001] hive> select * from from_text;
>>> 1       2       Hello
>>> 2       3       World
>>> 
>>> I then insert the data into my Orc table, 'orc_test':
>>> 
>>> [server:10001] hive> describe orc_test;
>>> foo                     int                     from deserializer
>>> bar                     int                     from deserializer
>>> boo                     string                  from deserializer
>>> 
>>> 
>>> The job runs, but fails to complete with the following errors (see below).  This appears to be exactly the example covered here:
>>> 
>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>> 
>>> I took a few minutes to recompile the protobuf library, as several other reports mentioned that Hive 0.12 did not have the protobuf library updated. That did not remedy the problem.  Any ideas?
>>> 
>>> 
>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>> 
>>> 
>>> Diagnostic Messages for this Task:
>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

I simply compiled the protobuf library, and then compiled the orc protobuf
component.  I did not recompile either Hive or custom UDFs/etc.

Is a protobuf recompile the solution for this issue, or a dead end?  Has
this been seen before?  I looked for more feedback, but most of the Orc
issues were associated with Hive 0.11.0.

I will try recompiling the 2.4 protobuf version shortly!

Bryan
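
A note on the recompile question: the generated classes and the runtime
jar have to come from the same protobuf release line, so rebuilding the
runtime by itself is a dead end whenever the OrcProto classes were
generated against a different line; whichever pairing disagrees on the
task classpath will throw the getUnknownFields exception above. Two hedged
checks (assuming protoc is on the PATH and $HIVE_HOME points at the
install):

protoc --version                      # the protoc release used for any regeneration
ls $HIVE_HOME/lib | grep -i protobuf  # the runtime jar(s) Hive ships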


On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> Also what are you doing with steps 2 through 5? Compiling hive or your
> custom code?
>
> Thanks
> Prasanth Jayachandran
>
> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com>
> wrote:
>
> Prasanth,
>
> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not
> compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code
> site.  I compiled it via the following steps:
> (1) ./configure && make (to compile the C code)
> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
> ../src/google/protobuf/orc.proto
> (3) Compiled the org/apache/... directory via javac
> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
> (6) Restarted hive
>
> Same results before/after protobuf modification.
>
> Bryan
>
>
> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
> pjayachandran@hortonworks.com> wrote:
>
>> What version of protobuf are you using? Are you compiling hive from
>> source?
>>
>> Thanks
>> Prasanth Jayachandran
>>
>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
>> wrote:
>>
>> Hello.
>>
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>>
>> I have a simple test system set up with two datanodes/node managers
>> and one namenode/resource manager.  Hive runs on the namenode and uses
>> a MySQL database for its metastore.
>>
>> I have created a small table 'from_text' as follows:
>>
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>>
>>
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>>
>> I then insert the data into my Orc table, 'orc_test':
>>
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>>
>>
>> The job runs, but fails to complete with the following errors (see
>> below).  This appears to be exactly the example covered here:
>>
>>
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>
>> I took a few minutes to recompile the protobuf library, as several
>> other reports mentioned that Hive 0.12 did not have the protobuf
>> library updated. That did not remedy the problem.  Any ideas?
>>
>>
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>
>>
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing
>> operators
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to
>> be overridden by subclasses.
>>         at
>> com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at
>> com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at
>> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at
>> org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at
>> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at
>> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Also what are you doing with steps 2 through 5? Compiling hive or your custom code?

Thanks
Prasanth Jayachandran

On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <br...@gmail.com> wrote:

> Prasanth,
> 
> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code site.  I compiled it via the following steps:
> (1) ./configure && make (to compile the C code)
> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
> (3) Compiled the org/apache/... directory via javac
> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
> (6) Restarted hive
> 
> Same results before/after protobuf modification.
> 
> Bryan
> 
> 
> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pj...@hortonworks.com> wrote:
> What version of protobuf are you using? Are you compiling hive from source?
> 
> Thanks
> Prasanth Jayachandran
> 
> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:
> 
>> Hello.
>> 
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>> 
>> I have a simple test system set up with two datanodes/node managers and one namenode/resource manager.  Hive runs on the namenode and uses a MySQL database for its metastore.
>> 
>> I have created a small table 'from_text' as follows:
>> 
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>> 
>> 
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>> 
>> I then insert the data into my Orc table, 'orc_test':
>> 
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>> 
>> 
>> The job runs, but fails to complete with the following errors (see below).  This appears to be exactly the example covered here:
>> 
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>> 
>> I took a few minutes to recompile the protobuf library, as several other reports mentioned that Hive 0.12 did not have the protobuf library updated. That did not remedy the problem.  Any ideas?
>> 
>> 
>> [server:10001] hive> insert into table orc_test select * from from_text;
>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> 
>> Diagnostic Messages for this Task:
>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>         ... 8 more

Re: Hive - Issue Converting Text to Orc

Posted by Bryan Jeffrey <br...@gmail.com>.
Prasanth,

I am running Hive 0.12.0, downloaded from the Apache Hive site; I did not
compile it.  I downloaded protobuf 2.5.0 earlier today from the Google Code
site and compiled it via the following steps:
(1) ./configure && make (to compile the C code)
(2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto ../src/google/protobuf/orc.proto
(3) Compiled the org/apache/... directory via javac
(4) Created the jar via jar -cf protobuf-java-2.4.1.jar org
(5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
(6) Restarted Hive

I see the same error before and after the protobuf modification.
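For reference, the "This is supposed to be overridden by subclasses" failure
is the usual symptom of generated protobuf classes and the protobuf-java
runtime coming from different versions (for example, OrcProto classes
generated with protoc 2.4.x running against the 2.5.0 runtime that Hadoop 2.2
ships).  A minimal check along these lines (ProtobufVersionCheck is a
hypothetical helper, not part of Hive, and assumes hive-exec and the protobuf
jar are on the classpath) shows which copies the JVM actually resolves:

    import java.security.CodeSource;

    // Hypothetical diagnostic (not part of Hive): print where the protobuf
    // runtime and Hive's generated ORC protobuf classes are loaded from.
    // Generated classes and a runtime built from different protobuf versions
    // are a common cause of the UnsupportedOperationException quoted above.
    public class ProtobufVersionCheck {
        public static void main(String[] args) throws Exception {
            print("com.google.protobuf.GeneratedMessage");      // protobuf runtime
            print("org.apache.hadoop.hive.ql.io.orc.OrcProto"); // generated classes
        }

        private static void print(String className) throws Exception {
            CodeSource src = Class.forName(className)
                    .getProtectionDomain().getCodeSource();
            System.out.println(className + " loaded from "
                    + (src == null ? "<bootstrap/unknown>" : src.getLocation()));
        }
    }

If the two print different jars (say, Hadoop's protobuf-java-2.5.0.jar and a
hive-exec jar generated against 2.4.1), that mismatch is worth ruling out
before anything else.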

Bryan


On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <pjayachandran@hortonworks.com> wrote:

> What version of protobuf are you using? Are you compiling hive from source?
>
> Thanks
> Prasanth Jayachandran
>
> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com>
> wrote:
>
> [original message and stack trace snipped; see the first message in this thread]

Re: Hive - Issue Converting Text to Orc

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
What version of protobuf are you using? Are you compiling hive from source?
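
One quick way to check (a hypothetical snippet, not from Hive, assuming the
protobuf jar on your classpath records an Implementation-Version in its
manifest) is to ask the loaded runtime class directly:

    // Hypothetical probe: prints the protobuf-java version recorded in the
    // jar manifest (prints "unknown" if the manifest omits it).
    public class ProtobufRuntimeVersion {
        public static void main(String[] args) {
            Package pb = com.google.protobuf.GeneratedMessage.class.getPackage();
            System.out.println("protobuf-java: "
                    + (pb == null ? "unknown" : pb.getImplementationVersion()));
        }
    }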

Thanks
Prasanth Jayachandran

On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <br...@gmail.com> wrote:

> [original message and stack trace snipped; see the first message in this thread]

