Posted to user@hadoop.apache.org by Narlin M <hp...@gmail.com> on 2013/08/30 22:04:14 UTC

InvalidProtocolBufferException while submitting crunch job to cluster

I am getting the following exception while trying to submit a crunch pipeline
job to a remote hadoop cluster:

Exception in thread "main" java.lang.RuntimeException: Cannot create job output directory /tmp/crunch-324987940
at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)
at org.apache.crunch.impl.mr.MRPipeline.<init>(MRPipeline.java:125)
at test.CrunchTest.setup(CrunchTest.java:98)
at test.CrunchTest.main(CrunchTest.java:367)
Caused by: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "NARLIN/127.0.0.1"; destination host is: "<server_address>":50070;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1164)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:425)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1943)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:523)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1799)
at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:342)
... 3 more
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
0    [Thread-3] WARN  org.apache.hadoop.util.ShutdownHookManager - ShutdownHook 'ClientFinalizer' failed, java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:135)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:672)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2308)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2324)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

A Google search on this error yielded solutions that asked me to confirm that
the /etc/hosts file contains an entry for NARLIN, which it does in my case.

Here's the code that I am using to set up the MRPipeline:

Configuration conf = HBaseConfiguration.create();

conf.set("fs.defaultFS", "hdfs://<server_address>:50070");
conf.set("mapred.job.tracker", "<server_address>:50030");

System.out.println("Hadoop configuration created.");
System.out.println("Initializing crunch pipeline ...");

conf.set("mapred.jar", "<path_to_jar_file>");

pipeline = new MRPipeline(getClass(), "crunchjobtest", conf);

Has anyone faced this issue before and knows how to resolve it, or can point
out if I am missing anything?

Thanks for the help.
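
For context when reading the replies below: the root "Protocol message
end-group tag did not match expected tag" error generally means the client is
speaking Hadoop's protobuf RPC protocol to a port that is not an RPC endpoint.
Ports 50070 and 50030 are the NameNode and JobTracker web UI (HTTP) ports, so
the HTTP reply cannot be parsed as a protobuf message. As the follow-ups work
out, the configuration should target the RPC ports instead; a minimal sketch,
assuming the common defaults of 8020 (NameNode RPC) and 8021 (JobTracker RPC):

Configuration conf = HBaseConfiguration.create();

// RPC endpoints; 50070/50030 are only the HTTP web UIs
conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
conf.set("mapred.job.tracker", "<server_address>:8021");

The trailing NoSuchMethodError on LinkedListMultimap.values() is a separate
symptom; it usually points to two Guava versions on the client classpath, with
an older one shadowing the version the Hadoop client libraries were compiled
against.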

Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Shekhar Sharma <sh...@gmail.com>.
Can you please check whether you are able to access HDFS using the Java
API, and also whether you are able to run an MR job?
Regards,
Som Shekhar Sharma
+91-8197243810
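
A minimal sketch of such a check using only the FileSystem API (the 8020 RPC
port and the /tmp path are assumptions; adjust them for the actual cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
        // FileSystem.get opens the same protobuf RPC channel to the
        // NameNode that MRPipeline later uses for mkdirs, so a plain
        // directory listing exercises the failing path in isolation.
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/tmp"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}

If this listing fails with the same InvalidProtocolBufferException, the
problem lies in the client-side configuration rather than in the Crunch
pipeline itself.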


On Sat, Aug 31, 2013 at 7:08 PM, Narlin M <hp...@gmail.com> wrote:
> The <server_address> that was mentioned in my original post is not
> pointing to bdatadev. I should have mentioned this in my original post,
> sorry I missed that.
>
> On 8/31/13 8:32 AM, "Narlin M" <hp...@gmail.com> wrote:
>
>>I would, but bdatadev is not one of my servers; it seems like a random
>>host name. I can't figure out how or where this name got generated. That's
>>what's puzzling me.
>>
>>On 8/31/13 5:43 AM, "Shekhar Sharma" <sh...@gmail.com> wrote:
>>
>>>: java.net.UnknownHostException: bdatadev
>>>
>>>
>>>edit your /etc/hosts file
>>>Regards,
>>>Som Shekhar Sharma
>>>+91-8197243810
>>>
>>>
>>>On Sat, Aug 31, 2013 at 2:05 AM, Narlin M <hp...@gmail.com> wrote:
>>>> Looks like I was pointing to incorrect ports. After correcting the port
>>>> numbers,
>>>>
>>>> conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
>>>> conf.set("mapred.job.tracker", "<server_address>:8021");
>>>>
>>>> I am now getting the following exception:
>>>>
>>>> 2880 [Thread-15] INFO  org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob -
>>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
>>>> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
>>>> at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
>>>> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
>>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
>>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
>>>> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
>>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
>>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
>>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
>>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
>>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
>>>> at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
>>>> at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
>>>> at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
>>>> at java.lang.Thread.run(Thread.java:680)
>>>> Caused by: java.net.UnknownHostException: bdatadev
>>>> ... 27 more
>>>>
>>>> However, nowhere in my code is a host named "bdatadev" mentioned, and I
>>>> cannot ping this host.
>>>>
>>>> Thanks for the help.
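
On the quoted UnknownHostException: the trace shows the name surfacing in
JobSubmissionFiles.getStagingDir, i.e. while resolving a path handed back by
the cluster during job submission, so "bdatadev" is most likely the hostname
the cluster itself has configured in its fs.defaultFS rather than anything in
the client code. The /etc/hosts edit suggested earlier would map that name to
the cluster node's real address; the IP below is a placeholder, since the
thread never identifies which machine answers to bdatadev:

# client-side /etc/hosts entry (placeholder IP)
10.20.30.40    bdatadev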

Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
The <server_address> that was mentioned in my original post is not
pointing to bdatadev. I should have mentioned this in my original post,
sorry I missed that.

On 8/31/13 8:32 AM, "Narlin M" <hp...@gmail.com> wrote:

>I would, but bdatadev is not one of my servers; it seems like a random
>host name. I can't figure out how or where this name got generated. That's
>what's puzzling me.
>
>On 8/31/13 5:43 AM, "Shekhar Sharma" <sh...@gmail.com> wrote:
>
>>: java.net.UnknownHostException: bdatadev
>>
>>
>>edit your /etc/hosts file
>>Regards,
>>Som Shekhar Sharma
>>+91-8197243810
Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
The <server_address> that was mentioned in my original post is not
pointing to bdatadev. I should have mentioned this in my original post,
sorry I missed that.

On 8/31/13 8:32 AM, "Narlin M" <hp...@gmail.com> wrote:

>I would, but bdatadev is not one of my servers, it seems like a random
>host name. I can't figure out how or where this name got generated. That's
>what puzzling me.
>
>On 8/31/13 5:43 AM, "Shekhar Sharma" <sh...@gmail.com> wrote:
>
>>: java.net.UnknownHostException: bdatadev
>>
>>
>>edit your /etc/hosts file
>>Regards,
>>Som Shekhar Sharma
>>+91-8197243810
>>
>>
>>On Sat, Aug 31, 2013 at 2:05 AM, Narlin M <hp...@gmail.com> wrote:
>>> Looks like I was pointing to incorrect ports. After correcting the port
>>> numbers,
>>>
>>> conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
>>> conf.set("mapred.job.tracker", "<server_address>:8021");
>>>
>>> I am now getting the following exception:
>>>
>>> 2880 [Thread-15] INFO
>>> org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob  -
>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
>>> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
>>> at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
>>> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
>>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
>>> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
>>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
>>> at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
>>> at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
>>> at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
>>> at java.lang.Thread.run(Thread.java:680)
>>> Caused by: java.net.UnknownHostException: bdatadev
>>> ... 27 more
>>>
>>> However, nowhere in my code is a host named "bdatadev" mentioned, and I
>>> cannot ping this host.
>>>
>>> Thanks for the help.
>>>
>>>
>>> On Fri, Aug 30, 2013 at 3:04 PM, Narlin M <hp...@gmail.com> wrote:
>>>>
>>>> I am getting the following exception while trying to submit a crunch
>>>> pipeline job to a remote hadoop cluster:
>>>>
>>>> Exception in thread "main" java.lang.RuntimeException: Cannot create job
>>>> output directory /tmp/crunch-324987940
>>>> at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)
>>>> at org.apache.crunch.impl.mr.MRPipeline.<init>(MRPipeline.java:125)
>>>> at test.CrunchTest.setup(CrunchTest.java:98)
>>>> at test.CrunchTest.main(CrunchTest.java:367)
>>>> Caused by: java.io.IOException: Failed on local exception:
>>>> com.google.protobuf.InvalidProtocolBufferException: Protocol message
>>>> end-group tag did not match expected tag.; Host Details : local host is:
>>>> "NARLIN/127.0.0.1"; destination host is: "<server_address>":50070;
>>>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
>>>> at org.apache.hadoop.ipc.Client.call(Client.java:1164)
>>>> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>>> at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>>> at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
>>>> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:425)
>>>> at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1943)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:523)
>>>> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1799)
>>>> at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:342)
>>>> ... 3 more
>>>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
>>>> message end-group tag did not match expected tag.
>>>> at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
>>>> at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
>>>> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
>>>> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
>>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
>>>> at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
>>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
>>>> at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
>>>> at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
>>>> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
>>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
>>>> 0    [Thread-3] WARN  org.apache.hadoop.util.ShutdownHookManager  -
>>>> ShutdownHook 'ClientFinalizer' failed, java.lang.NoSuchMethodError:
>>>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>>> java.lang.NoSuchMethodError:
>>>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>>> at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:135)
>>>> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:672)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
>>>> at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2308)
>>>> at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2324)
>>>> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>>>
>>>> A Google search on this error yielded solutions asking to confirm that
>>>> the /etc/hosts file contains an entry for NARLIN, which it does in my
>>>> case.
>>>>
>>>> Here's the code I am using to set up the MRPipeline:
>>>>
>>>> Configuration conf = HBaseConfiguration.create();
>>>>
>>>> conf.set("fs.defaultFS", "hdfs://<server_address>:50070");
>>>> conf.set("mapred.job.tracker", "<server_address>:50030");
>>>>
>>>> System.out.println("Hadoop configuration created.");
>>>> System.out.println("Initializing crunch pipeline ...");
>>>>
>>>> conf.set("mapred.jar", "<path_to_jar_file>");
>>>>
>>>> pipeline = new MRPipeline(getClass(), "crunchjobtest", conf);
>>>>
>>>> Has anyone faced this issue before and knows how to resolve it, or can
>>>> point out what I am missing?
>>>>
>>>> Thanks for the help.
>>>
>>>
>
>
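
A side note on the ShutdownHook failure quoted above: a NoSuchMethodError on
com.google.common.collect.LinkedListMultimap.values() usually means the code
was compiled against a newer Guava than the one that wins on the runtime
classpath. A minimal diagnostic sketch (the class name is assumed, not from
the original thread) that prints which jar the class is actually loaded from:

import com.google.common.collect.LinkedListMultimap;

public class GuavaVersionCheck {
    public static void main(String[] args) {
        // Print the jar LinkedListMultimap is loaded from; an older Guava
        // (or google-collections) jar here, whose values() does not yet
        // return java.util.List, would explain the missing method.
        System.out.println(LinkedListMultimap.class
                .getProtectionDomain().getCodeSource().getLocation());
    }
}

Running it with the same classpath as the job client points at the
offending jar.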



Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
I would, but bdatadev is not one of my servers; it seems like a random
host name. I can't figure out how or where this name got generated. That's
what's puzzling me.

On 8/31/13 5:43 AM, "Shekhar Sharma" <sh...@gmail.com> wrote:

>: java.net.UnknownHostException: bdatadev
>
>
>edit your /etc/hosts file
>Regards,
>Som Shekhar Sharma
>+91-8197243810
>
>
>On Sat, Aug 31, 2013 at 2:05 AM, Narlin M <hp...@gmail.com> wrote:
>> Looks like I was pointing to incorrect ports. After correcting the port
>> numbers,
>>
>> conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
>> conf.set("mapred.job.tracker", "<server_address>:8021");
>>
>> I am now getting the following exception:
>>
>> 2880 [Thread-15] INFO
>> org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob  -
>> java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
>> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
>> at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
>> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
>> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
>> at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
>> at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
>> at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
>> at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
>> at java.lang.Thread.run(Thread.java:680)
>> Caused by: java.net.UnknownHostException: bdatadev
>> ... 27 more
>>
>> However, nowhere in my code is a host named "bdatadev" mentioned, and I
>> cannot ping this host.
>>
>> Thanks for the help.
>>
>>
>> On Fri, Aug 30, 2013 at 3:04 PM, Narlin M <hp...@gmail.com> wrote:
>>>
>>> I am getting the following exception while trying to submit a crunch
>>> pipeline job to a remote hadoop cluster:
>>>
>>> Exception in thread "main" java.lang.RuntimeException: Cannot create job
>>> output directory /tmp/crunch-324987940
>>> at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)
>>> at org.apache.crunch.impl.mr.MRPipeline.<init>(MRPipeline.java:125)
>>> at test.CrunchTest.setup(CrunchTest.java:98)
>>> at test.CrunchTest.main(CrunchTest.java:367)
>>> Caused by: java.io.IOException: Failed on local exception:
>>> com.google.protobuf.InvalidProtocolBufferException: Protocol message
>>> end-group tag did not match expected tag.; Host Details : local host is:
>>> "NARLIN/127.0.0.1"; destination host is: "<server_address>":50070;
>>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1164)
>>> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>> at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>> at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
>>> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:425)
>>> at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1943)
>>> at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:523)
>>> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1799)
>>> at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:342)
>>> ... 3 more
>>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
>>> message end-group tag did not match expected tag.
>>> at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
>>> at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
>>> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
>>> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
>>> at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
>>> at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
>>> at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
>>> at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
>>> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
>>> 0    [Thread-3] WARN  org.apache.hadoop.util.ShutdownHookManager  -
>>> ShutdownHook 'ClientFinalizer' failed, java.lang.NoSuchMethodError:
>>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>> java.lang.NoSuchMethodError:
>>> com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
>>> at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:135)
>>> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:672)
>>> at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
>>> at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2308)
>>> at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2324)
>>> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>>
>>> A Google search on this error yielded solutions asking to confirm that
>>> the /etc/hosts file contains an entry for NARLIN, which it does in my
>>> case.
>>>
>>> Here's the code I am using to set up the MRPipeline:
>>>
>>> Configuration conf = HBaseConfiguration.create();
>>>
>>> conf.set("fs.defaultFS", "hdfs://<server_address>:50070");
>>> conf.set("mapred.job.tracker", "<server_address>:50030");
>>>
>>> System.out.println("Hadoop configuration created.");
>>> System.out.println("Initializing crunch pipeline ...");
>>>
>>> conf.set("mapred.jar", "<path_to_jar_file>");
>>>
>>> pipeline = new MRPipeline(getClass(), "crunchjobtest", conf);
>>>
>>> Has anyone faced this issue before and knows how to resolve it, or can
>>> point out what I am missing?
>>>
>>> Thanks for the help.
>>
>>
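
For reference, the port correction quoted above moves the client from the
web UI ports (50070 for the NameNode, 50030 for the JobTracker) to the RPC
ports that the protobuf-based IPC actually speaks; hitting the HTTP port
with the RPC client is what produced the original "end-group tag did not
match expected tag" error, since an HTTP response cannot be parsed as a
protobuf RPC response. A minimal sketch of the client-side setup under that
assumption; 8020/8021 are common MR1-era defaults but may differ per
cluster, and <server_address> and <path_to_jar_file> are the thread's own
placeholders:

import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CrunchClientSetup {
    public static Pipeline createPipeline() {
        Configuration conf = HBaseConfiguration.create();
        // RPC endpoints, not the HTTP web-UI ports:
        conf.set("fs.defaultFS", "hdfs://<server_address>:8020");  // NameNode RPC
        conf.set("mapred.job.tracker", "<server_address>:8021");   // JobTracker RPC
        // Ship the job jar so the cluster can load the pipeline classes.
        conf.set("mapred.jar", "<path_to_jar_file>");
        return new MRPipeline(CrunchClientSetup.class, "crunchjobtest", conf);
    }
}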



Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Shekhar Sharma <sh...@gmail.com>.
: java.net.UnknownHostException: bdatadev


Edit your /etc/hosts file so that this hostname resolves.
Regards,
Som Shekhar Sharma
+91-8197243810
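
If "bdatadev" were a real host that simply failed to resolve, the fix would be a hosts entry along the lines of the sketch below, where the address is a placeholder for the machine's actual IP. As Harsh J explains later in this thread, though, "bdatadev" here is an HDFS HA logical name rather than a physical host, so the HA client configs are the more likely fix.

192.0.2.10    bdatadev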



Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
It looks like I was pointing at the wrong ports. After correcting the port
numbers,

conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
conf.set("mapred.job.tracker", "<server_address>:8021");

I am now getting the following exception:

2880 [Thread-15] INFO  org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob  -
java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
at
org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.net.UnknownHostException: bdatadev
... 27 more

However, nowhere in my code is a host named "bdatadev" mentioned, and I
cannot ping this host.

Thanks for the help.
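
The port change matters because, in a typical Hadoop 1.x / CDH4 layout, 50070 and 50030 are the NameNode and JobTracker HTTP web UI ports, while 8020 and 8021 are the usual RPC ports; pointing an RPC client at an HTTP port is what produces the earlier "Protocol message end-group tag did not match expected tag" error, since the client tries to parse an HTTP response as a protobuf RPC response. A minimal sketch for verifying the RPC endpoint from the client machine, assuming a placeholder NameNode host name, is:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Use the RPC port (often 8020), not the 50070 web UI port.
    // "namenode.example.com" is a placeholder for the real NameNode host.
    conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
    FileSystem fs = FileSystem.get(conf);
    // listStatus issues a real RPC, so a wrong host or port fails here
    // instead of later inside the Crunch pipeline.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}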



Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
Thank you Shekhar, Harsh. I will try to implement your suggestions. I
appreciate the help.


On Sat, Aug 31, 2013 at 11:12 AM, Harsh J <ha...@cloudera.com> wrote:

> Your cluster is using HDFS HA, and therefore requires a few more
> configs than just fs.defaultFS, etc.
>
> You need to use the right set of cluster client configs. If you don't
> have them at /etc/hadoop/conf and /etc/hbase/conf on your cluster edge
> node to pull from, try asking your cluster administrator for a
> configuration set, and place their parent directories on your
> application's classpath.
>
> The first error suggests you may also be including a guava
> dependency in your project that is different from the one
> transitively pulled in by hadoop-client via crunch. You should be able
> to use the guava libs without an explicit dependency, and you would
> then get the right version.
>
> The second error is your MR submission failing, because the JT
> is using a staging directory on an HA HDFS, which uses a "logical"
> name of "bdatadev". A logical HA name needs additional configs (typically
> in hdfs-site.xml) that tell the client which actual physical NNs
> sit under it - configs that you're missing here.
>
> --
> Harsh J
>
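
On Harsh's first point, one quick way to see which guava jar actually wins on the classpath is to ask the JVM where it loaded the offending class from; a minimal sketch, not specific to this cluster:

public class GuavaProbe {
  public static void main(String[] args) {
    // Prints the jar that supplied LinkedListMultimap at runtime; if it is
    // an older guava than the one hadoop-client expects, the
    // NoSuchMethodError seen in the shutdown hook follows.
    System.out.println(com.google.common.collect.LinkedListMultimap.class
        .getProtectionDomain().getCodeSource().getLocation());
  }
}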

Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
Thank you Shekhar, Harsh. I will follow and try to implement your
suggestions. I appreciate the help.


On Sat, Aug 31, 2013 at 11:12 AM, Harsh J <ha...@cloudera.com> wrote:

> Your cluster is using HDFS HA, and therefore requires a little more
> configs than just fs.defaultFS/etc..
>
> You need to use the right set of cluster client configs. If you don't
> have them at /etc/hadoop/conf and /etc/hbase/conf on your cluster edge
> node to pull from, try asking your cluster administrator for a
> configuration set, and place their parent directories on your
> application's classpath.
>
> The first error deals with perhaps you also including a guava
> dependency in your project, which is different than the one
> transitively pulled in by hadoop-client via crunch. You should be able
> to use guava libs without needing an explicit dependency, and it would
> be the right needed version.
>
> The second error deals with your MR submission failing, cause the JT
> is using a staging directory over a HDFS HA, which uses a "logical"
> name of "bdatadev". A logical HA name needs other configs (typically
> in the hdfs-site.xml) that tell it which are the actual physical NNs
> under it - configs that you're missing here.
>
> On Sat, Aug 31, 2013 at 1:34 AM, Narlin M <hp...@gmail.com> wrote:
> > I am getting following exception while trying to submit a crunch pipeline
> > job to a remote hadoop cluster:
> >
> > Exception in thread "main" java.lang.RuntimeException: Cannot create job
> > output directory /tmp/crunch-324987940
> > at
> >
> org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)
> > at org.apache.crunch.impl.mr.MRPipeline.<init>(MRPipeline.java:125)
> > at test.CrunchTest.setup(CrunchTest.java:98)
> > at test.CrunchTest.main(CrunchTest.java:367)
> > Caused by: java.io.IOException: Failed on local exception:
> > com.google.protobuf.InvalidProtocolBufferException: Protocol message
> > end-group tag did not match expected tag.; Host Details : local host is:
> > "NARLIN/127.0.0.1"; destination host is: "<server_address>":50070;
> > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1164)
> > at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> > at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> > at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> > at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
> > at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:425)
> > at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1943)
> > at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:523)
> > at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1799)
> > at
> >
> org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:342)
> > ... 3 more
> > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
> > message end-group tag did not match expected tag.
> > at
> >
> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
> > at
> >
> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> > at
> >
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
> > at
> >
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
> > at
> >
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
> > at
> >
> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
> > at
> >
> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
> > at
> >
> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
> > at
> >
> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
> > at
> >
> org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
> > at
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
> > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
> > 0    [Thread-3] WARN  org.apache.hadoop.util.ShutdownHookManager  -
> > ShutdownHook 'ClientFinalizer' failed, java.lang.NoSuchMethodError:
> > com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
> > java.lang.NoSuchMethodError:
> > com.google.common.collect.LinkedListMultimap.values()Ljava/util/List;
> > at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:135)
> > at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:672)
> > at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
> > at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2308)
> > at
> >
> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2324)
> > at
> >
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> >
> > Google search on this error yielded solutions that asked to confirm that
> > /etc/hosts file contained the entry for NARLIN which it does in my case.
> >
> > Here's the code that I am using to set up the MRPipeline:
> >
> > Configuration conf = HBaseConfiguration.create();
> >
> > conf.set("fs.defaultFS", "hdfs://<server_address>:50070");
> > conf.set("mapred.job.tracker", "<server_address>:50030");
> >
> > System.out.println("Hadoop configuration created.");
> > System.out.println("Initializing crunch pipeline ...");
> >
> > conf.set("mapred.jar", "<path_to_jar_file>");
> >
> > pipeline = new MRPipeline(getClass(), "crunchjobtest", conf);
> >
> > Has anyone faced this issue before and knows how to resolve it/point out
> if
> > I am missing anything?
> >
> > Thanks for the help.
>
>
>
> --
> Harsh J
>

Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
Thank you Shekhar, Harsh. I will follow and try to implement your
suggestions. I appreciate the help.


Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Harsh J <ha...@cloudera.com>.
Your cluster is using HDFS HA, and therefore requires a few more
configs than just fs.defaultFS, etc.

You need to use the right set of cluster client configs. If you don't
have them at /etc/hadoop/conf and /etc/hbase/conf on your cluster edge
node to pull from, try asking your cluster administrator for a
configuration set, and place their parent directories on your
application's classpath.
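
If adjusting the classpath is awkward, the same files can also be
loaded explicitly. A minimal sketch, reusing the conf object from your
snippet and assuming the client configs were copied to
/path/to/cluster-conf (a placeholder):

// Load the cluster's client-side configs instead of hand-setting
// individual keys; these files also carry the HA and staging-dir
// settings the submission needs.
conf.addResource(new Path("/path/to/cluster-conf/core-site.xml"));
conf.addResource(new Path("/path/to/cluster-conf/hdfs-site.xml"));
conf.addResource(new Path("/path/to/cluster-conf/mapred-site.xml"));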

The first error suggests you are also including a Guava dependency in
your project that differs from the one transitively pulled in by
hadoop-client via Crunch. You should be able to use the Guava
libraries without an explicit dependency, and you would then get the
correct version.
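
To check which Guava jar is actually winning at runtime, a quick
diagnostic sketch:

import com.google.common.collect.LinkedListMultimap;

// Prints the jar LinkedListMultimap was loaded from; if it is not the
// version hadoop-client expects, the NoSuchMethodError above follows.
System.out.println(LinkedListMultimap.class
    .getProtectionDomain().getCodeSource().getLocation());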

The second error deals with your MR submission failing, because the JT
is using a staging directory on an HA HDFS, which uses a "logical"
name of "bdatadev". A logical HA name needs additional configs
(typically in hdfs-site.xml) that tell the client which physical NNs
sit under it - configs that you're missing here.
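
For reference, the extra client-side keys look roughly like this - a
minimal sketch, where "bdatadev" comes from your error and "nn1",
"nn2" and the <nn*_host> values are placeholders to be copied from the
cluster's hdfs-site.xml:

// The logical nameservice replaces host:port in fs.defaultFS.
conf.set("fs.defaultFS", "hdfs://bdatadev");
conf.set("dfs.nameservices", "bdatadev");
// Placeholder NN ids and hosts - use the cluster's real values.
conf.set("dfs.ha.namenodes.bdatadev", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.bdatadev.nn1", "<nn1_host>:8020");
conf.set("dfs.namenode.rpc-address.bdatadev.nn2", "<nn2_host>:8020");
conf.set("dfs.client.failover.proxy.provider.bdatadev",
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");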


-- 
Harsh J

Re: InvalidProtocolBufferException while submitting crunch job to cluster

Posted by Narlin M <hp...@gmail.com>.
Looks like I was pointing to the wrong ports - 50070 and 50030 are the
NameNode and JobTracker web UI ports, while 8020 and 8021 are the RPC
ports. After correcting the port numbers,

conf.set("fs.defaultFS", "hdfs://<server_address>:8020");
conf.set("mapred.job.tracker", "<server_address>:8021");

I am now getting the following exception:

2880 [Thread-15] INFO
 org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob  -
java.lang.IllegalArgumentException: java.net.UnknownHostException: bdatadev
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:305)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:180)
at
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:209)
at
org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:100)
at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:51)
at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:75)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.net.UnknownHostException: bdatadev
... 27 more

However, a host named "bdatadev" is mentioned nowhere in my code, and
I cannot ping this host.

Thanks for the help.


