Posted to user@spark.apache.org by suman bharadwaj <su...@gmail.com> on 2014/01/20 22:05:11 UTC

SPARK protocol buffer issue. Need Help

Hi,

I'm new to Spark, and I was trying to read a file residing in HDFS and
perform some basic actions on that dataset. Here is the code I used:

import org.apache.spark.SparkContext  // needed for the constructor below

object Hbase {
  def main(args: Array[String]) {
    val sc = new SparkContext("spark://<servername>:<portno>", "<somename>")
    val input = sc.textFile("hdfs://<servername>/user/cool/inputWrite.txt")
    input.count()
  }
}

Also, here is the content of the .sbt file:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.9.3"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "0.8.0-incubating",
  "org.apache.hadoop" % "hadoop-client" % "2.0.4-alpha",
  "com.google.protobuf" % "protobuf-java" % "2.4.1" force()
)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

When I do "sbt run", I see the error below. Can someone help me resolve
this issue?

java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: ; destination host is: ;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
        at org.apache.hadoop.ipc.Client.call(Client.java:1239)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at $Proxy12.getFileInfo(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at $Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:630)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1559)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:811)
        at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1649)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1595)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:207)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:70)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
        at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:26)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:199)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:772)
        at org.apache.spark.rdd.RDD.count(RDD.scala:677)
        at Hbase$.main(Hbase.scala:7)
        at Hbase.main(Hbase.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status
        at com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:81)
        at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.buildParsed(RpcPayloadHeaderProtos.java:1094)
        at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.access$1300(RpcPayloadHeaderProtos.java:1028)
        at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:986)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:946)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)
[trace] Stack trace suppressed: run last compile:run for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
        at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 3 s, completed Jan 20, 2014 12:57:32 PM


Regards,
SB

Re: SPARK protocol buffer issue. Need Help

Posted by Suman Subash <ss...@gmail.com>.
Hi Sean,

Thanks. You are right. The lib_managed folder under SPARK_HOME has a
different protocol buffer jar version than "/usr/lib/hadoop/lib": in the
Hadoop lib I have version 2.4.0a, and in lib_managed I have version 2.4.1,
which, as you said, is conflicting.
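
For anyone checking the same thing, a throwaway sketch like the one below
can list the protobuf jars in both places. The two directory paths are
just the ones from this setup (your sbt/Spark layout may nest jars
differently), and the Option wrapper covers listFiles returning null for
a missing directory:

import java.io.File

object FindProtobufJars {
  def main(args: Array[String]) {
    // Paths assumed from this thread; adjust for your layout.
    val dirs = List("/usr/lib/hadoop/lib",
                    sys.env.getOrElse("SPARK_HOME", ".") + "/lib_managed/jars")
    for (d <- dirs;
         f <- Option(new File(d).listFiles).getOrElse(Array[File]())
         if f.getName.startsWith("protobuf-java"))
      println(d + " -> " + f.getName)
  }
}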

I'm really new to Spark and Scala as well. I did the following:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "0.8.0-incubating",
  "org.apache.hadoop" % "hadoop-client" % "2.0.4-alpha",
  "com.google.protobuf" % "protobuf-java" % "2.4.0a" force()
)

But this doesn't seem to work; I get the same error. I really don't know
how to force Spark to use 2.4.0a. Any ideas?
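
Is something like the sketch below the right direction? dependencyOverrides
and exclude are standard sbt/Ivy mechanisms, but I haven't confirmed they
fix this particular conflict:

dependencyOverrides += "com.google.protobuf" % "protobuf-java" % "2.4.0a"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "0.8.0-incubating",
  // drop the protobuf that hadoop-client pulls in transitively...
  ("org.apache.hadoop" % "hadoop-client" % "2.0.4-alpha")
    .exclude("com.google.protobuf", "protobuf-java"),
  // ...and declare the version we actually want explicitly
  "com.google.protobuf" % "protobuf-java" % "2.4.0a"
)

Even then, whatever protobuf jar sits on the cluster's runtime classpath
can presumably still shadow the one resolved at build time.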

Regards,
SB





On 21 January 2014 03:15, Sean Owen <so...@cloudera.com> wrote:

> Every time I see the magic words...
>
> InvalidProtocolBufferException: Message missing required fields: callId,
> status;
>
> ... it indicates that a client of something is using protobuf 2.4 and
> the server is using protobuf 2.5. Here you are using protobuf 2.4,
> check. And I suppose you are using HDFS from a Hadoop 2.2.x
> distribution? That uses protobuf 2.5.
>
> While I suspect that is the cause, others here might actually have a
> solution. Can you force protobuf 2.5 instead of 2.4? I am aware of a
> different build profile for YARN which might help too.
> --
> Sean Owen | Director, Data Science | London
>
>
> On Mon, Jan 20, 2014 at 9:05 PM, suman bharadwaj <su...@gmail.com>
> wrote:
> > [...]

Re: SPARK protocol buffer issue. Need Help

Posted by Sean Owen <so...@cloudera.com>.
Every time I see the magic words...

InvalidProtocolBufferException: Message missing required fields: callId, status;

... it indicates that a client of something is using protobuf 2.4 and
the server is using protobuf 2.5. Here you are using protobuf 2.4,
check. And I suppose you are using HDFS from a Hadoop 2.2.x
distribution? That uses protobuf 2.5.
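
As a quick check (a hypothetical snippet, not anything that ships with
Spark or Hadoop), you can ask the JVM which jar the protobuf classes are
actually loaded from:

object ProtobufCheck {
  def main(args: Array[String]) {
    // getCodeSource can be null for bootstrap classes, hence the Option
    val src = Option(classOf[com.google.protobuf.Message]
      .getProtectionDomain.getCodeSource)
    println("protobuf-java loaded from: " +
      src.map(_.getLocation).getOrElse("<unknown>"))
  }
}

Run it with the same classpath as your driver to see which copy wins.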

While I suspect that is the cause, others here might actually have a
solution. Can you force protobuf 2.5 instead of 2.4? I am aware of a
different build profile for YARN which might help too.
--
Sean Owen | Director, Data Science | London


On Mon, Jan 20, 2014 at 9:05 PM, suman bharadwaj <su...@gmail.com> wrote:
> [...]