
Posted to user@spark.apache.org by Gilberto Lira <gi...@scanboo.com.br> on 2014/09/23 18:39:05 UTC

Spark 1.1.0 hbase_inputformat.py not work

Hi,

I'm trying to run the hbase_inputformat.py example, but I can't get it to work.

This is the error:

Traceback (most recent call last):
  File "/root/spark/examples/src/main/python/hbase_inputformat.py", line 70, in <module>
    conf=conf)
  File "/root/spark/python/pyspark/context.py", line 471, in newAPIHadoopRDD
    jconf, batchSize)
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.io.ImmutableBytesWritable
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:150)
at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDDFromClassNames(PythonRDD.scala:451)
at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:436)
at org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)

Can anyone help me?

Re: Spark 1.1.0 hbase_inputformat.py not work

Posted by Gilberto Lira <gi...@scanboo.com.br>.

Thank you Zhang!

I am grateful for your help!

2014-10-01 14:05 GMT-03:00 Kan Zhang <kz...@apache.org>:

> CC user@ for indexing.
>
> Glad you fixed it. All source code for these examples is under
> SPARK_HOME/examples. For example, the converters used here are in
> examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala
>
> Btw, you may find our blog post useful.
>
> https://databricks.com/blog/2014/09/17/spark-1-1-bringing-hadoop-inputoutput-formats-to-pyspark.html
>
> On Wed, Oct 1, 2014 at 6:54 AM, Gilberto Lira <gi...@scanboo.com.br> wrote:
>
>> Exactly, Kan, this was the problem!
>>
>> By the way, I have not found the source code for these examples. Do you
>> know where I can find it?
>>
>> Thanks
>>
>> 2014-10-01 1:37 GMT-03:00 Kan Zhang <kz...@apache.org>:
>>
>>> I somehow missed this. Do you still have the problem? You probably didn't
>>> specify the correct spark-examples jar using --driver-class-path. See
>>> the following for an example.
>>>
>>> MASTER=local ./bin/spark-submit --driver-class-path
>>> ./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar
>>> ./examples/src/main/python/hbase_inputformat.py localhost test

Re: Spark 1.1.0 hbase_inputformat.py not work

Posted by Kan Zhang <kz...@apache.org>.

CC user@ for indexing.

Glad you fixed it. All source code for these examples is under
SPARK_HOME/examples. For example, the converters used here are in
examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala

Btw, you may find our blog post useful.
https://databricks.com/blog/2014/09/17/spark-1-1-bringing-hadoop-inputoutput-formats-to-pyspark.html
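
For readers following the thread, here is a minimal sketch of the kind of call that fails at line 70 in the traceback. The key class is the one named in the ClassNotFoundException; the input format, value class, converter names, and configuration keys are assumptions recalled from the Spark 1.1 example and should be checked against the copy under SPARK_HOME/examples. The helper names hbase_read_args and read_hbase_table are hypothetical.

```python
def hbase_read_args(host, table):
    """Build keyword arguments for SparkContext.newAPIHadoopRDD.

    Class names and configuration keys below are assumptions based on
    the Spark 1.1 example; verify them against your own checkout.
    """
    conf = {
        "hbase.zookeeper.quorum": host,       # ZooKeeper host used by HBase
        "hbase.mapreduce.inputtable": table,  # table to scan
    }
    converters = "org.apache.spark.examples.pythonconverters."
    return dict(
        inputFormatClass="org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        # This is the class the JVM could not find in the traceback.
        keyClass="org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        valueClass="org.apache.hadoop.hbase.client.Result",
        keyConverter=converters + "ImmutableBytesWritableToStringConverter",
        valueConverter=converters + "HBaseResultToStringConverter",
        conf=conf,
    )


def read_hbase_table(sc, host, table):
    """Call newAPIHadoopRDD on an existing SparkContext `sc`."""
    return sc.newAPIHadoopRDD(**hbase_read_args(host, table))
```

Every class named here is loaded by name on the JVM side, so each one has to be on the driver classpath. That would explain why a missing examples jar shows up as a ClassNotFoundException rather than as a Python error.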

Re: Spark 1.1.0 hbase_inputformat.py not work

Posted by Kan Zhang <kz...@apache.org>.

I somehow missed this. Do you still have the problem? You probably didn't
specify the correct spark-examples jar using --driver-class-path. See the
following for an example.

MASTER=local ./bin/spark-submit --driver-class-path \
  ./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar \
  ./examples/src/main/python/hbase_inputformat.py localhost test
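
One way to confirm that the jar passed to --driver-class-path is the right one is to check whether it actually contains the missing class. The sketch below assumes the examples jar is meant to bundle the HBase classes, which the fix in this thread suggests but which may depend on how Spark was built. The function name jar_contains_class is hypothetical.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the given class.

    A jar is a zip archive, and a class such as a.b.C is stored
    as the entry a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example use (the jar path is whatever your build produced):
#   jar_contains_class(
#       "./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar",
#       "org.apache.hadoop.hbase.io.ImmutableBytesWritable")
```

If this returns False for the jar you are passing, the ClassNotFoundException above is the expected result.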

Re: Spark 1.1.0 hbase_inputformat.py not work

Posted by freedafeng <fr...@yahoo.com>.

I don't know if it's relevant, but I had to compile Spark for my specific
HBase and Hadoop versions to make hbase_inputformat.py work.


