Posted to users@zeppelin.apache.org by Abhi Basu <90...@gmail.com> on 2016/09/14 15:05:09 UTC

Pyspark interpreter configuration for Zeppelin

%pyspark

input_file = "hdfs:////tmp/filenname.gz"

raw_rdd = sc.textFile(input_file)

Re: Pyspark interpreter configuration for Zeppelin

Posted by Felix Cheung <fe...@hotmail.com>.
I think you need to build Zeppelin from source to run it against CDH.





On Wed, Sep 14, 2016 at 12:58 PM -0700, "Abhi Basu" <90...@gmail.com> wrote:

I feel there is a Scala compatibility issue, and I will try compiling with the right switches.

On Wed, Sep 14, 2016 at 1:54 PM, Abhi Basu <90...@gmail.com> wrote:
Yes, that fixed some of the problems.

I am using the Zeppelin 0.6.1 binaries against CDH 5.8 (Spark 1.6.0). Would there be a compatibility issue?

Thanks

Abhi

On Wed, Sep 14, 2016 at 12:55 PM, moon soo Lee <mo...@apache.org> wrote:
Could you try setting the zeppelin.python property to the full path of the python command, not the bin directory?

On Wed, Sep 14, 2016 at 10:19 AM Abhi Basu <90...@gmail.com> wrote:
I tried the pyspark command on the same machine, which uses Anaconda Python, and sc.version returned a value.

Zeppelin:
zeppelin.python /home/cloudera/anaconda2/bin

In Zeppelin, nothing is returned.


On Wed, Sep 14, 2016 at 11:53 AM, moon soo Lee <mo...@apache.org> wrote:
Did you export SPARK_HOME in conf/zeppelin-env.sh?
Could you verify that the same code works with ${SPARK_HOME}/bin/pyspark, on the same machine that Zeppelin runs on?

Thanks,
moon


On Wed, Sep 14, 2016 at 8:07 AM Abhi Basu <90...@gmail.com> wrote:
Oops, sorry. The above code generated this error:

ERROR [2016-09-14 10:04:27,121] ({qtp2003293121-11} NotebookServer.java[onMessage]:221) - Can't handle message
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:319)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:100)
    at org.apache.zeppelin.notebook.Paragraph.jobAbort(Paragraph.java:330)
    at org.apache.zeppelin.scheduler.Job.abort(Job.java:239)
    at org.apache.zeppelin.socket.NotebookServer.cancelParagraph(NotebookServer.java:995)
    at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:180)
    at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
    at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
    at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
    at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
    at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
    at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_cancel(RemoteInterpreterService.java:274)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.cancel(RemoteInterpreterService.java:259)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:316)
    ... 21 more


These are my Spark interpreter settings:


spark  %spark, %spark.pyspark, %spark.r, %spark.sql, %spark.dep

Option
  Interpreter for note
  Connect to existing process

Properties
  name                                     value
  args
  master                                   yarn-client
  spark.app.name                           Zeppelin
  spark.cores.max
  spark.executor.memory
  zeppelin.R.cmd                           R
  zeppelin.R.image.width                   100%
  zeppelin.R.knitr                         true
  zeppelin.R.render.options                out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
  zeppelin.dep.additionalRemoteRepository  spark-packages,http://dl.bintray.com/spark-packages/maven,false;
  zeppelin.dep.localrepo                   local-repo
  zeppelin.interpreter.localRepo           /usr/local/bin/zeppelin-0.6.1-bin-all/local-repo/2BXF675WU
  zeppelin.pyspark.python                  python
  zeppelin.spark.concurrentSQL             false
  zeppelin.spark.importImplicit            true
  zeppelin.spark.maxResult                 1000
  zeppelin.spark.printREPLOutput           true
  zeppelin.spark.sql.stacktrace            false
  zeppelin.spark.useHiveContext            true


On Wed, Sep 14, 2016 at 10:05 AM, Abhi Basu <90...@gmail.com> wrote:
%pyspark

input_file = "hdfs:////tmp/filenname.gz"

raw_rdd = sc.textFile(input_file)





--
Abhi Basu




Re: Pyspark interpreter configuration for Zeppelin

Posted by Abhi Basu <90...@gmail.com>.
I feel there is a Scala compatibility issue, and I will try compiling with
the right switches.

-- 
Abhi Basu

Re: Pyspark interpreter configuration for Zeppelin

Posted by Abhi Basu <90...@gmail.com>.
Yes, that fixed some of the problems.

I am using the Zeppelin 0.6.1 binaries against CDH 5.8 (Spark 1.6.0). Would
there be a compatibility issue?

Thanks

Abhi

-- 
Abhi Basu

Re: Pyspark interpreter configuration for Zeppelin

Posted by moon soo Lee <mo...@apache.org>.
Could you try setting the zeppelin.python property to the full path of the
python command, not the bin directory?
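
A minimal sketch of the distinction, assuming the setting is expected to name
an executable file. The helper check_python_setting is hypothetical and not
part of Zeppelin, and the path /home/cloudera/anaconda2/bin/python is only a
guess at where the Anaconda executable lives on the machine in this thread.

```python
import os
import sys


def check_python_setting(path):
    """Classify a configured interpreter path (hypothetical helper, not a Zeppelin API)."""
    if os.path.isdir(path):
        # A bin directory cannot be launched as a command.
        return "directory"
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return "executable"
    return "missing"


if __name__ == "__main__":
    candidates = (
        "/home/cloudera/anaconda2/bin",         # value reported in the thread
        "/home/cloudera/anaconda2/bin/python",  # assumed location of the executable
        sys.executable,                         # the interpreter running this script
    )
    for candidate in candidates:
        print(candidate, "->", check_python_setting(candidate))
```

Run on the Zeppelin host, a result of "directory" for the configured value
would match the situation described above.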


Re: Pyspark interpreter configuration for Zeppelin

Posted by Abhi Basu <90...@gmail.com>.
I tried the pyspark command on the same machine, which uses Anaconda Python,
and sc.version returned a value.

Zeppelin:
zeppelin.python /home/cloudera/anaconda2/bin

In Zeppelin, nothing is returned.


-- 
Abhi Basu

Re: Pyspark interpreter configuration for Zeppelin

Posted by moon soo Lee <mo...@apache.org>.
Did you export SPARK_HOME in conf/zeppelin-env.sh?
Could you verify that the same code works with ${SPARK_HOME}/bin/pyspark, on
the same machine that Zeppelin runs on?

Thanks,
moon
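
The first check can be sketched in Python, assuming it is run from a shell
that has already sourced conf/zeppelin-env.sh so that any exported SPARK_HOME
is visible. The helper find_pyspark is hypothetical; it only confirms that the
launcher file exists and does not start Spark.

```python
import os


def find_pyspark(env=None):
    """Return the path to bin/pyspark under SPARK_HOME, or None (hypothetical helper)."""
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        # SPARK_HOME was never exported into this environment.
        return None
    launcher = os.path.join(spark_home, "bin", "pyspark")
    return launcher if os.path.isfile(launcher) else None


if __name__ == "__main__":
    launcher = find_pyspark()
    if launcher is None:
        print("SPARK_HOME is unset or has no bin/pyspark")
    else:
        print("pyspark launcher:", launcher)
```

If a launcher is found, starting it and evaluating sc.version there is the
manual test suggested above.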

On Wed, Sep 14, 2016 at 8:07 AM Abhi Basu <90...@gmail.com> wrote:

> Oops sorry. the above code generated this error:
>
> RROR [2016-09-14 10:04:27,121] ({qtp2003293121-11}
> NotebookServer.java[onMessage]:221) - Can't handle message
> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:319)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:100)
>     at org.apache.zeppelin.notebook.Paragraph.jobAbort(Paragraph.java:330)
>     at org.apache.zeppelin.scheduler.Job.abort(Job.java:239)
>     at org.apache.zeppelin.socket.NotebookServer.cancelParagraph(NotebookServer.java:995)
>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:180)
>     at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
>     at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
>     at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
>     at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
>     at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
>     at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_cancel(RemoteInterpreterService.java:274)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.cancel(RemoteInterpreterService.java:259)
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:316)
>     ... 21 more
>
>
> These are my Spark interpreter settings:
>
>
> spark    %spark, %spark.pyspark, %spark.r, %spark.sql, %spark.dep
>
> Option
> Interpreter for note
> Connect to existing process
>
> Properties
> name                                     value
> args
> master                                   yarn-client
> spark.app.name                           Zeppelin
> spark.cores.max
> spark.executor.memory
> zeppelin.R.cmd                           R
> zeppelin.R.image.width                   100%
> zeppelin.R.knitr                         true
> zeppelin.R.render.options                out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
> zeppelin.dep.additionalRemoteRepository  spark-packages,http://dl.bintray.com/spark-packages/maven,false;
> zeppelin.dep.localrepo                   local-repo
> zeppelin.interpreter.localRepo           /usr/local/bin/zeppelin-0.6.1-bin-all/local-repo/2BXF675WU
> zeppelin.pyspark.python                  python
> zeppelin.spark.concurrentSQL             false
> zeppelin.spark.importImplicit            true
> zeppelin.spark.maxResult                 1000
> zeppelin.spark.printREPLOutput           true
> zeppelin.spark.sql.stacktrace            false
> zeppelin.spark.useHiveContext            true
>
>
> On Wed, Sep 14, 2016 at 10:05 AM, Abhi Basu <90...@gmail.com> wrote:
>
>> %pyspark
>>
>> input_file = "hdfs:////tmp/filenname.gz"
>>
>> raw_rdd = sc.textFile(input_file)
>>
>>
>>
>
>
> --
> Abhi Basu
>

Re: Pyspark interpreter configuration for Zeppelin

Posted by Abhi Basu <90...@gmail.com>.
Oops, sorry. The above code generated this error:

ERROR [2016-09-14 10:04:27,121] ({qtp2003293121-11} NotebookServer.java[onMessage]:221) - Can't handle message
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:319)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:100)
    at org.apache.zeppelin.notebook.Paragraph.jobAbort(Paragraph.java:330)
    at org.apache.zeppelin.scheduler.Job.abort(Job.java:239)
    at org.apache.zeppelin.socket.NotebookServer.cancelParagraph(NotebookServer.java:995)
    at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:180)
    at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
    at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
    at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
    at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
    at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
    at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_cancel(RemoteInterpreterService.java:274)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.cancel(RemoteInterpreterService.java:259)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:316)
    ... 21 more


These are my Spark interpreter settings:


spark    %spark, %spark.pyspark, %spark.r, %spark.sql, %spark.dep

Option
Interpreter for note
Connect to existing process

Properties
name                                     value
args
master                                   yarn-client
spark.app.name                           Zeppelin
spark.cores.max
spark.executor.memory
zeppelin.R.cmd                           R
zeppelin.R.image.width                   100%
zeppelin.R.knitr                         true
zeppelin.R.render.options                out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
zeppelin.dep.additionalRemoteRepository  spark-packages,http://dl.bintray.com/spark-packages/maven,false;
zeppelin.dep.localrepo                   local-repo
zeppelin.interpreter.localRepo           /usr/local/bin/zeppelin-0.6.1-bin-all/local-repo/2BXF675WU
zeppelin.pyspark.python                  python
zeppelin.spark.concurrentSQL             false
zeppelin.spark.importImplicit            true
zeppelin.spark.maxResult                 1000
zeppelin.spark.printREPLOutput           true
zeppelin.spark.sql.stacktrace            false
zeppelin.spark.useHiveContext            true
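
With zeppelin.pyspark.python set to the bare name "python", it may be worth checking what that name resolves to on the machine where Zeppelin runs. Below is a minimal sketch using only the Python standard library. The helper name resolve_python_command is hypothetical and not part of Zeppelin, and it assumes the check is run as the same user and with the same PATH as the Zeppelin process:

```python
import os
import shutil


def resolve_python_command(command):
    """Return the absolute path of an executable command, or None.

    `command` is the kind of value one might put in the interpreter's
    Python property: a bare name looked up on PATH (such as "python")
    or a full path to the binary. Hypothetical helper, not a Zeppelin API.
    """
    # A directory (for example an installation's "bin" folder) is not a command.
    if os.path.isdir(command):
        return None
    # shutil.which accepts both bare names and explicit paths, and only
    # returns entries that exist and are executable.
    found = shutil.which(command)
    return os.path.abspath(found) if found else None


if __name__ == "__main__":
    # "python" is the value shown in the settings above; the result
    # depends entirely on the local PATH.
    print("python ->", resolve_python_command("python"))
```

If this prints None, or a different Python than the one the pyspark shell uses, that would be one thing to rule out before looking at the interpreter itself.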


On Wed, Sep 14, 2016 at 10:05 AM, Abhi Basu <90...@gmail.com> wrote:

> %pyspark
>
> input_file = "hdfs:////tmp/filenname.gz"
>
> raw_rdd = sc.textFile(input_file)
>
>
>


-- 
Abhi Basu