Posted to users@zeppelin.apache.org by Felix C <fe...@hotmail.com> on 2015/04/01 00:46:32 UTC

Re: running pyspark notes

It is very easy to run spark locally. You can download the binary distribution and unpack it on your Dev box.
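
For example, something like this (the archive URL and package name are only
an illustration; pick whichever Spark/Hadoop build you actually need):

  wget https://archive.apache.org/dist/spark/spark-1.2.1/spark-1.2.1-bin-hadoop2.4.tgz
  tar -xzf spark-1.2.1-bin-hadoop2.4.tgz
  # then point spark.home / SPARK_HOME at the unpacked directory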

--- Original Message ---

From: "moon soo Lee" <mo...@apache.org>
Sent: March 31, 2015 12:38 AM
To: users@zeppelin.incubator.apache.org
Subject: Re: running pyspark notes

For PySpark, you need to download a Spark distribution, and spark.home and
SPARK_HOME need to point to that directory.
The reason you need the Spark distribution is that, to use PySpark, Zeppelin
needs some Python modules that the Spark distribution ships under its python
directory.
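
For example, assuming the distribution is unpacked at /path/to/spark (a
placeholder path), you could export it in conf/zeppelin-env.sh before
starting Zeppelin, or set spark.home in the interpreter setting:

  export SPARK_HOME=/path/to/spark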

Thanks,
moon

On Tue, Mar 31, 2015 at 4:01 PM Ram Venkatesh <rv...@hortonworks.com>
wrote:

>  Hello,
>
>  Thank you for your reply.
>
>  I have built zeppelin with the following command line: mvn install
> -DskipTests -Pspark-1.2 -Phadoop-2.4
>
>  I don’t have a separate Spark distribution package; I am running
> zeppelin on my dev box in local mode for testing.
>
>  What should my spark.home (or SPARK_HOME) be set to? Currently they are
> blank.
>
>  Thanks!
> Ram
>
>  On Mar 30, 2015, at 4:56 PM, moon soo Lee <mo...@apache.org> wrote:
>
>  Hi,
>
>  It can happen when the spark.home property and the SPARK_HOME environment
> variable are misconfigured, or when the version of the Spark distribution
> package that spark.home (and SPARK_HOME) points to is different from the
> version Zeppelin was built with.
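>
>  A minimal illustration (versions and paths here are assumptions, only to
> show the pairing): a Zeppelin built with -Pspark-1.2 would be expected to
> go with a spark-1.2.x distribution, for example
>
>    mvn install -DskipTests -Pspark-1.2 -Phadoop-2.4
>    export SPARK_HOME=/path/to/spark-1.2.1-bin-hadoop2.4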
>
>  Thanks,
> moon
>
>
> On Tue, Mar 31, 2015 at 4:29 AM Ram Venkatesh <rv...@hortonworks.com>
> wrote:
>
>> Hello,
>>
>> I am having trouble running a pyspark note (zeppelin newbie, could well
>> be pilot error).
>>
>> The note is
>>
>> %pyspark
>>
>> print 'hello world'
>>
>> The note transitions to “PENDING” and then “RUNNING” but never finishes
>> after that.
>>
>> From the zeppelin server logs:
>> INFO [2015-03-30 12:24:10,512] ({pool-2-thread-2}
>> RemoteInterpreterProcess.java[reference]:74) - Run interpreter process
>> /Users/rvenkatesh/dev/asf/zeppelin/bin/interpreter.sh -d
>> /Users/rvenkatesh/dev/asf/zeppelin/interpreter/spark -p 59569
>>  INFO [2015-03-30 12:24:11,570] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.SparkInterpreter
>>  INFO [2015-03-30 12:24:11,623] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.PySparkInterpreter
>>  INFO [2015-03-30 12:24:11,628] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.SparkSqlInterpreter
>>  INFO [2015-03-30 12:24:11,631] ({pool-2-thread-2}
>> RemoteInterpreter.java[init]:114) - Create remote interpreter
>> com.nflabs.zeppelin.spark.DepInterpreter
>>  INFO [2015-03-30 12:24:11,635] ({pool-2-thread-2}
>> RemoteInterpreter.java[open]:143) - open remote interpreter
>> com.nflabs.zeppelin.spark.PySparkInterpreter
>>  INFO [2015-03-30 12:24:11,682] ({pool-2-thread-2}
>> Paragraph.java[jobRun]:182) - RUN :
>> print 'hello world'
>>
>>  INFO [2015-03-30 12:24:19,444] ({Thread-24} RemoteScheduler.java[getStatus]:185)
>> - getStatus from remote RUNNING
>>  INFO [2015-03-30 12:24:19,444] ({Thread-24}
>> NotebookServer.java[broadcast]:205) - SEND >> NOTE
>>  INFO [2015-03-30 12:24:19,446] ({Thread-25}
>> NotebookServer.java[broadcast]:205) - SEND >> PROGRESS
>>  INFO [2015-03-30 12:24:19,955] ({Thread-25}
>> NotebookServer.java[broadcast]:205) - SEND >> PROGRESS
>> … ad infinitum
>>
>> Nothing interesting in the spark interpreter logs.
>>
>> Any help appreciated.
>>
>> Thanks!
>> Ram
>
>
>

Re: running pyspark notes

Posted by "Kelly, Jonathan" <jo...@amazon.com>.
I am running into the same issue as in Ram's original post, though I do correctly have SPARK_HOME set.  I see no obvious errors or warnings in any of the Zeppelin logs.  Is there anything specific I should look for?

Thanks,
Jonathan Kelly
