Posted to user@spark.apache.org by Spico Florin <sp...@gmail.com> on 2018/08/08 13:59:47 UTC

Run/install tensorframes on zeppelin pyspark

Hi!

I would like to use tensorframes in my pyspark notebook.

I have performed the following:

1. In the Spark interpreter, added a new repository:
http://dl.bintray.com/spark-packages/maven
2. In the Spark interpreter, added the dependency
databricks:tensorframes:0.2.9-s_2.11
3. Ran pip install tensorframes
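Before digging deeper, it can help to check from inside the %pyspark paragraph itself which interpreter is actually running and whether it can see the package. A minimal sketch (nothing here is tensorframes-specific; it works on Python 2 and 3):

```python
import sys

def module_available(name):
    """Return True if `name` is importable by the current interpreter."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False

# In a %pyspark paragraph, this shows exactly which python Zeppelin
# launched and whether that interpreter can see the package.
print(sys.executable)
print(module_available("tensorframes"))
```

If the printed executable is not the python that pip installed into, the ImportError is explained.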


In both Zeppelin 0.7.3 and 0.8.0:
1. the following code resulted in the error "ImportError: No module named
tensorframes":

%pyspark
import tensorframes as tfs

2. the following code succeeded:
%spark
import org.tensorframes.{dsl => tf}
import org.tensorframes.dsl.Implicits._
val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")

// As in Python, scoping is recommended to prevent name collisions.
val df2 = tf.withGraph {
    val a = df.block("a")
    // Unlike python, the scala syntax is more flexible:
    val out = a + 3.0 named "out"
    // The 'mapBlocks' method is added using implicits to dataframes.
    df.mapBlocks(out).select("a", "out")
}

// The transform is all lazy at this point, let's execute it with collect:
df2.collect()

I ran the code above directly with the Spark interpreter with the default
configuration (master set to local[*], so not via the spark-submit
command).

Also, I installed Spark locally and ran the command

$SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11

and the code below worked as expected:

import tensorframes as tfs
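For Zeppelin itself, a rough equivalent of passing --packages on the pyspark command line is to route it through SPARK_SUBMIT_OPTIONS; a sketch (untested here) for conf/zeppelin-env.sh:

```shell
# conf/zeppelin-env.sh (sketch): have Zeppelin's Spark interpreter pull
# the package the same way the pyspark shell does with --packages.
export SPARK_SUBMIT_OPTIONS="--packages databricks:tensorframes:0.2.9-s_2.11"
```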

 Can you please help me solve this?

Thanks,

 Florin

Re: Run/install tensorframes on zeppelin pyspark

Posted by Spico Florin <sp...@gmail.com>.
Hello!
  Thank you very much for your response.
As I understand it, in order to use tensorframes in a Zeppelin pyspark
notebook with the Spark master running locally:
1. we should run pip install tensorframes
2. we should set PYSPARK_PYTHON in conf/zeppelin-env.sh

I have performed the above steps like this:

python2.7 -m pip install tensorframes==0.2.7
export PYSPARK_PYTHON=python2.7 in conf/zeppelin-env.sh
"zeppelin.pyspark.python": "python2.7" in conf/interpreter.json

As you can see, the installation and the configuration refer to the same
python2.7 version.
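One way to rule out a pip/interpreter mismatch is to ask a given python binary directly whether the import succeeds. A small sketch (the binary path is whatever you configured; sys.executable is used below only so it runs anywhere):

```python
import subprocess
import sys

def can_import(python_path, module):
    # Exit status 0 means `import <module>` succeeded in that
    # interpreter; a failed import exits non-zero.
    return subprocess.call([python_path, "-c", "import " + module]) == 0

# e.g. can_import("python2.7", "tensorframes") against the configured binary.
print(can_import(sys.executable, "tensorframes"))
```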
After performing all of these steps, I'm still getting the same error:
*"ImportError: No module named tensorframes"*

I'm still puzzled how this import works fine in the pyspark shell that
ships with Spark but results in errors in, for example, plain python2.7.
I've also observed that the pyspark shell from spark/bin doesn't need the
tensorframes Python package installed, which is even more confusing.
Is the Zeppelin pyspark interpreter not using the same approach as the
Spark pyspark shell?

Has someone succeeded in importing/using tensorframes in Zeppelin with the
default Spark master setup (local[*])? If yes, how?

I look forward to your answers.

Regards,
 Florin

On Thu, Aug 9, 2018 at 3:52 AM, Jeff Zhang <zj...@gmail.com> wrote:

>
> Make sure you use the correct python which has tensorframe installed.  Use PYSPARK_PYTHON
> to configure the python
>
>
>
> Spico Florin <sp...@gmail.com> wrote on Wed, Aug 8, 2018 at 9:59 PM:
>
>> Hi!
>>
>> I would like to use tensorframes in my pyspark notebook.
>>
>> I have performed the following:
>>
>> 1. In the Spark interpreter, added a new repository:
>> http://dl.bintray.com/spark-packages/maven
>> 2. In the Spark interpreter, added the dependency
>> databricks:tensorframes:0.2.9-s_2.11
>> 3. Ran pip install tensorframes
>>
>>
>> In both 0.7.3 and 0.8.0:
>> 1.  the following code resulted in error: "ImportError: No module named
>> tensorframes"
>>
>> %pyspark
>> import tensorframes as tfs
>>
>> 2. the following code succeeded
>> %spark
>> import org.tensorframes.{dsl => tf}
>> import org.tensorframes.dsl.Implicits._
>> val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")
>>
>> // As in Python, scoping is recommended to prevent name collisions.
>> val df2 = tf.withGraph {
>>     val a = df.block("a")
>>     // Unlike python, the scala syntax is more flexible:
>>     val out = a + 3.0 named "out"
>>     // The 'mapBlocks' method is added using implicits to dataframes.
>>     df.mapBlocks(out).select("a", "out")
>> }
>>
>> // The transform is all lazy at this point, let's execute it with collect:
>> df2.collect()
>>
>> I ran the code above directly with spark interpreter with the default
>> configurations (master set up to local[*] - so not via spark-submit
>> command) .
>>
>> Also, I have installed spark home locally and ran the command
>>
>> $SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11
>>
>> and the code below worked as expected:
>>
>> import tensorframes as tfs
>>
>>  Can you please help to solve this?
>>
>> Thanks,
>>
>>  Florin

Re: Run/install tensorframes on zeppelin pyspark

Posted by Jeff Zhang <zj...@gmail.com>.
Make sure you use the correct Python, i.e. the one that has tensorframes
installed. Use PYSPARK_PYTHON to configure which Python is used.
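In conf/zeppelin-env.sh that advice might look like the sketch below (the interpreter path is illustrative; point it at whichever python actually has the package pip-installed):

```shell
# conf/zeppelin-env.sh (sketch): make the pyspark interpreter use the
# python that has tensorframes installed (path is illustrative).
export PYSPARK_PYTHON=/usr/local/bin/python2.7
```

After changing this, restart the Spark interpreter so the new setting takes effect.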



Spico Florin <sp...@gmail.com> wrote on Wed, Aug 8, 2018 at 9:59 PM:

> Hi!
>
> I would like to use tensorframes in my pyspark notebook.
>
> I have performed the following:
>
> 1. In the spark intepreter adde a new repository
> http://dl.bintray.com/spark-packages/maven
> 2. in the spark interpreter added the
> dependency databricks:tensorframes:0.2.9-s_2.11
> 3. pip install tensorframes
>
>
> In both 0.7.3 and 0.8.0:
> 1.  the following code resulted in error: "ImportError: No module named
> tensorframes"
>
> %pyspark
> import tensorframes as tfs
>
> 2. the following code succeeded
> %spark
> import org.tensorframes.{dsl => tf}
> import org.tensorframes.dsl.Implicits._
> val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")
>
> // As in Python, scoping is recommended to prevent name collisions.
> val df2 = tf.withGraph {
>     val a = df.block("a")
>     // Unlike python, the scala syntax is more flexible:
>     val out = a + 3.0 named "out"
>     // The 'mapBlocks' method is added using implicits to dataframes.
>     df.mapBlocks(out).select("a", "out")
> }
>
> // The transform is all lazy at this point, let's execute it with collect:
> df2.collect()
>
> I ran the code above directly with spark interpreter with the default
> configurations (master set up to local[*] - so not via spark-submit
> command) .
>
> Also, I have installed spark home locally and ran the command
>
> $SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11
>
> and the code below worked as expected:
>
> import tensorframes as tfs
>
>  Can you please help to solve this?
>
> Thanks,
>
>  Florin
>
