Posted to user@spark.apache.org by Joaquin Alzola <Jo...@lebara.com> on 2016/06/29 12:58:26 UTC

Spark jobs

Hi,

This is a totally newbie question, but I can't seem to find the answer anywhere: I have a Python script that is meant to be launched with spark-submit.

How should I call it from the main Python script with subprocess.Popen?
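
Something along these lines is what I have in mind (the spark-submit path and the script name are just placeholders):

import subprocess

# Placeholder paths: point these at the real spark-submit and job script.
proc = subprocess.Popen(["/opt/spark/bin/spark-submit", "my_job.py"])
proc.wait()  # or skip this to leave the job running in the background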

BR

Joaquin

This email is confidential and may be subject to privilege. If you are not the intended recipient, please do not copy or disclose its content but contact the sender immediately upon receipt.

RE: Spark jobs

Posted by Joaquin Alzola <Jo...@lebara.com>.
Hi Sujeet,

I'm not sure that will work.

Running this:
#!/usr/bin/env python3
from pyspark_cassandra import CassandraSparkContext
from pyspark import SparkConf
from pyspark.sql import SQLContext

conf = (SparkConf()
        .setAppName("test")
        .setMaster("spark://192.168.23.31:7077")
        .set("spark.cassandra.connection.host", "192.168.23.31"))
sc = CassandraSparkContext(conf=conf)
sqlContext = SQLContext(sc)

# Read the Cassandra table through the connector's DataFrame data source.
df = (sqlContext.read.format("org.apache.spark.sql.cassandra")
      .options(keyspace="lebara_diameter_codes", table="nl_lebara_diameter_codes")
      .load())

# Renamed from `list`/`list2` so the built-in list() is not shadowed.
error_codes = df.select("errorcode2001").where("errorcode2001 > 1200").collect()
dates = df.select("date").collect()
print([i for i in error_codes[0]])
print(type(error_codes[0]))

which of course shows this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o29.load.
: java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra. Please find packages at http://spark-packages.org

Is there a way to load those jar files from within the script?
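
One approach that might work is handing the connector to spark-submit, either as Maven coordinates with --packages or as a local jar with --jars. The coordinates below are a guess and would need to match the Spark and Scala versions in use:

spark-submit --master spark://192.168.23.31:7077 \
  --packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.0 \
  your_script.py

or, with a jar already on disk:

spark-submit --jars /path/to/spark-cassandra-connector.jar your_script.py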

Jo

From: sujeet jog [mailto:sujeet.jog@gmail.com]
Sent: 29 June 2016 14:51
To: Joaquin Alzola <Jo...@lebara.com>; user <us...@spark.apache.org>
Subject: Re: Spark jobs

Check if this helps:

import os
from multiprocessing import Process

def training():
        print("Training Workflow")
        # Launch the Spark job; the trailing & backgrounds it in the shell.
        cmd = "spark/bin/spark-submit ./ml.py &"
        os.system(cmd)

w_training = Process(target=training)
w_training.start()  # nothing runs until the process is started



On Wed, Jun 29, 2016 at 6:28 PM, Joaquin Alzola <Jo...@lebara.com> wrote:
Hi,

This is a totally newbie question, but I can't seem to find the answer anywhere: I have a Python script that is meant to be launched with spark-submit.

How should I call it from the main Python script with subprocess.Popen?

BR

Joaquin


Re: Spark jobs

Posted by sujeet jog <su...@gmail.com>.
Check if this helps:

import os
from multiprocessing import Process

def training():
        print("Training Workflow")
        # Launch the Spark job; the trailing & backgrounds it in the shell.
        cmd = "spark/bin/spark-submit ./ml.py &"
        os.system(cmd)

w_training = Process(target=training)
w_training.start()  # nothing runs until the process is started
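
If you specifically want subprocess.Popen, as in the original question, the same launch might look like this; the spark-submit path and script name are the same assumptions as above:

import subprocess

# Non-blocking launch: Popen returns immediately while spark-submit runs.
proc = subprocess.Popen(
    ["spark/bin/spark-submit", "./ml.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate()  # blocks here until the job exits
print("spark-submit exited with code", proc.returncode)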



On Wed, Jun 29, 2016 at 6:28 PM, Joaquin Alzola <Jo...@lebara.com>
wrote:

> Hi,
>
>
>
> This is a totally newbie question, but I can't seem to find the answer
> anywhere: I have a Python script that is meant to be launched with spark-submit.
>
>
>
> How should I call it from the main Python script with subprocess.Popen?
>
>
>
> BR
>
>
>
> Joaquin