Posted to dev@livy.apache.org by Keiji Yoshida <kj...@gmail.com> on 2017/12/14 06:25:29 UTC

Can Spark jobs which are submitted via Livy run concurrently?

Hi Apache Livy developers,

I would like to ask a question.

Can Spark jobs which are submitted via Livy run concurrently on a single
Spark application (= Livy session)?

I set up the Spark configuration in accordance with
https://spark.apache.org/docs/2.1.1/job-scheduling.html#scheduling-within-an-application
so that Spark jobs could run concurrently under the FAIR scheduler in a
FAIR pool.
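
For reference, the in-application scheduling setup that page describes boils down to setting spark.scheduler.mode=FAIR, pointing spark.scheduler.allocation.file at a pool-definition file, and defining the pool there. A minimal sketch of that file, using the pool name from this thread (the weight/minShare values are just illustrative defaults):

```xml
<!-- fairscheduler.xml: defines the pool referenced by
     sc.setLocalProperty("spark.scheduler.pool", "my-fair-pool") -->
<allocations>
  <pool name="my-fair-pool">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```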

However, when I submitted multiple Spark jobs (= Livy statements) by
executing the following command several times, they just ran sequentially
on a FAIR Spark pool.

[command which I executed several times]
curl -XPOST -H "Content-Type: application/json"
mylivyhost.com:9999/sessions/9999/statements -d '{"code":
"sc.setLocalProperty(\"spark.scheduler.pool\",
\"my-fair-pool\")\nspark.sql(\"select count(1) from mytable\").show()"}'
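
The request body is plain JSON, so the same submission can be built from a short script instead of hand-escaping quotes in curl. A minimal sketch using only the Python standard library; the host and session id are the placeholders from the command above:

```python
import json
import urllib.request

# Placeholders from the thread; replace with a real Livy host and session id.
LIVY_HOST = "mylivyhost.com:9999"
SESSION_ID = 9999

def statement_request(code: str) -> urllib.request.Request:
    """Build the POST request for Livy's /sessions/{id}/statements endpoint."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        f"http://{LIVY_HOST}/sessions/{SESSION_ID}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = statement_request(
    'sc.setLocalProperty("spark.scheduler.pool", "my-fair-pool")\n'
    'spark.sql("select count(1) from mytable").show()'
)
print(req.get_method(), req.full_url)
# To actually submit against a live server: urllib.request.urlopen(req)
```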

It seems to me that a single Livy session submits a Spark job (= Livy
statement) only after the previously submitted Spark job has finished.

Is my guess correct? Is there some way to make Spark jobs submitted via
Livy run concurrently?

I'm using Livy 0.3.0.

Thanks,
Keiji Yoshida

Re: Can Spark jobs which are submitted via Livy run concurrently?

Posted by Saisai Shao <sa...@gmail.com>.
No, the current Livy interactive session doesn't support running statements
concurrently. Because Livy doesn't know whether multiple statements depend
on each other, running them concurrently could lead to unexpected results.
So submitted statements can only be run one by one.
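
Since each session runs its statements strictly one at a time, a workaround (not discussed further in this thread, just a sketch) is to create several Livy sessions and spread independent statements across them; each session is its own Spark application, so their statements can execute at the same time. A minimal sketch of the round-robin assignment, with the actual HTTP submission done per session as in the curl example above:

```python
import itertools

def round_robin(statements, session_ids):
    """Pair each statement with a session id in round-robin order,
    spreading work evenly across already-created Livy sessions."""
    cycle = itertools.cycle(session_ids)
    return [(next(cycle), stmt) for stmt in statements]

# Example: three independent statements over two hypothetical sessions.
jobs = round_robin(["job_a", "job_b", "job_c"], [11, 12])
print(jobs)  # [(11, 'job_a'), (12, 'job_b'), (11, 'job_c')]
```

Each (session_id, statement) pair can then be POSTed to that session's /statements endpoint, e.g. from a thread pool, so the sessions make progress in parallel.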

Thanks
Jerry
