Posted to user@spark.apache.org by Vadim Semenov <va...@datadoghq.com> on 2016/10/02 02:12:23 UTC

Re: Restful WS for Spark

I worked with both, so I'll give you some insight from my perspective.

spark-jobserver has a stable API and is mature overall, but it doesn't work
with yarn-cluster mode, and Python support is still in development.

Livy has a stable API (though I'm not sure I can vouch for it, since it
appeared only recently; considering that Cloudera is behind it, I'd say
it's mature), supports all deployment modes, and supports Python and R.
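
To give an idea of what submitting a job over REST looks like with Livy,
here is a minimal sketch using its batch endpoint (POST /batches). The host,
port, jar path, class name, and argument are placeholders I made up for
illustration, and whether the job actually runs on YARN depends on how your
Livy server is configured, so treat this as a starting point rather than a
reference client.

```python
import json
import urllib.request

# Assumption: Livy is reachable on its default port on the local host.
LIVY_URL = "http://localhost:8998"


def build_batch_request(jar_path, class_name, args=(), livy_url=LIVY_URL):
    """Build (but do not send) a POST /batches request for Livy."""
    payload = {"file": jar_path, "className": class_name, "args": list(args)}
    return urllib.request.Request(
        url=f"{livy_url}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response into a dict."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def batch_state(batch_id, livy_url=LIVY_URL):
    """Ask Livy for the current state of a previously submitted batch."""
    request = urllib.request.Request(f"{livy_url}/batches/{batch_id}/state")
    return send(request)["state"]


if __name__ == "__main__":
    # Hypothetical jar and main class; replace with your own application.
    req = build_batch_request(
        "hdfs:///jobs/my-job.jar", "com.example.MyJob", ["2016-10-01"]
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # With a running Livy server you would then call:
    #   batch = send(req)
    #   print(batch_state(batch["id"]))
```

A web service in front of Spark would call send() with the built request,
keep the returned batch id, and poll batch_state() until the job finishes
before returning the output location to the user.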

spark-jobserver has a nicer UI, better logging features overall, and a
richer API.

I had a harder time adding features to spark-jobserver than to Livy, so
if that's something you'll need to do, you should take it into account.

In some cases, spark-jobserver runs into issues that are difficult to debug
because of Akka.

spark-jobserver is better if you use Scala and need to work with shared
Spark contexts, as it has a built-in API for shared RDDs and objects.
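
As a rough illustration of the shared-context workflow, the sketch below
builds the two REST calls involved: creating a named long-running context,
then running a job inside it. The endpoints follow the spark-jobserver
README as I remember it; the host, port, app name, class name, context name,
and job config are all made-up placeholders, so check them against the
version you deploy.

```python
import urllib.parse
import urllib.request

# Assumption: spark-jobserver is listening on its default port locally.
JOBSERVER_URL = "http://localhost:8090"


def create_context_request(context_name, base_url=JOBSERVER_URL):
    """Build a POST /contexts/<name> request that creates a shared context."""
    return urllib.request.Request(
        url=f"{base_url}/contexts/{urllib.parse.quote(context_name)}",
        data=b"",
        method="POST",
    )


def run_job_request(app_name, class_path, context_name, job_config="",
                    base_url=JOBSERVER_URL):
    """Build a POST /jobs request that runs a job in an existing context.

    sync=true asks the server to reply with the job result instead of
    returning immediately with a job id.
    """
    query = urllib.parse.urlencode({
        "appName": app_name,
        "classPath": class_path,
        "context": context_name,
        "sync": "true",
    })
    return urllib.request.Request(
        url=f"{base_url}/jobs?{query}",
        data=job_config.encode("utf-8"),  # job config is sent as the body
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical names; the jar must already be uploaded under "my-app".
    ctx = create_context_request("shared-ctx")
    job = run_job_request("my-app", "com.example.WordCount", "shared-ctx",
                          job_config="input.string = a b c")
    for req in (ctx, job):
        print(req.get_method(), req.full_url)
    # With a running server: urllib.request.urlopen(ctx), then urlopen(job).
```

Because every job names the same context, consecutive requests reuse one
SparkContext instead of paying the startup cost each time.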

Neither supports high availability, but Livy has a few active open PRs for it.

I'd start with spark-jobserver as it's easier.

On Sat, Oct 1, 2016 at 8:46 AM, ABHISHEK <ab...@gmail.com> wrote:

> Thanks Vadim.
> I looked at Spark job server but I'm not sure about session management. Will
> the job run in the Hadoop cluster?
> How stable is this API? We will need to implement it in a production env.
> Livy looks more promising but is not mature yet.
> Have you tested either of them?
>
> Thanks,
> Abhishek
>
>
> On Fri, Sep 30, 2016 at 11:39 PM, Vadim Semenov <
> vadim.semenov@datadoghq.com> wrote:
>
>> There are two REST job servers that work with Spark:
>>
>> https://github.com/spark-jobserver/spark-jobserver
>>
>> https://github.com/cloudera/livy
>>
>>
>> On Fri, Sep 30, 2016 at 2:07 PM, ABHISHEK <ab...@gmail.com> wrote:
>>
>>> Hello all,
>>> Have you tried accessing a Spark application using RESTful web services?
>>>
>>> I have a requirement where a remote user submits a request with some data;
>>> it should be sent to Spark, and the job should run in Hadoop cluster mode.
>>> The output should be sent back to the user.
>>>
>>> Please share your expertise.
>>> Thanks,
>>> Abhishek
>>>
>>
>>
>