Posted to issues@spark.apache.org by "Trevor McKay (JIRA)" <ji...@apache.org> on 2014/09/23 17:59:34 UTC

[jira] [Commented] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

    [ https://issues.apache.org/jira/browse/SPARK-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144943#comment-14144943 ] 

Trevor McKay commented on SPARK-3644:
-------------------------------------

Anecdotal notes from a consumer :)  I recently added some simple Spark job functions to Sahara on OpenStack.  I needed "submit", "basic status", and "terminate".  A RESTful API would have been great!  I ended up using ssh to the Spark master with a Python launcher around spark-submit, plus pid files, stderr/stdout and os operations, to create the functions I wanted.  It works well, but ...

I think some textual representation of the data on the web UI would have met all the status needs.  I have simple states corresponding to "running", "completed successfully", "completed with error", or "killed" based on the pid and result from the launcher script.
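
A minimal sketch of how states like these can be derived from a pid and a launcher result. This is not the Sahara code: the per-job directory layout, the "pid" and "result" file names, and the rule that a negative exit code or a missing result with a dead pid means "killed" are all assumptions made for illustration.

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def job_status(workdir):
    """Map the launcher's files to one of the four simple states.

    Assumed (hypothetical) layout: workdir/pid holds the job's pid and
    workdir/result holds its exit code once the launcher has seen it finish.
    """
    result_path = os.path.join(workdir, "result")
    if os.path.exists(result_path):
        with open(result_path) as f:
            code = int(f.read().strip())
        if code == 0:
            return "completed successfully"
        # Assumption: a negative code means the job was ended by a signal.
        return "killed" if code < 0 else "completed with error"
    with open(os.path.join(workdir, "pid")) as f:
        pid = int(f.read().strip())
    # No result yet: either still running, or gone without reporting back.
    return "running" if pid_alive(pid) else "killed"
```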

An additional question imho is whether to make it read-only, or allow submit/cancel operations.  spark-submit is pretty easy to use over ssh, but a REST version of spark-submit might be a nice complement to a status API.

Cancellation was a bit harder, because I wanted the job to run asynchronously from ssh without an open connection but still be cancellable.  This meant that I had to deal with closing file descriptors, saving the pid, issuing kill, etc.  A cancel-by-id REST function would be great, too, if this work can go beyond read-only status.
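
The steps named above (detach from the ssh session, close inherited file descriptors, save the pid, issue kill) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Sahara launcher: the function names `launch_detached` and `cancel`, the per-job directory, and the "pid", "stdout" and "stderr" file names are hypothetical, and the code is POSIX-only.

```python
import os
import signal
import subprocess


def launch_detached(cmd, workdir):
    """Start cmd in its own session so it survives the ssh connection closing.

    Output goes to workdir/stdout and workdir/stderr, and the pid is saved
    to workdir/pid so a later call can cancel the job.
    """
    os.makedirs(workdir, exist_ok=True)
    with open(os.path.join(workdir, "stdout"), "wb") as out, \
            open(os.path.join(workdir, "stderr"), "wb") as err:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,  # no tie to the caller's terminal
            stdout=out,
            stderr=err,
            close_fds=True,            # do not leak the caller's descriptors
            start_new_session=True,    # setsid(): detach from the ssh session
        )
    with open(os.path.join(workdir, "pid"), "w") as f:
        f.write(str(proc.pid))
    return proc


def cancel(workdir, sig=signal.SIGTERM):
    """Send sig to the saved pid; return False if the process is already gone."""
    with open(os.path.join(workdir, "pid")) as f:
        pid = int(f.read().strip())
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False
    return True
```

In this sketch the command passed to `launch_detached` would be the spark-submit invocation, and a later ssh call would run `cancel` against the same directory. A cancel-by-id REST call would replace both the pid file and the second ssh round trip.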

> REST API for Spark application info (jobs / stages / tasks / storage info)
> --------------------------------------------------------------------------
>
>                 Key: SPARK-3644
>                 URL: https://issues.apache.org/jira/browse/SPARK-3644
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Web UI
>            Reporter: Josh Rosen
>
> This JIRA is a forum to draft a design proposal for a REST interface for accessing information about Spark applications, such as job / stage / task / storage status.
> There have been a number of proposals to serve JSON representations of the information displayed in Spark's web UI.  Given that we might redesign the pages of the web UI (and possibly re-implement the UI as a client of a REST API), the API endpoints and their responses should be independent of what we choose to display on particular web UI pages / layouts.
> Let's start a discussion of what a good REST API would look like from first principles.  We can discuss what URLs / endpoints expose access to data, how our JSON responses will be formatted, how fields will be named, how the API will be documented and tested, etc.
> Some links for inspiration:
> https://developer.github.com/v3/
> http://developer.netflix.com/docs/REST_API_Reference
> https://helloreverb.com/developers/swagger


