Posted to dev@sdap.apache.org by "Joseph Jacob (JIRA)" <ji...@apache.org> on 2019/01/16 00:13:00 UTC

[jira] [Updated] (SDAP-151) Determine parallelism automatically for Spark analytics

     [ https://issues.apache.org/jira/browse/SDAP-151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Jacob updated SDAP-151:
------------------------------
    Resolution: Fixed
        Status: Done  (was: In Progress)

* Removed the spark configuration, added an nparts configuration, and now compute parallelism automatically for the Spark-based time series, time-averaged map, correlation map, and climatological map analytics.
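
For illustration only, the idea of deriving parallelism from input size could be sketched as follows. This is not the committed SDAP code: the function name choose_nparts, the tile-count input, the tiles_per_partition ratio, and the max_parallelism cap are all hypothetical choices made for this example. An explicit nparts value, when given, simply overrides the computed one.

```python
import math
from typing import Optional


def choose_nparts(num_tiles: int,
                  tiles_per_partition: int = 16,    # hypothetical tuning knob
                  max_parallelism: int = 128,       # hypothetical upper bound
                  requested_nparts: Optional[int] = None) -> int:
    """Pick a partition count from the amount of input data.

    An explicit request wins; otherwise the count grows with the
    number of input tiles and is clamped to [1, max_parallelism].
    """
    if requested_nparts is not None:
        if requested_nparts < 1:
            raise ValueError("nparts must be a positive integer")
        return requested_nparts
    if num_tiles <= 0:
        # Nothing to process: a single partition is enough.
        return 1
    wanted = math.ceil(num_tiles / tiles_per_partition)
    return max(1, min(wanted, max_parallelism))


if __name__ == "__main__":
    for tiles in (10, 1000, 100000):
        print(tiles, "tiles ->", choose_nparts(tiles), "partitions")
```

The cap keeps very large requests from creating more partitions than a cluster could usefully schedule; the actual thresholds would need to be tuned per deployment.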

> Determine parallelism automatically for Spark analytics
> -------------------------------------------------------
>
>                 Key: SDAP-151
>                 URL: https://issues.apache.org/jira/browse/SDAP-151
>             Project: Apache Science Data Analytics Platform
>          Issue Type: Improvement
>            Reporter: Joseph Jacob
>            Assignee: Joseph Jacob
>            Priority: Major
>
> Some of the built-in NEXUS analytics, such as TimeSeries and TimeAvgMap, currently take the desired parallelism from a job request parameter such as "spark=mesos,16,32".  If that parameter is omitted, we default to "spark=local,1,1", which runs on a single core.  Instead, we would like to determine the appropriate level of parallelism automatically from the size of the job's input data.
>  
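
To make the request parameter described above concrete, here is a minimal sketch of how a value like "mesos,16,32" might be parsed, with the "local,1,1" fallback when it is omitted. The names SparkRequest and parse_spark_param are hypothetical, and reading the three fields as master, executor count, and partition count is an assumption; the issue text does not spell out their meaning.

```python
from typing import NamedTuple, Optional


class SparkRequest(NamedTuple):
    master: str      # assumed meaning of field 1, e.g. "mesos" or "local"
    executors: int   # assumed meaning of field 2
    partitions: int  # assumed meaning of field 3


def parse_spark_param(value: Optional[str]) -> SparkRequest:
    """Parse a value such as "mesos,16,32".

    Falls back to "local,1,1" when the parameter is omitted,
    which is the default described in the issue.
    """
    if not value:
        value = "local,1,1"
    fields = [field.strip() for field in value.split(",")]
    if len(fields) != 3:
        raise ValueError(f"expected 'master,int,int', got {value!r}")
    master, executors, partitions = fields
    return SparkRequest(master, int(executors), int(partitions))


if __name__ == "__main__":
    print(parse_spark_param("mesos,16,32"))
    print(parse_spark_param(None))
```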



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)