Posted to dev@sdap.apache.org by GitBox <gi...@apache.org> on 2018/11/05 02:26:20 UTC

[GitHub] jjacob7734 opened a new pull request #50: SDAP-151 Determine parallelism automatically for Spark analytics

URL: https://github.com/apache/incubator-sdap-nexus/pull/50
 
 
   The built-in NEXUS analytics timeSeriesSpark, timeAvgMapSpark, corrMapSpark, and climMapSpark previously took their desired parallelism from a job request parameter such as "spark=mesos,16,32".  If that parameter was omitted, the default was "spark=local,1,1", which runs on a single core.  With this change, the algorithms automatically determine an appropriate level of parallelism from the job's Spark cluster configuration.  The job parameter called "spark" is no longer supported.  A new optional job parameter called "nparts" can be used to explicitly set the number of data partitions (e.g., "nparts=16").
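
   For illustration, here is a minimal sketch of how the partition count
   might be chosen, assuming PySpark; the helper name compute_nparts is
   hypothetical, and this is not the actual NEXUS implementation:

       from pyspark import SparkContext

       def compute_nparts(sc: SparkContext, nparts=None):
           # Honor an explicit "nparts" job parameter when supplied.
           if nparts is not None:
               return int(nparts)
           # Otherwise fall back to the parallelism Spark itself reports
           # for the cluster (total cores for local/standalone masters).
           return sc.defaultParallelism

   Under this scheme a request no longer needs "spark=mesos,16,32": the
   analytic runs with cluster-derived parallelism by default, and an
   explicit override such as "nparts=16" is used only when needed.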

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services