Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2022/06/08 05:48:00 UTC
[jira] [Updated] (HUDI-3994) HoodieDeltaStreamer - Spark master shouldn't have a default
[ https://issues.apache.org/jira/browse/HUDI-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-3994:
----------------------------
Fix Version/s: 0.12.0
(was: 0.11.1)
> HoodieDeltaStreamer - Spark master shouldn't have a default
> -----------------------------------------------------------
>
> Key: HUDI-3994
> URL: https://issues.apache.org/jira/browse/HUDI-3994
> Project: Apache Hudi
> Issue Type: Improvement
> Components: deltastreamer, spark
> Reporter: Angel Conde
> Priority: Minor
> Labels: easyfix, pull-request-available
> Fix For: 0.12.0
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> When trying to run HoodieDeltaStreamer on AWS Glue, I found that the Spark master cannot be inherited from the environment because it defaults to {{local[2]}}. In this kind of serverless environment, where you do not have access to the master, this configuration should be inherited.
> This can be seen at line 329 of [HoodieDeltaStreamer|https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java].
> {{public String sparkMaster = "local[2]";}}
> This should be changed to support this kind of scenario: there should be a way to build the JavaSparkContext without any Spark master being defined.
> *Expected behavior*
> The Spark master shouldn't have a default as there are some environments (usually serverless such as AWS Glue) where it will be inherited.
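
The expected behavior described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Hudi change: the class SparkMasterResolver and its methods resolveMaster and buildConf are hypothetical names, and a plain Map stands in for SparkConf so the snippet runs without Spark on the classpath. The idea is that spark.master is set only when the user passes one explicitly; otherwise it is left unset so the runtime (for example AWS Glue) can supply it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: shows how the master could be left
// unset unless the user explicitly asks for one.
final class SparkMasterResolver {

  private SparkMasterResolver() {}

  // Returns the explicitly requested master, or empty when none was given.
  static Optional<String> resolveMaster(String cliMaster) {
    if (cliMaster == null || cliMaster.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(cliMaster.trim());
  }

  // Builds the properties that would be copied onto a SparkConf.
  // A Map is used here as a stand-in (assumption) to keep the sketch self-contained.
  static Map<String, String> buildConf(String appName, String cliMaster) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("spark.app.name", appName);
    // Set spark.master only when one was requested; otherwise leave it
    // absent so the environment's value can be inherited.
    resolveMaster(cliMaster).ifPresent(m -> conf.put("spark.master", m));
    return conf;
  }

  public static void main(String[] args) {
    // No master given: spark.master is not set.
    System.out.println(buildConf("delta-streamer", null));
    // Master given explicitly: spark.master is set to that value.
    System.out.println(buildConf("delta-streamer", "local[2]"));
  }
}
```

With a real SparkConf, the same check would guard the call to setMaster, so users who still want local[2] pass it explicitly instead of getting it as a silent default.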
--
This message was sent by Atlassian Jira
(v8.20.7#820007)