Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2022/09/13 15:46:00 UTC
[jira] [Closed] (HUDI-3994) HoodieDeltaStreamer - Spark master shouldn't have a default
[ https://issues.apache.org/jira/browse/HUDI-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Sumit closed HUDI-3994.
-----------------------------
Resolution: Fixed
> HoodieDeltaStreamer - Spark master shouldn't have a default
> -----------------------------------------------------------
>
> Key: HUDI-3994
> URL: https://issues.apache.org/jira/browse/HUDI-3994
> Project: Apache Hudi
> Issue Type: Improvement
> Components: deltastreamer, spark
> Reporter: Angel Conde
> Assignee: Angel Conde
> Priority: Critical
> Labels: easyfix, pull-request-available
> Fix For: 0.12.1
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> When trying to run HoodieDeltaStreamer on AWS Glue, I found that the Spark master cannot be inherited from the environment because it defaults to {{local[2]}}. In this kind of serverless environment, where you do not have access to the master, this configuration should be inherited.
> This can be seen on line 329 of [HoodieDeltaStreamer|https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java].
> {{public String sparkMaster = "local[2]";}}
> This should be changed to support this kind of scenario: there should be a JavaSparkContext option where no Spark master is defined.
> *Expected behavior*
> The Spark master shouldn't have a default as there are some environments (usually serverless such as AWS Glue) where it will be inherited.
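The expected behavior above could be sketched roughly as follows. This is only an illustration, not the change that was merged for HUDI-3994: the class SparkMasterResolver and its resolve method are hypothetical names, and the sketch deliberately avoids any Spark dependency so it stands alone. It assumes the environment, when it supplies a master at all, exposes it under the spark.master property (Spark's SparkConf reads spark.* Java system properties by default). An empty result means the caller would not set a master itself and would let the environment decide.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: decides which Spark master, if any,
// should be used, without ever falling back to a hard-coded "local[2]".
final class SparkMasterResolver {
    private SparkMasterResolver() {}

    /**
     * Precedence: an explicit command-line value wins, then a "spark.master"
     * entry in the supplied properties. If neither is present the result is
     * empty, meaning the caller should not set a master at all.
     */
    static Optional<String> resolve(String cliMaster, Map<String, String> props) {
        if (cliMaster != null && !cliMaster.trim().isEmpty()) {
            return Optional.of(cliMaster.trim());
        }
        String inherited = props.get("spark.master");
        if (inherited != null && !inherited.trim().isEmpty()) {
            return Optional.of(inherited.trim());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Explicit value given on the command line.
        System.out.println(resolve("yarn", Map.of()));
        // No explicit value; the master comes from the supplied properties.
        System.out.println(resolve(null, Map.of("spark.master", "local[4]")));
        // Nothing configured anywhere: no default is invented.
        System.out.println(resolve(null, Map.of()));
    }
}
```

With a helper like this, the sparkMaster field could default to null or an empty string instead of "local[2]", and the code that builds the JavaSparkContext would set a master only when resolve returns a value.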