Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2022/09/13 15:46:00 UTC

[jira] [Closed] (HUDI-3994) HoodieDeltaStreamer - Spark master shouldn't have a default

     [ https://issues.apache.org/jira/browse/HUDI-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit closed HUDI-3994.
-----------------------------
    Resolution: Fixed

> HoodieDeltaStreamer - Spark master shouldn't have a default
> -----------------------------------------------------------
>
>                 Key: HUDI-3994
>                 URL: https://issues.apache.org/jira/browse/HUDI-3994
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: deltastreamer, spark
>            Reporter: Angel Conde 
>            Assignee: Angel Conde 
>            Priority: Critical
>              Labels: easyfix, pull-request-available
>             Fix For: 0.12.1
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When trying to run HoodieDeltaStreamer on AWS Glue, I found that the Spark master cannot be inherited from the environment because it defaults to {{local[2]}}. In serverless environments of this kind, where you do not have access to the master, this configuration should be inherited.
> This can be seen at line 329 of [HoodieDeltaStreamer|https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java]:
> {{public String sparkMaster = "local[2]";}}
> This should be changed to support these scenarios: there should be a way to build the JavaSparkContext with no Spark master defined.
> *Expected behavior*
> The Spark master shouldn't have a default, as there are some environments (usually serverless ones, such as AWS Glue) where it will be inherited.
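
For illustration only, the expected behavior could be sketched roughly as follows. This is not the actual fix merged for this issue; the class SparkMasterResolver and its methods are hypothetical names, and the only thing taken from Spark itself is the spark.master property key. The idea is that the master is applied as an override only when the user supplies one, so that an unset value leaves whatever the environment provides untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: decides whether a Spark master
// should be set explicitly or left for the environment to provide.
public class SparkMasterResolver {

    // Returns the master to set, or empty when none was supplied
    // (null or blank), meaning "inherit from the environment".
    static Optional<String> resolveMaster(String cliMaster) {
        if (cliMaster == null || cliMaster.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(cliMaster.trim());
    }

    // Builds the overrides that would be copied onto a SparkConf.
    // "spark.master" is only present when a master was supplied.
    static Map<String, String> buildOverrides(String cliMaster) {
        Map<String, String> overrides = new HashMap<>();
        resolveMaster(cliMaster).ifPresent(m -> overrides.put("spark.master", m));
        return overrides;
    }

    public static void main(String[] args) {
        // No master given: nothing is overridden.
        System.out.println(buildOverrides(null));
        // Master given explicitly: it is passed through.
        System.out.println(buildOverrides("local[2]"));
    }
}
```

With a field declared without a default (for example {{public String sparkMaster = null;}}, again an assumption about how the option might look), a local run would pass the master explicitly, while a Glue job would omit it.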



--
This message was sent by Atlassian Jira
(v8.20.10#820010)