Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/13 04:55:43 UTC

[GitHub] [hudi] yihua commented on a diff in pull request #5630: [HUDI-3994] - Added support for initializing DeltaStreamer without a …

yihua commented on code in PR #5630:
URL: https://github.com/apache/hudi/pull/5630#discussion_r969149067


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java:
##########
@@ -285,6 +285,24 @@ private static SparkConf buildSparkConf(String appName, String defaultMaster, Ma
     return SparkRDDWriteClient.registerClasses(sparkConf);
   }
 
+  private static SparkConf buildSparkConf(String appName, Map<String, String> additionalConfigs) {
+    final SparkConf sparkConf = new SparkConf().setAppName(appName);
+    sparkConf.set("spark.ui.port", "8090");
+    sparkConf.setIfMissing("spark.driver.maxResultSize", "2g");
+    sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
+    sparkConf.set("spark.hadoop.mapred.output.compress", "true");
+    sparkConf.set("spark.hadoop.mapred.output.compression.codec", "true");
+    sparkConf.set("spark.hadoop.mapred.output.compression.codec", "org.apache.hadoop.io.compress.GzipCodec");
+    sparkConf.set("spark.hadoop.mapred.output.compression.type", "BLOCK");

Review Comment:
   @Neuw84 are these settings specifically for AWS Glue?  Could we also update the Hudi website / docs to explain how DeltaStreamer can be used on serverless platforms?


