Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/05 04:05:33 UTC

[GitHub] [hudi] Ambarish-Giri edited a comment on issue #3395: [SUPPORT] Issues with read optimized query MOR table

Ambarish-Giri edited a comment on issue #3395:
URL: https://github.com/apache/hudi/issues/3395#issuecomment-893140263


   Sure @nsivabalan, eventually our test and prod environments will be EMR only. But before doing actual testing and deriving benchmarking metrics, as I said earlier, I am just evaluating Hudi in my local setup to explore all its features.
    
   But for now, below are the libraries I am using:
   scalaVersion := "2.12.11" 
   libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.7"
   libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.7"
   libraryDependencies += "org.apache.hudi" %% "hudi-spark-bundle" % "0.7.0"
   libraryDependencies += "org.apache.hudi" %% "hudi-utilities-bundle" % "0.7.0"
   libraryDependencies += "org.apache.spark" %% "spark-avro" % "2.4.7"
   
   
   And while creating the SparkSession object, below are the Spark config settings:
   val spark: SparkSession = SparkSession.builder()
         .appName("hudi-datalake")
         .master("local[*]")
         // Kryo serialization is required by Hudi
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .config("spark.shuffle.compress", "true")
         .config("spark.shuffle.spill.compress", "true")
         .config("spark.executor.extraJavaOptions", "-XX:+UseG1GC")
         // let Hudi's input format handle Hive-registered tables instead of Spark's built-in Parquet reader
         .config("spark.sql.hive.convertMetastoreParquet", "false")
         .getOrCreate()
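
   For context, a read-optimized query against the MOR table via the Hudi 0.7.0 DataSource API might look like the sketch below (not from the original post). The base path and partition glob are hypothetical placeholders; `hoodie.datasource.query.type` is the 0.7.0 query-type key.

   ```scala
   // Sketch only: read-optimized query on a MOR table (assumes the
   // SparkSession above and a Hudi table at the hypothetical path below).
   val roDF = spark.read
     .format("hudi")
     // "read_optimized" reads only the compacted base files of the MOR
     // table; the default "snapshot" would also merge pending log files.
     .option("hoodie.datasource.query.type", "read_optimized")
     // in 0.7.0 the load path typically needs a glob covering partitions
     .load("/tmp/hudi/my_table/*/*")
   roDF.show()
   ```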


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org