Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2021/11/29 01:24:00 UTC

[jira] [Assigned] (HUDI-2880) Error reading in properties from dfs in Spark Shell

     [ https://issues.apache.org/jira/browse/HUDI-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo reassigned HUDI-2880:
-------------------------------

    Assignee: Wenning Ding

> Error reading in properties from dfs in Spark Shell
> ---------------------------------------------------
>
>                 Key: HUDI-2880
>                 URL: https://issues.apache.org/jira/browse/HUDI-2880
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Assignee: Wenning Ding
>            Priority: Major
>
> I encountered the following warnings and an error when using the Spark datasource in the Spark shell; the write succeeded nonetheless:
> {code:java}
> scala> val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
> warning: there was one deprecation warning (since 2.12.0)
> warning: there was one deprecation warning (since 2.2.0)
> warning: there were two deprecation warnings in total; for details, enable `:setting -deprecation' or `:replay -deprecation'
> df: org.apache.spark.sql.DataFrame = [begin_lat: double, begin_lon: double ... 8 more fields]
> scala> 
> scala> df.write.format("hudi").
>      |   option("hoodie.insert.shuffle.parallelism", "2").
>      |   option("hoodie.upsert.shuffle.parallelism", "2").
>      |   option("hoodie.bulkinsert.shuffle.parallelism", "2").
>      |   option("hoodie.delete.shuffle.parallelism", "2").
>      |   option(PRECOMBINE_FIELD_OPT_KEY, "ts").
>      |   option(RECORDKEY_FIELD_OPT_KEY, "uuid").
>      |   option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
>      |   option(TABLE_NAME, tableName).
>      |   option("hoodie.parquet.small.file.limit", "0").
>      |   option("hoodie.clustering.inline", "true").
>      |   option("hoodie.clustering.inline.max.commits", "2").
>      |   option("hoodie.clustering.plan.strategy.target.file.max.bytes", "1073741824").
>      |   option("hoodie.clustering.plan.strategy.small.file.limit", "629145600").
>      |   option("hoodie.clustering.plan.strategy.sort.columns", "rider,driver").
>      |   option("hoodie.layout.optimize.enable", "true").
>      |   mode(Append).
>      |   save(basePath)
> warning: there was one deprecation warning; for details, enable `:setting -deprecation' or `:replay -deprecation'
> 21/11/26 20:53:41 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
> 21/11/26 20:53:41 ERROR DFSPropertiesConfiguration: Error reading in properties from dfs
> 21/11/26 20:53:41 WARN DFSPropertiesConfiguration: Didn't find config file under default conf file dir: file:/etc/hudi/conf
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)