Posted to users@hudi.apache.org by "An, Hongguo (CORP)" <Ho...@ADP.com.INVALID> on 2022/02/18 17:45:52 UTC
Re: Help: Got "scala.None$ is not a valid external type for schema of string" upgrading to Hudi 0.10.1
Correcting the typo in the subject line.
From: "An, Hongguo (CORP)" <Ho...@ADP.com.INVALID>
Reply-To: "users@hudi.apache.org" <us...@hudi.apache.org>
Date: Friday, February 18, 2022 at 9:43 AM
To: "users@hudi.apache.org" <us...@hudi.apache.org>
Cc: "dev@hudi.apache.org" <de...@hudi.apache.org>
Subject: Help: Got "scala.None$ is not a valid external type for schema of string" upgrading to Audi 0.10.1
Greetings,
I have an app that works fine with Hudi 0.6.0. Now I need to upgrade it so that I can run Spark 3.1.2.
I have the following dependencies:
<properties>
    <java.version>8</java.version>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <spark.version>3.1.2</spark.version>
    <scala.version>2.12.11</scala.version>
    <scala.compat.version>2.12</scala.compat.version>
    <elastic-search.version>7.13.4</elastic-search.version>
    <jackson.version>2.10.0</jackson.version>
    <kafka.version>2.8.0</kafka.version>
</properties>
<dependency>
    <groupId>org.apache.hudi</groupId>
    <artifactId>hudi-spark${spark.version}-bundle_${scala.compat.version}</artifactId>
    <version>0.10.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-avro_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>
The method that writes to Hudi is the same as before:
def writeHudis(df: DataFrame,
               databaseName: String, tableName: String, pk: String,
               path: String,
               timestampKey: String, partitionKey: String = null, hiveStyle: Boolean = false): Unit = {
  val env = setup.value.env
  val db = ensureDatabase(databaseName)
  val hudiOptions = Map[String, String](
    HoodieWriteConfig.TBL_NAME.key() -> tableName,
    DataSourceWriteOptions.RECORDKEY_FIELD.key() -> pk,
    DataSourceWriteOptions.PRECOMBINE_FIELD.key() -> timestampKey,
    DataSourceWriteOptions.HIVE_SYNC_ENABLED.key() -> (env != Setup.LOCAL).toString,
    DataSourceWriteOptions.HIVE_TABLE.key() -> tableName,
    DataSourceWriteOptions.HIVE_DATABASE.key() -> db,
    DataSourceWriteOptions.HIVE_URL.key() -> s"jdbc:hive2://${emsMaster(env)}:10000/;ssl=true"
  )
  // Write a DataFrame as a Hudi dataset
  val write = df.write
    .format("org.apache.hudi")
    .options(hudiOptions)
  if (partitionKey == null) {
    write
      .option(DataSourceWriteOptions.KEYGENERATOR_CLASS_NAME.key(), "org.apache.hudi.keygen.NonpartitionedKeyGenerator")
      .option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS.key(), "org.apache.hudi.hive.NonPartitionedExtractor")
      .option(DataSourceWriteOptions.OPERATION.key(), DataSourceWriteOptions.INSERT_OPERATION_OPT_VAL)
      .option("hoodie.datasource.write.hive_style_partitioning", hiveStyle)
      .mode(SaveMode.Overwrite)
      .save(path)
  }
  else {
    write
      .option(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME.key(), partitionKey)
      .option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS.key(), partitionKey)
      .mode(SaveMode.Append)
      .save(path)
    spark.sql(s"MSCK REPAIR TABLE $db.$tableName")
  }
}
But now I am getting the following exception:
java.lang.RuntimeException: scala.None$ is not a valid external type for schema of string
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.StaticInvoke_4$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_4$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:209)
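Read literally, the message says that a row reaching the encoder holds a Scala None in a column whose schema type is string. Where that None comes from in this job is not known. As a minimal sketch only, assuming the rows are built by hand from values that may be wrapped in Option, a helper like the one below could unwrap them first. OptionUnwrap, unwrapOption and unwrapAll are hypothetical names, not part of Spark or Hudi.

```scala
// Hypothetical helper (not a Spark or Hudi API): replaces Option wrappers
// with the plain values expected for nullable columns in a row.
object OptionUnwrap {
  // Unwrap a single value, handling nested Options as well.
  def unwrapOption(value: Any): Any = value match {
    case Some(inner) => unwrapOption(inner) // Some("a") becomes "a"
    case None        => null                // None becomes null
    case other       => other               // plain values pass through
  }

  // Unwrap every value in a sequence of column values.
  def unwrapAll(values: Seq[Any]): Seq[Any] = values.map(unwrapOption)
}
```

Under that assumption, the unwrapped sequence could then be passed to Row.fromSeq in place of the original values.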
Please help. Thanks,
Andrew