Posted to dev@sedona.apache.org by "Brian Rice (Jira)" <ji...@apache.org> on 2022/07/14 01:15:00 UTC
[jira] [Created] (SEDONA-133) Allow user-defined schemas in Adapter.toDf()
Brian Rice created SEDONA-133:
---------------------------------
Summary: Allow user-defined schemas in Adapter.toDf()
Key: SEDONA-133
URL: https://issues.apache.org/jira/browse/SEDONA-133
Project: Apache Sedona
Issue Type: Improvement
Reporter: Brian Rice
Hello!
I would like to propose a new overloaded method that supports user-defined schemas in {{Adapter.toDf()}} (for both SpatialRDD and JavaPairRDD). Currently, fields are coerced to StringType, which does not work for all use cases (specifically, I have structs that lose all their nested columns when cast to StringType). I can work around this, but it would be nice to have it available off the shelf. Some sample code from Adapter.scala:
{{cols = cols ++ fieldNames.map(f => StructField(f, {+}StringType{+}))}}
{{...}}
{{cols = cols ++ leftFieldnames.map(fName => StructField(fName, {+}StringType{+}))}}
{{cols = cols ++ rightFieldNames.map(fName => StructField(fName, {+}StringType{+}))}}
My thinking is that the user could provide the schema directly as a StructType object. The expectation would be that anyone who chooses to provide a schema is responsible for supplying the correct field names and data types.
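To make the idea concrete, here is a rough sketch of the schema selection only. It is not Sedona code: {{SchemaSketch}}, {{defaultSchema}}, {{resolveSchema}} and {{userSchema}} are hypothetical names, the geometry column is left out for brevity, and the only real API used is Spark's {{org.apache.spark.sql.types}}.

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical helper, not part of Sedona: illustrates the proposed schema choice only.
object SchemaSketch {

  // Mirrors the snippet quoted from Adapter.scala: every attribute field becomes a StringType column.
  def defaultSchema(fieldNames: Seq[String]): StructType =
    StructType(fieldNames.map(f => StructField(f, StringType)))

  // Proposed behaviour (assumption): use the caller's StructType as-is when one is given,
  // otherwise fall back to the all-StringType default.
  def resolveSchema(fieldNames: Seq[String], userSchema: Option[StructType] = None): StructType =
    userSchema.getOrElse(defaultSchema(fieldNames))

  // Example user-defined schema with a nested struct, the case that StringType cannot represent.
  val userSchema: StructType = StructType(Seq(
    StructField("id", IntegerType),
    StructField("address", StructType(Seq(
      StructField("city", StringType),
      StructField("zip", StringType)
    )))
  ))
}
```

With something like this, an overload of {{toDf()}} could accept the StructType and pass it through unchanged, while the existing overloads keep the StringType default. No validation is done here, in line with the expectation that the caller supplies a correct schema.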
I would be happy to work on a PR if it's deemed appropriate. What are your thoughts?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)