Posted to dev@sedona.apache.org by "Jia Yu (Jira)" <ji...@apache.org> on 2022/07/15 07:44:00 UTC
[jira] [Commented] (SEDONA-133) Allow user-defined schemas in Adapter.toDf()
[ https://issues.apache.org/jira/browse/SEDONA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567137#comment-17567137 ]
Jia Yu commented on SEDONA-133:
-------------------------------
This makes perfect sense to me. Can you prepare a PR?
> Allow user-defined schemas in Adapter.toDf()
> --------------------------------------------
>
> Key: SEDONA-133
> URL: https://issues.apache.org/jira/browse/SEDONA-133
> Project: Apache Sedona
> Issue Type: Improvement
> Reporter: Brian Rice
> Priority: Normal
>
> Hello!
> I would like to propose a new overloaded method that supports user-defined schemas in {{Adapter.toDf()}} (for both SpatialRDD and JavaPairRDD). Currently, fields are coerced to StringType, which does not work for all use cases (specifically, I have structs that lose all their nested columns if cast to StringType). I can work around this, but it would be nice to have it off the shelf. Some sample code from Adapter.scala:
> {{cols = cols ++ fieldNames.map(f => StructField(f, StringType))}}
>
> {{...}}
>
> {{cols = cols ++ leftFieldnames.map(fName => StructField(fName, StringType))}}
> {{cols = cols ++ rightFieldNames.map(fName => StructField(fName, StringType))}}
>
> My thinking is that the user could provide the schema directly in the form of a StructType object. The expectation would be that they are responsible enough to provide the correct field names and data types if they want to provide the schema at all.
>
> I would be happy to work on a PR if it's deemed appropriate. What are your thoughts?
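
For illustration, the idea in the quoted description (the caller passes a StructType instead of having every field coerced to StringType) can be sketched with plain Spark calls. This is only a sketch under stated assumptions: {{toDfWithSchema}} is a hypothetical helper, not an existing Sedona method, the field names and sample values are made up, and the geometry column that {{Adapter.toDf()}} produces is left out so the example depends on Spark alone.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object UserSchemaSketch {
  // Hypothetical helper (not part of Sedona): builds a DataFrame from rows
  // using the schema supplied by the caller, with no coercion to StringType.
  def toDfWithSchema(spark: SparkSession, rows: Seq[Row], schema: StructType): DataFrame =
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("user-schema-sketch")
      .getOrCreate()

    // A nested struct column; these field names are assumptions for the example.
    val addressType = StructType(Seq(
      StructField("city", StringType),
      StructField("zip", StringType)))

    // The caller is responsible for matching field names and types to the data.
    val schema = StructType(Seq(
      StructField("id", LongType),
      StructField("address", addressType)))

    val rows = Seq(Row(1L, Row("Tempe", "85281")))
    val df = toDfWithSchema(spark, rows, schema)

    // The nested column stays addressable because the struct type is kept.
    df.select("address.city").show()

    spark.stop()
  }
}
```

With a schema built entirely from StringType fields, a selection such as {{address.city}} would not be possible, which is the limitation the issue describes. As the description notes, a caller-supplied schema shifts responsibility for correct names and types onto the caller.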