Posted to dev@sedona.apache.org by "Jia Yu (Jira)" <ji...@apache.org> on 2022/07/15 07:44:00 UTC

[jira] [Commented] (SEDONA-133) Allow user-defined schemas in Adapter.toDf()

    [ https://issues.apache.org/jira/browse/SEDONA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567137#comment-17567137 ] 

Jia Yu commented on SEDONA-133:
-------------------------------

This makes perfect sense to me. Can you prepare a PR?

> Allow user-defined schemas in Adapter.toDf()
> --------------------------------------------
>
>                 Key: SEDONA-133
>                 URL: https://issues.apache.org/jira/browse/SEDONA-133
>             Project: Apache Sedona
>          Issue Type: Improvement
>            Reporter: Brian Rice
>            Priority: Normal
>
> Hello!
> I would like to propose a new overloaded method that supports user-defined schemas in Adapter.toDf() (for both SpatialRDD and JavaPairRDD). Currently, all fields are coerced to StringType, which does not work for every use case (specifically, I have structs that lose all their nested columns if cast to StringType). I can work around this, but it would be nice to have it off the shelf. Some sample code from Adapter.scala:
>     cols = cols ++ fieldNames.map(f => StructField(f, StringType))
>     ...
>     cols = cols ++ leftFieldnames.map(fName => StructField(fName, StringType))
>     cols = cols ++ rightFieldNames.map(fName => StructField(fName, StringType))
>  
> My thinking is that the user could provide the schema directly as a StructType object. If they choose to provide a schema at all, they would be responsible for supplying the correct field names and data types.
>  
> I would be happy to work on a PR if it's deemed appropriate. What are your thoughts?
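
For illustration, the idea quoted above could be sketched roughly as follows. This is not Sedona's implementation: the object UserSchemaSketch and the helpers stringSchema, resolveSchema and toDfWithSchema are hypothetical names, the geometry column that Adapter.toDf() handles is left out, and only standard Spark SQL calls (StructType, StructField, SparkSession.createDataFrame) are assumed.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical sketch, not Sedona code. The geometry column is omitted.
object UserSchemaSketch {

  // Mirrors the behaviour described in the issue: every attribute
  // column is declared as StringType.
  def stringSchema(fieldNames: Seq[String]): StructType =
    StructType(fieldNames.map(f => StructField(f, StringType)))

  // Assumed shape of the proposal: use the caller's schema when one is
  // given, otherwise fall back to the all-StringType schema.
  def resolveSchema(fieldNames: Seq[String], userSchema: Option[StructType]): StructType =
    userSchema.getOrElse(stringSchema(fieldNames))

  // Builds a DataFrame from rows using whichever schema was resolved.
  // The caller is responsible for making the rows match the schema.
  def toDfWithSchema(spark: SparkSession, rows: Seq[Row], schema: StructType): DataFrame =
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("user-schema-sketch")
      .getOrCreate()

    // A user-defined schema with a nested struct column.
    val addressType = StructType(Seq(
      StructField("city", StringType),
      StructField("zip", IntegerType)
    ))
    val userSchema = StructType(Seq(
      StructField("name", StringType),
      StructField("address", addressType)
    ))

    val rows = Seq(Row("a", Row("Tempe", 85281)))
    val schema = resolveSchema(Seq("name", "address"), Some(userSchema))
    val df = toDfWithSchema(spark, rows, schema)

    // The nested field stays addressable because the struct type was kept.
    df.select("address.city").show()

    spark.stop()
  }
}
```

With a schema like this, nested fields such as address.city remain selectable, which is the case the reporter says is lost when the column is declared as StringType. Falling back to the all-StringType schema when no schema is passed would keep existing callers unaffected.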



--
This message was sent by Atlassian Jira
(v8.20.10#820010)