Posted to issues@spark.apache.org by "Jayant Kumar (Jira)" <ji...@apache.org> on 2022/08/30 12:59:00 UTC

[jira] [Created] (SPARK-40277) Use DataFrame's column for referring to DDL schema for from_csv() and from_json()

Jayant Kumar created SPARK-40277:
------------------------------------

             Summary: Use DataFrame's column for referring to DDL schema for from_csv() and from_json()
                 Key: SPARK-40277
                 URL: https://issues.apache.org/jira/browse/SPARK-40277
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Jayant Kumar


With Spark's DataFrame API, one has to explicitly pass the StructType to functions like from_csv and from_json. This works fine when the schema is the same for every row.

In certain circumstances, when the schema depends on one of the DataFrame's fields, it gets complicated and one has to switch to the RDD API. This requires adding extra libraries and custom parsing logic.
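
To illustrate the kind of custom per-row parsing logic this workaround involves, here is a minimal pure-Python sketch. It is not Spark code and not an existing Spark API: the helper names (parse_ddl, parse_row), the type table, and the sample rows are all hypothetical, and it only handles flat schemas with a few simple types (no nested types, no types whose definition contains a comma).

```python
import csv
import io

# Hypothetical mapping from a few simple DDL type names to Python converters.
CASTS = {
    "INT": int,
    "BIGINT": int,
    "DOUBLE": float,
    "STRING": str,
    "BOOLEAN": lambda s: s.strip().lower() == "true",
}


def parse_ddl(ddl):
    """Split a flat DDL string such as 'id INT, name STRING' into (name, type) pairs."""
    fields = []
    for part in ddl.split(","):
        name, type_name = part.split()
        fields.append((name, type_name.upper()))
    return fields


def parse_row(ddl, payload):
    """Parse one CSV payload using the schema carried in the same row."""
    fields = parse_ddl(ddl)
    values = next(csv.reader(io.StringIO(payload)))
    if len(values) != len(fields):
        raise ValueError("payload does not match schema: %r vs %r" % (payload, ddl))
    return {name: CASTS[type_name](value)
            for (name, type_name), value in zip(fields, values)}


# Each row carries its own schema next to the raw CSV payload (made-up sample data).
rows = [
    ("id INT, name STRING", "1,alice"),
    ("price DOUBLE, in_stock BOOLEAN", "9.5,true"),
]
parsed = [parse_row(ddl, payload) for ddl, payload in rows]
```

With the RDD workaround, a function like parse_row would typically be applied per record, for example through df.rdd.map, where the column names holding the schema and the payload are assumptions of this sketch. The point is that the schema is only known once the row is read, which is what a fixed StructType argument cannot express.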

I would like to explore a way to support such use cases directly in the DataFrame API, by letting from_csv() and from_json() refer to a DataFrame column for the DDL schema.


