Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/12/10 17:03:10 UTC

[jira] [Commented] (SPARK-12264) Could DataType provide a TypeTag?

    [ https://issues.apache.org/jira/browse/SPARK-12264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051150#comment-15051150 ] 

Sean Owen commented on SPARK-12264:
-----------------------------------

Maybe this should be a question on user@, or else rephrased as a concrete proposed change.

The types that can turn up in a DataFrame are pretty basic, like strings and integers. Is a hand-written mapping for them really hard to manage? Of course, user-defined types would require some kind of manual work anyway.

Can you interrogate the class of the columns in the first row to get the ClassTag?
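
A rough sketch of that idea, not a definitive implementation. The helper name classTagsOf is made up for illustration and is not part of Spark; a plain Seq[Any] stands in for the row contents (in Spark you would presumably pass something like df.first().toSeq). Note that this only sees boxed runtime classes, and a null in the first row tells you nothing about the column's type.

```scala
import scala.reflect.ClassTag

// Hypothetical helper (not part of Spark): derive a ClassTag for each value
// from its runtime class. Values arrive boxed, so an Int shows up as
// java.lang.Integer; a null carries no type information and yields None.
def classTagsOf(values: Seq[Any]): Seq[Option[ClassTag[_]]] =
  values.map(v => Option(v).map(x => ClassTag[Any](x.getClass)))

// Stand-in for the contents of a first row, e.g. df.first().toSeq in Spark.
val firstRow: Seq[Any] = Seq("alice", 42, null)
val tags = classTagsOf(firstRow)
```

This yields ClassTags rather than TypeTags, so generic type arguments (the element type of an array or map column, say) are erased and would still need separate handling.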

> Could DataType provide a TypeTag?
> ---------------------------------
>
>                 Key: SPARK-12264
>                 URL: https://issues.apache.org/jira/browse/SPARK-12264
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Andras Nemeth
>            Priority: Minor
>
> We are writing code that takes generic DataFrames as input and further processes their contents with normal RDD operations (not SQL). We need some mechanism that tells us exactly which Scala types we will find inside a Row of a given DataFrame.
> The schema of the DataFrame contains this information in an abstract sense. But we need to map it to TypeTags, as that's what the rest of the system uses to identify which RDD contains which type of data - quite the natural choice in Scala.
> As far as I can tell, there is no good way to do this today. For now we have a hand-coded mapping, but that feels very fragile as Spark evolves. Is there a better way I'm missing? And if not, could we create one? Adding a typeTag or scalaTypeTag method to DataType, or at least to AtomicType, seems easy enough.
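
For illustration only, a hand-coded mapping of the kind described in the issue might look roughly like the sketch below. The function name scalaTypeTag is hypothetical (it echoes the method name proposed in the issue, but no such method is claimed to exist in Spark), the list of cases is deliberately incomplete, and the code assumes spark-sql and scala-reflect are on the classpath.

```scala
import org.apache.spark.sql.types._
import scala.reflect.runtime.universe._

// Hypothetical mapping (not a Spark API): the Scala/Java type a Row is
// expected to hold for each atomic DataType. Unhandled types yield None.
def scalaTypeTag(dt: DataType): Option[TypeTag[_]] = dt match {
  case StringType     => Some(typeTag[String])
  case BooleanType    => Some(typeTag[Boolean])
  case ByteType       => Some(typeTag[Byte])
  case ShortType      => Some(typeTag[Short])
  case IntegerType    => Some(typeTag[Int])
  case LongType       => Some(typeTag[Long])
  case FloatType      => Some(typeTag[Float])
  case DoubleType     => Some(typeTag[Double])
  case BinaryType     => Some(typeTag[Array[Byte]])
  case DateType       => Some(typeTag[java.sql.Date])
  case TimestampType  => Some(typeTag[java.sql.Timestamp])
  case _: DecimalType => Some(typeTag[java.math.BigDecimal])
  case _              => None // arrays, maps, structs, UDTs: manual work
}
```

The fragility mentioned in the issue shows up in the final case: any type added to Spark later falls through to None until the table is updated by hand.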


