Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/01/04 06:49:00 UTC
[jira] [Commented] (SPARK-22771) SQL concat for binary
[ https://issues.apache.org/jira/browse/SPARK-22771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310847#comment-16310847 ]
Apache Spark commented on SPARK-22771:
--------------------------------------
User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/20149
> SQL concat for binary
> ----------------------
>
> Key: SPARK-22771
> URL: https://issues.apache.org/jira/browse/SPARK-22771
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.1
> Reporter: Fernando Pereira
> Assignee: Takeshi Yamamuro
> Priority: Minor
> Fix For: 2.3.0
>
>
> The Spark SQL {{concat}} function automatically casts its arguments to StringType and returns a string.
> This may match the behavior of traditional databases, but Spark has Binary as a standard type, and concatenating binary values seems reasonable as long as the result is another binary sequence.
> Take Python 2 as an example, where both {{str}} (a byte string) and {{unicode}} represent text: concatenating two values of the same type yields that type, and when the types are intermixed (str + unicode) the more general type (unicode) is returned.
> Following the same principle, I believe that when concat'ing binary it would make sense to return a binary.
> In terms of Spark behavior, it would affect only the case when all arguments are binary. All other cases should remain unchanged.
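
The rule proposed in the description can be modelled in a few lines of plain Python. This is only a sketch of the intended typing behavior, not Spark's implementation: the function name `concat_sketch` is hypothetical, and decoding binary values as UTF-8 when casting them to strings is an assumption made for the illustration.

```python
def concat_sketch(*args):
    """Hypothetical model of the proposed concat typing rule (not Spark code)."""
    # Proposed change: when every argument is binary, keep the result binary.
    if args and all(isinstance(a, bytes) for a in args):
        return b"".join(args)
    # All other cases stay as before: cast each argument to a string.
    # Assumption: binary values are decoded as UTF-8 for that cast.
    return "".join(
        a.decode("utf-8") if isinstance(a, bytes) else str(a) for a in args
    )


all_binary = concat_sketch(b"ab", b"cd")   # binary in, binary out
mixed = concat_sketch(b"ab", "cd")         # mixed input falls back to a string
```

Only the all-binary call changes type under this rule; the mixed call still produces a string, which matches the "all other cases should remain unchanged" requirement.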