Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/06/24 17:17:00 UTC

[jira] [Comment Edited] (ARROW-13167) [C++] Type determination kernels ("type", "type_id")

    [ https://issues.apache.org/jira/browse/ARROW-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368978#comment-17368978 ] 

Ian Cook edited comment on ARROW-13167 at 6/24/21, 5:16 PM:
------------------------------------------------------------

From the perspective of a user of a compute API, it is quite inconvenient to switch to using a completely different API in order to determine the data type of a datum. This is a notorious gripe that SQL developers have with many database systems, and it's the reason why some SQL engines like [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/typeof.html] and [Impala|https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_conversion_functions.html#conversion_functions__typeof] have added a {{typeof()}} function to their SQL dialects. I agree that it feels like an "off-label" use of compute functions, but I think the utility it affords warrants this.


was (Author: icook):
From the perspective of a user of a compute API, it is quite inconvenient to switch to using a completely different API in order to determine the data type of a datum. This is a notorious gripe that SQL developers have with many database systems, and it's the reason why some SQL engines like [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/typeof.html] and [Impala title|https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_conversion_functions.html#conversion_functions__typeof] have added a {{typeof()}} function to their SQL dialects. I agree that it feels like an "off-label" use of compute functions, but I think the utility it affords warrants this.

> [C++] Type determination kernels ("type", "type_id")
> ----------------------------------------------------
>
>                 Key: ARROW-13167
>                 URL: https://issues.apache.org/jira/browse/ARROW-13167
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Ian Cook
>            Priority: Major
>
> The Arrow C++ library exposes an API for determining the data type of an expression, but it is exposed as a method of the expression class and it requires that the user pass a schema as an argument to the method. This is inconvenient; for example, we have had to write some inconsistent code in the R bindings to make expression objects carry schemas along with them and then pass the schemas to derivative expressions, unifying schemas as needed for derivative expressions that take 2+ expressions as arguments.
> This would be much cleaner if we could use the kernel function calling interface to call a unary {{type_id}} function that would simply determine the type of its input datum and return a scalar integer value from the data type enum indicating its data type. It would be convenient to also have a version of this that returned the string description of the data type; I think this could be named {{type}}.
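
To illustrate the shape of the proposal, here is a minimal, self-contained sketch of a unary type_id function and a string-returning type function. It is a toy model under stated assumptions, not the Arrow C++ API: the sketch namespace, the TypeId enum and its values, and the variant-based Datum are all hypothetical stand-ins invented for illustration. The point it tries to show is that the type is read from the datum itself, so no schema has to be passed in.

```cpp
// Toy model only: every name here is hypothetical and is NOT the Arrow C++ API.
#include <cstdint>
#include <string>
#include <variant>

namespace sketch {

// Stand-in for a data type enum; the members and values are illustrative.
enum class TypeId : int { BOOL = 0, INT32 = 1, DOUBLE = 2, STRING = 3 };

// Stand-in for a datum: a value that carries its own type.
using Datum = std::variant<bool, std::int32_t, double, std::string>;

// Unary "type_id": inspects only its input and returns the enum value
// of the input's type as a plain integer. No schema argument is needed.
inline int type_id(const Datum& d) {
  struct Visitor {
    TypeId operator()(bool) const { return TypeId::BOOL; }
    TypeId operator()(std::int32_t) const { return TypeId::INT32; }
    TypeId operator()(double) const { return TypeId::DOUBLE; }
    TypeId operator()(const std::string&) const { return TypeId::STRING; }
  };
  return static_cast<int>(std::visit(Visitor{}, d));
}

// Unary "type": same idea, but returns a string description of the type.
inline std::string type(const Datum& d) {
  switch (static_cast<TypeId>(type_id(d))) {
    case TypeId::BOOL:   return "bool";
    case TypeId::INT32:  return "int32";
    case TypeId::DOUBLE: return "double";
    case TypeId::STRING: return "string";
  }
  return "unknown";  // unreachable for the four alternatives above
}

}  // namespace sketch
```

In this sketch, a caller would write sketch::type_id(d) or sketch::type(d) on any datum d, the same way it would call any other unary function. A real kernel would presumably also need to handle arrays and return a scalar for them; that part is not modeled here.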


