Posted to dev@impala.apache.org by Vincent Tran <vt...@cloudera.com> on 2017/06/19 13:22:12 UTC

IMPALA-4326 - split() function

This request appears to be blocked by the current UDF framework's
limitation.
As far as I can tell, functions can still only return simple scalar types,
right?

Re: IMPALA-4326 - split() function

Posted by Edward Capriolo <ed...@gmail.com>.

The SQL:2016 standard did not predate Hive's implementation of lateral
view.

On Sunday, July 9, 2017, Greg Rahn <gr...@gmail.com> wrote:

> (also commented on IMPALA-4326)
>
> For this functionality, I'd prefer to follow what Postgres does and use its
> well-named functions like string_to_array().
> This becomes powerful when using the unnest() table function, which is
> defined and is part of the ANSI/ISO SQL:2016 spec (vs the non-standard
> lateral view explode Hive syntax).
>
> with t as (
>   select
>     42 as id,
>     '1,2,3,4,5,6'::text as string_array
> )
> select
>   t.id,
>   u.l
> from t, unnest(string_to_array(t.string_array,',')) as u(l);
>
> id | l
> ----+---
> 42 | 1
> 42 | 2
> 42 | 3
> 42 | 4
> 42 | 5
> 42 | 6
>
>
> On Mon, Jun 19, 2017 at 7:40 AM, Alexander Behm <alex.behm@cloudera.com>
> wrote:
>
> > Yes and no. Extending the UDF framework might be hard, but I think
> > implementing a built-in split() is feasible. We already have a built-in
> > Expr that returns an array type to implement unnest.
> >
> > On Mon, Jun 19, 2017 at 6:22 AM, Vincent Tran <vttran@cloudera.com> wrote:
> >
> > > This request appears to be blocked by the current UDF framework's
> > > limitation.
> > > As far as I can tell, functions can still only return simple scalar types,
> > > right?
> > >
> >
>


Re: IMPALA-4326 - split() function

Posted by Greg Rahn <gr...@gmail.com>.

(also commented on IMPALA-4326)

For this functionality, I'd prefer to follow what Postgres does and use its
well-named functions like string_to_array().
This becomes powerful when using the unnest() table function, which is
defined and is part of the ANSI/ISO SQL:2016 spec (vs the non-standard
lateral view explode Hive syntax).

with t as (
  select
    42 as id,
    '1,2,3,4,5,6'::text as string_array
)
select
  t.id,
  u.l
from t, unnest(string_to_array(t.string_array,',')) as u(l);

id | l
----+---
42 | 1
42 | 2
42 | 3
42 | 4
42 | 5
42 | 6
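
For readers without a Postgres instance at hand, the row expansion shown
above can be modeled in a few lines of Python. This is only a sketch of the
semantics, not Impala or Postgres code: the helper names mirror the SQL
functions for readability, unnest_rows() is a hypothetical name, and the
choice to treat an empty string as "no elements" is an assumption of this
sketch.

```python
from typing import Iterable, Iterator, List, Tuple


def string_to_array(text: str, delimiter: str) -> List[str]:
    """Split text on a non-empty delimiter, loosely modeling string_to_array().

    Assumption of this sketch: an empty input yields an empty list.
    """
    if text == "":
        return []
    return text.split(delimiter)


def unnest_rows(
    rows: Iterable[Tuple[int, str]], delimiter: str = ","
) -> Iterator[Tuple[int, str]]:
    """Emit one (id, element) pair per array element, like a join against unnest()."""
    for row_id, packed in rows:
        for element in string_to_array(packed, delimiter):
            yield row_id, element


# Same input as the SQL example: one row with id 42 and a packed string.
rows = [(42, "1,2,3,4,5,6")]
result = list(unnest_rows(rows))

for row_id, element in result:
    print(f"{row_id} | {element}")
```

The point of the model is that the split function returns an array value,
and it is the table function that turns each element into its own row.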

On Mon, Jun 19, 2017 at 7:40 AM, Alexander Behm <al...@cloudera.com>
wrote:

> Yes and no. Extending the UDF framework might be hard, but I think
> implementing a built-in split() is feasible. We already have a built-in
> Expr that returns an array type to implement unnest.
>
> On Mon, Jun 19, 2017 at 6:22 AM, Vincent Tran <vt...@cloudera.com> wrote:
>
> > This request appears to be blocked by the current UDF framework's
> > limitation.
> > As far as I can tell, functions can still only return simple scalar types,
> > right?
> >
>

Re: IMPALA-4326 - split() function

Posted by Alexander Behm <al...@cloudera.com>.

Yes and no. Extending the UDF framework might be hard, but I think
implementing a built-in split() is feasible. We already have a built-in
Expr that returns an array type to implement unnest.

On Mon, Jun 19, 2017 at 6:22 AM, Vincent Tran <vt...@cloudera.com> wrote:

> This request appears to be blocked by the current UDF framework's
> limitation.
> As far as I can tell, functions can still only return simple scalar types,
> right?
>