Posted to user@spark.apache.org by Koert Kuipers <ko...@tresata.com> on 2017/04/04 03:18:17 UTC

map transform on array in spark sql

I have a DataFrame where one column has the type:

ArrayType(StructType(Seq(
  StructField("a", typeA, nullableA),
  StructField("b", typeB, nullableB)
)))

I would like to map over this array to pick the first field of each
struct, so the result should be an ArrayType(typeA, nullableA). I realize I
can do this with a Scala UDF if I know typeA, but what if I don't know typeA?

Basically I would like to write an expression like:
map(col("x"), _(0))

Any suggestions?

Re: map transform on array in spark sql

Posted by Michael Armbrust <mi...@databricks.com>.
If you can find the name of the struct field from the schema, you can just
do:

df.select($"arrayField.a")

Selecting a field from an array returns an array with that field selected
from each element.
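
For completeness, a minimal runnable sketch of this behavior (the case
class Elem, the column name "x", and the local SparkSession are
illustrative assumptions, not from the thread):

import org.apache.spark.sql.SparkSession

case class Elem(a: String, b: Int)

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Column "x" has type array<struct<a:string,b:int>>.
val df = Seq(
  Seq(Elem("a1", 1), Elem("a2", 2)),
  Seq(Elem("a3", 3))
).toDF("x")

// Selecting the field through the array applies it to every element,
// producing array<string> without ever naming typeA in the code:
df.select($"x.a").printSchema()
// root
//  |-- a: array (nullable = true)
//  |    |-- element: string (containsNull = true)

Note that the field name still has to come from somewhere; if it is not
known ahead of time, it can be read off df.schema at runtime and passed to
select as a string.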
