Posted to dev@datafu.apache.org by "Efrat Rotenberg (Jira)" <ji...@apache.org> on 2021/10/03 09:42:00 UTC

[jira] [Updated] (DATAFU-158) Document Spark explodeArray function behavior

     [ https://issues.apache.org/jira/browse/DATAFU-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Efrat Rotenberg updated DATAFU-158:
-----------------------------------
    Attachment: DATAFU-158.patch

> Document Spark explodeArray function behavior
> ---------------------------------------------
>
>                 Key: DATAFU-158
>                 URL: https://issues.apache.org/jira/browse/DATAFU-158
>             Project: DataFu
>          Issue Type: Improvement
>            Reporter: Shay Elbaz
>            Priority: Trivial
>         Attachments: DATAFU-158.patch
>
>
> The `explodeArray` function determines the size of the output array by executing a Spark job internally on the input data. This should be documented so that users can decide whether to persist the input DataFrame before calling it.
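
To illustrate why persisting can matter here, the following is a minimal pure-Python sketch, not DataFu or Spark code. `LazyRows`, its `evaluations` counter, `explode_array`, and the `v0`, `v1`, ... column naming are all hypothetical stand-ins chosen for illustration. The sketch assumes the function needs one pass to find the longest array and a second pass to build the output, so an unpersisted input is computed twice. Both passes run eagerly here for simplicity; in Spark the second pass would normally run only when an action is triggered on the result.

```python
# Hypothetical stand-in for a lazily evaluated dataset; not Spark or DataFu code.
class LazyRows:
    def __init__(self, compute):
        self._compute = compute   # function that produces the rows
        self._cache = None        # filled in by persist()
        self.evaluations = 0      # how many times the source was recomputed

    def persist(self):
        # Materialize once and serve later passes from memory.
        self._cache = self.collect()
        return self

    def collect(self):
        if self._cache is not None:
            return self._cache
        self.evaluations += 1
        return self._compute()


def explode_array(rows, alias):
    # Pass 1: an eager scan to find the longest array (the "internal job").
    width = max((len(r) for r in rows.collect()), default=0)
    # Pass 2: build one column per index, padding short arrays with None.
    names = [f"{alias}{i}" for i in range(width)]
    return [
        dict(zip(names, list(r) + [None] * (width - len(r))))
        for r in rows.collect()
    ]


def source():
    # Assumed sample input: three rows, each holding one array.
    return [[1, 2], [3], [4, 5, 6]]


unpersisted = LazyRows(source)
result = explode_array(unpersisted, "v")

persisted = LazyRows(source).persist()
explode_array(persisted, "v")

print(unpersisted.evaluations, persisted.evaluations)
```

In this sketch the unpersisted input is evaluated once per pass, while the persisted input is evaluated only when it is first materialized. Whether persisting pays off in practice depends on how expensive the input is to recompute and how much memory caching it would take.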



--
This message was sent by Atlassian Jira
(v8.3.4#803005)