Posted to commits@datafu.apache.org by ey...@apache.org on 2021/10/03 19:09:50 UTC
[datafu] branch master updated: DATAFU-158 Document explodeArray function
This is an automated email from the ASF dual-hosted git repository.
eyal pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/datafu.git
The following commit(s) were added to refs/heads/master by this push:
new af79454 DATAFU-158 Document explodeArray function
af79454 is described below
commit af79454721bab3c0b45a163ccceadc5579161d2a
Author: efrotenberg <ef...@paypal.com>
AuthorDate: Sun Oct 3 12:07:02 2021 +0300
DATAFU-158 Document explodeArray function
Signed-off-by: Eyal Allweil <ea...@paypal.com>
---
datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala b/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala
index 7853e16..b459da4 100644
--- a/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala
+++ b/datafu-spark/src/main/scala/datafu/spark/SparkDFUtils.scala
@@ -508,7 +508,10 @@ object SparkDFUtils {
"range_size")
}
-/** given an array column that you need to explode into different columns, use this method.
+/**
+ * Given an array column that you need to explode into different columns, use this method.
+ * This function counts the number of output columns by executing the Spark job internally on the input array column.
+ * Consider caching the input dataframe if this is an expensive operation.
*
* @param df
* @param arrayCol
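The documented behavior above can be sketched with a short usage example. This is a hedged sketch, not part of the commit: the parameter order and the output column naming are assumptions inferred from this doc comment, not verified against the full SparkDFUtils source.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import datafu.spark.SparkDFUtils

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explodeArrayDemo")
  .getOrCreate()
import spark.implicits._

// Rows whose array lengths differ; explodeArray must scan the column
// (running a Spark job internally, as the doc comment notes) to find
// the maximum array length and hence the number of output columns.
val df = Seq(Seq("a", "b", "c"), Seq("d")).toDF("arr")

// Per the doc comment, caching avoids recomputing an expensive input
// DataFrame for both the internal length scan and the final result.
df.cache()

// Assumed call shape: explodeArray(df, arrayCol, alias), producing one
// column per possible array index (e.g. col0, col1, col2 for max length 3).
val exploded = SparkDFUtils.explodeArray(df, col("arr"), "col")
exploded.show()
```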