Posted to commits@spark.apache.org by li...@apache.org on 2017/08/01 06:15:55 UTC

spark git commit: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF

Repository: spark
Updated Branches:
  refs/heads/master 9570e81aa -> 110695db7


[SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF

## What changes were proposed in this pull request?
This PR adds documentation on the Hive UDF/UDTF/UDAF APIs that Spark SQL does not support.
It relates to #18768 and #18527.

## How was this patch tested?
N/A

Author: Takeshi Yamamuro <ya...@apache.org>

Closes #18792 from maropu/HOTFIX-20170731.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/110695db
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/110695db
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/110695db

Branch: refs/heads/master
Commit: 110695db701d0420ee53cfd5b7096428c489d456
Parents: 9570e81
Author: Takeshi Yamamuro <ya...@apache.org>
Authored: Mon Jul 31 23:15:52 2017 -0700
Committer: gatorsmile <ga...@gmail.com>
Committed: Mon Jul 31 23:15:52 2017 -0700

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/110695db/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index b5eca76..7f7cf59 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1903,6 +1903,23 @@ releases of Spark SQL.
   Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
   metadata. Spark SQL does not support that.
 
+**Hive UDF/UDTF/UDAF**
+
+Not all the APIs of the Hive UDF/UDTF/UDAF are supported by Spark SQL. Below are the unsupported APIs:
+
+* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to automatically
+  include additional resources required by this UDF.
+* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
+  only the deprecated interface `initialize(ObjectInspector[])`.
+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+  functions with `MapredContext`, which is inapplicable to Spark.
+* `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
+  Spark SQL does not call this function when tasks finish.
+* `reset` (`GenericUDAFEvaluator`) is a function to re-initialize an aggregation so that it can be reused.
+  Spark SQL currently does not support the reuse of aggregations.
+* `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating
+  an aggregate over a fixed window.
+
 # Reference
 
 ## Data Types
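
The `close` entry added by this diff states that Spark SQL does not call the function when tasks finish, so a UDF that releases resources only in `close` may never release them. The sketch below illustrates that consequence with stand-in types only: `UdfLike`, `Resource`, `RelyingOnClose`, `ScopedPerCall`, and `runWithoutClose` are hypothetical names invented for this illustration and are not part of the Hive or Spark APIs. Scoping the resource to each call is one possible workaround, not a recommendation taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CloseNotCalledSketch {
    // Hypothetical stand-in for a UDF lifecycle; NOT the real Hive GenericUDF API.
    interface UdfLike {
        void initialize();
        String evaluate(String input);
        void close();
    }

    // Records open/release events so the difference between the two styles is observable.
    static final List<String> events = new ArrayList<>();

    // Hypothetical resource that logs when it is opened and released.
    static final class Resource implements AutoCloseable {
        private final String name;
        Resource(String name) { this.name = name; events.add("open " + name); }
        String decorate(String s) { return name + ":" + s; }
        @Override public void close() { events.add("release " + name); }
    }

    // Releases its resource only in close(), so it leaks if close() is never invoked.
    static final class RelyingOnClose implements UdfLike {
        private Resource resource;
        @Override public void initialize() { resource = new Resource("held"); }
        @Override public String evaluate(String input) { return resource.decorate(input); }
        @Override public void close() { resource.close(); }
    }

    // Scopes the resource to each evaluate() call, so it does not depend on close().
    static final class ScopedPerCall implements UdfLike {
        @Override public void initialize() { }
        @Override public String evaluate(String input) {
            try (Resource resource = new Resource("scoped")) {
                return resource.decorate(input);
            }
        }
        @Override public void close() { }
    }

    // Assumed engine behavior for this sketch: initialize and evaluate, but never call close().
    static List<String> runWithoutClose(UdfLike udf, List<String> rows) {
        udf.initialize();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(udf.evaluate(row));
        }
        return out; // close() is deliberately not called
    }

    public static void main(String[] args) {
        runWithoutClose(new RelyingOnClose(), Arrays.asList("a", "b"));
        System.out.println("relying on close(): " + events);
        events.clear();
        runWithoutClose(new ScopedPerCall(), Arrays.asList("a", "b"));
        System.out.println("scoped per call:    " + events);
    }
}
```

In this sketch the first run logs an open event with no matching release, while the second logs a release after every open. Opening a resource per call has a cost of its own, so whether this trade-off is acceptable depends on the resource.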

