Posted to reviews@spark.apache.org by maropu <gi...@git.apache.org> on 2017/08/01 03:08:04 UTC

[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/18792

    [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF

    ## What changes were proposed in this pull request?
    This PR adds documentation about the unsupported functions of the Hive UDF/UDTF/UDAF APIs.
    This PR relates to #18768 and #18527.
    
    ## How was this patch tested?
    N/A


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark HOTFIX-20170731

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18792
    
----
commit 1434bde71df9da49fd4a3aa171c0915efda0c9b5
Author: Takeshi Yamamuro <ya...@apache.org>
Date:   2017-08-01T02:59:06Z

    Add documents about Hive UDFs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Thanks for working on it! Just left some minor comments.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80109/testReport)** for PR 18792 at commit [`29f1108`](https://github.com/apache/spark/commit/29f1108f424c0776005aaaf22dfef04ac7c07cc1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80109/testReport)** for PR 18792 at commit [`29f1108`](https://github.com/apache/spark/commit/29f1108f424c0776005aaaf22dfef04ac7c07cc1).



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18792#discussion_r130523169

--- Diff: docs/sql-programming-guide.md ---
@@ -1903,6 +1903,23 @@ releases of Spark SQL.
Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
metadata. Spark SQL does not support that.

+**Hive UDF/UDTF/UDAF**
+
+Not all the APIs of the Hive UDF/UDTF/UDAF are supported by Spark SQL. Below are the unsupported APIs:
+
+* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to automatically
+ include additional resources required by this UDF.
+* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
+ a deprecated interface `initialize(ObjectInspector[])` only.
+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+ functions with `MapredContext`, which is inapplicable to Spark. But, Spark SQL does not use `MapredContext` internally.
--- End diff --

nit: `But` looks redundant here because `inapplicable` already comes before it. It reads like a double negative.


[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80103/
    Test PASSed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80106/
    Test PASSed.



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18792#discussion_r130512423
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1903,6 +1903,25 @@ releases of Spark SQL.
       Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
       metadata. Spark SQL does not support that.
     
    +**Hive UDF/UDTF/UDAF**
    +
    +Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
    +Some of them are meaningless in Spark and the others are rarely used by users.
    +Below is a list of major APIs we don't support in Spark SQL:
    --- End diff --
    
    `we don't support in Spark SQL:` -> `that are not supported by Spark SQL:`



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80103/testReport)** for PR 18792 at commit [`1434bde`](https://github.com/apache/spark/commit/1434bde71df9da49fd4a3aa171c0915efda0c9b5).



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80103 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80103/testReport)** for PR 18792 at commit [`1434bde`](https://github.com/apache/spark/commit/1434bde71df9da49fd4a3aa171c0915efda0c9b5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/18792#discussion_r130523581

removed. Thanks!


[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80107/testReport)** for PR 18792 at commit [`c703d57`](https://github.com/apache/spark/commit/c703d575c85b576ecb2331a32913fa8362c3d140).



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18792#discussion_r130512311
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1903,6 +1903,25 @@ releases of Spark SQL.
       Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
       metadata. Spark SQL does not support that.
     
    +**Hive UDF/UDTF/UDAF**
    +
    +Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
    +Some of them are meaningless in Spark and the others are rarely used by users.
    +Below is a list of major APIs we don't support in Spark SQL:
    +
    +* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically
    +  include additional resources required by this UDF.
    +* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
    +  a deprecated interface `initialize(ObjectInspector[])` only.
    +* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
    +  functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
    +* `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
    +  Spark SQL does not call this function when tasks finished.
    +* `reset` (`GenericUDAFEvaluator`) is a function to re-initialize aggregation for reusing the same aggregation.
    +  Spark SQL currently does not support the reuse of aggregation.
    +* `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating
    +  an aggregate over a fixed window. Spark SQL does not support this optimization yet.
    --- End diff --
    
    Please remove ` Spark SQL does not support this optimization yet` 



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80109/
    Test PASSed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80107/
    Test PASSed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    LGTM pending Jenkins



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18792#discussion_r130511955
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1903,6 +1903,25 @@ releases of Spark SQL.
       Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
       metadata. Spark SQL does not support that.
     
    +**Hive UDF/UDTF/UDAF**
    +
    +Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
    +Some of them are meaningless in Spark and the others are rarely used by users.
    +Below is a list of major APIs we don't support in Spark SQL:
    +
    +* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically
    --- End diff --
    
    `to to` -> `to`



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80107/testReport)** for PR 18792 at commit [`c703d57`](https://github.com/apache/spark/commit/c703d575c85b576ecb2331a32913fa8362c3d140).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80106/testReport)** for PR 18792 at commit [`7d07e6b`](https://github.com/apache/spark/commit/7d07e6bc98d3ccfbc0857dd65c4e6eb2baea5922).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18792#discussion_r130512174
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1903,6 +1903,25 @@ releases of Spark SQL.
       Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
       metadata. Spark SQL does not support that.
     
    +**Hive UDF/UDTF/UDAF**
    +
    +Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
    +Some of them are meaningless in Spark and the others are rarely used by users.
    +Below is a list of major APIs we don't support in Spark SQL:
    +
    +* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically
    +  include additional resources required by this UDF.
    +* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
    +  a deprecated interface `initialize(ObjectInspector[])` only.
    +* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
    +  functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
    +* `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
    +  Spark SQL does not call this function when tasks finished.
    --- End diff --
    
    `finished` -> `finish`



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    @gatorsmile If you get time, could you check this? Thanks!



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18792#discussion_r130512125

--- Diff: docs/sql-programming-guide.md ---
@@ -1903,6 +1903,25 @@ releases of Spark SQL.
Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
metadata. Spark SQL does not support that.

+**Hive UDF/UDTF/UDAF**
+
+Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
+Some of them are meaningless in Spark and the others are rarely used by users.
+Below is a list of major APIs we don't support in Spark SQL:
+
+* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically
+ include additional resources required by this UDF.
+* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
+ a deprecated interface `initialize(ObjectInspector[])` only.
+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+ functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
--- End diff --

> functions with `MapredContext`, which is inapplicable to Spark.


[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    **[Test build #80106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80106/testReport)** for PR 18792 at commit [`7d07e6b`](https://github.com/apache/spark/commit/7d07e6bc98d3ccfbc0857dd65c4e6eb2baea5922).



[GitHub] spark pull request #18792: [SPARK-21589][SQL][DOC] Add documents about Hive ...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18792



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    @gatorsmile ok, fixed.



[GitHub] spark issue #18792: [SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDT...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18792
  
    Merged build finished. Test PASSed.

