Posted to dev@spark.apache.org by Anton Okolnychyi <an...@gmail.com> on 2016/12/15 08:28:58 UTC

Expand the Spark SQL programming guide?

Hi,

I am wondering whether it makes sense to expand the Spark SQL programming
guide with examples of aggregations (including user-defined ones via the
Aggregator API) and window functions. For instance, there could be a
separate subsection under "Getting Started" for each feature.

SPARK-16046 seems to be related, but there has been no activity on it for
more than 4 months.

Best regards,
Anton

Re: Expand the Spark SQL programming guide?

Posted by Ricardo Almeida <ri...@actnowib.com>.
The examples look great indeed. They seem like a good addition to the
existing documentation.
I understand that the UDAF examples don't apply to Python, but is there any
particular reason to leave the Python API out of the window functions
documentation altogether?

On 20 December 2016 at 16:56, Jim Hughes <jn...@ccri.com> wrote:

> Hi Anton,
>
> Your example and documentation look great!  I left some comments
> suggesting a few additions, but the PR in its current state is a great
> improvement!
>
> Thanks,
>
> Jim

Re: Expand the Spark SQL programming guide?

Posted by Jim Hughes <jn...@ccri.com>.
Hi Anton,

Your example and documentation look great!  I left some comments
suggesting a few additions, but the PR in its current state is a great
improvement!

Thanks,

Jim

On 12/18/2016 09:09 AM, Anton Okolnychyi wrote:
> Any comments/suggestions are more than welcome.
>
> Thanks,
> Anton


Re: Expand the Spark SQL programming guide?

Posted by Anton Okolnychyi <an...@gmail.com>.
Any comments/suggestions are more than welcome.

Thanks,
Anton

2016-12-18 15:08 GMT+01:00 Anton Okolnychyi <an...@gmail.com>:

> Here is the pull request: https://github.com/apache/spark/pull/16329

Re: Expand the Spark SQL programming guide?

Posted by Anton Okolnychyi <an...@gmail.com>.
Here is the pull request: https://github.com/apache/spark/pull/16329



2016-12-16 20:54 GMT+01:00 Jim Hughes <jn...@ccri.com>:

> I'd be happy to review a PR.  At the minute, I'm still learning Spark SQL,
> so writing documentation might be a bit of a stretch, but reviewing would
> be fine.
>
> Thanks!

Re: Expand the Spark SQL programming guide?

Posted by Jim Hughes <jn...@ccri.com>.
I'd be happy to review a PR.  At the minute, I'm still learning Spark 
SQL, so writing documentation might be a bit of a stretch, but reviewing 
would be fine.

Thanks!

On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:
>
> Yes - that sounds good Anton, I can work on documenting the window 
> functions.


Re: Expand the Spark SQL programming guide?

Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
Yes - that sounds good Anton, I can work on documenting the window functions.

From: Anton Okolnychyi <an...@gmail.com>
Date: Thursday, December 15, 2016 at 4:34 PM
To: Conversant <jt...@conversantmedia.com>
Cc: Michael Armbrust <mi...@databricks.com>, Jim Hughes <jn...@ccri.com>, "dev@spark.apache.org" <de...@spark.apache.org>
Subject: Re: Expand the Spark SQL programming guide?

I think it will make sense to show a sample implementation of UserDefinedAggregateFunction for DataFrames, and an example of the Aggregator API for typed Datasets.

Jim, what if I submit a PR and you join the review process? I also do not mind splitting this if you want, but that seems like overkill for this part.

Jayesh, shall I skip the window functions part since you are going to work on that?


Re: Expand the Spark SQL programming guide?

Posted by Anton Okolnychyi <an...@gmail.com>.
I think it will make sense to show a sample implementation of
UserDefinedAggregateFunction for DataFrames, and an example of the
Aggregator API for typed Datasets.

Jim, what if I submit a PR and you join the review process? I also do not
mind splitting this if you want, but that seems like overkill for this
part.

Jayesh, shall I skip the window functions part since you are going to work
on that?
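
For readers unfamiliar with the two APIs named above: both follow the same
contract of an empty buffer, a per-row update, a merge of partial buffers,
and a final projection. Spark's Scala Aggregator calls these steps zero,
reduce, merge and finish. Below is a minimal sketch of that contract in
plain Python. It is not Spark code; the class AverageAggregator and the
helper run_partitioned are hypothetical names used only for illustration.

```python
from functools import reduce as fold


class AverageAggregator:
    """Hypothetical stand-in mirroring the zero/reduce/merge/finish steps."""

    def zero(self):
        # Empty buffer: (running sum, row count).
        return (0.0, 0)

    def reduce(self, buf, value):
        # Fold one input value into a partition-local buffer.
        return (buf[0] + value, buf[1] + 1)

    def merge(self, b1, b2):
        # Combine buffers that were computed independently on two partitions.
        return (b1[0] + b2[0], b1[1] + b2[1])

    def finish(self, buf):
        # Turn the final buffer into the result; None for empty input.
        return buf[0] / buf[1] if buf[1] else None


def run_partitioned(agg, partitions):
    # Aggregate each partition on its own, then merge the partial buffers.
    partials = [fold(agg.reduce, part, agg.zero()) for part in partitions]
    return agg.finish(fold(agg.merge, partials, agg.zero()))


# Made-up sample data, split into three "partitions".
salaries = [[3000.0, 4500.0], [3500.0], [4000.0]]
average = run_partitioned(AverageAggregator(), salaries)
```

Keeping the buffer as a (sum, count) pair rather than a running average is
what makes merge well defined: two partial averages cannot be combined
without knowing how many rows each one covers.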

2016-12-15 22:48 GMT+01:00 Thakrar, Jayesh <jt...@conversantmedia.com>:

> I too am interested in expanding the documentation for Spark SQL.
>
> For my work I needed to get some info/examples/guidance on window
> functions and have been using
> https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html .
>
> How about divide and conquer?

Re: Expand the Spark SQL programming guide?

Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
I too am interested in expanding the documentation for Spark SQL.
For my work I needed to get some info/examples/guidance on window functions and have been using https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html .
How about divide and conquer?
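
As a rough illustration of what a window function does: it computes a value
for each row from a group of related rows, without collapsing the group the
way an aggregation does. The sketch below emulates, in plain Python rather
than Spark, a row number and a "gap to the top salary" over a window
partitioned by department and ordered by salary descending. The sample rows
and the function name with_rank_and_gap are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample data.
rows = [
    {"dept": "sales", "name": "ann", "salary": 4200},
    {"dept": "sales", "name": "bob", "salary": 3900},
    {"dept": "eng", "name": "cid", "salary": 5100},
    {"dept": "eng", "name": "dee", "salary": 5600},
]


def with_rank_and_gap(rows):
    """Emulate a row number and a max over (partition by dept, order by salary desc)."""
    out = []
    # groupby needs its input sorted by the partition key.
    by_dept = sorted(rows, key=itemgetter("dept"))
    for _, group in groupby(by_dept, key=itemgetter("dept")):
        ordered = sorted(group, key=itemgetter("salary"), reverse=True)
        top = ordered[0]["salary"]
        for position, row in enumerate(ordered, start=1):
            # Every input row is kept; the window only adds columns.
            out.append({**row, "rank": position, "gap_to_top": top - row["salary"]})
    return out


result = with_rank_and_gap(rows)
```

The output has as many rows as the input, which is the main difference from
a grouped aggregation that would return one row per department.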


From: Michael Armbrust <mi...@databricks.com>
Date: Thursday, December 15, 2016 at 3:21 PM
To: Jim Hughes <jn...@ccri.com>
Cc: "dev@spark.apache.org" <de...@spark.apache.org>
Subject: Re: Expand the Spark SQL programming guide?

Pull requests would be welcome for any major missing features in the guide: https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md


Re: Expand the Spark SQL programming guide?

Posted by Michael Armbrust <mi...@databricks.com>.
Pull requests would be welcome for any major missing features in the guide:
https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md

On Thu, Dec 15, 2016 at 11:48 AM, Jim Hughes <jn...@ccri.com> wrote:

> Hi Anton,
>
> I'd like to see this as well.  I've been working on implementing
> geospatial user-defined types and functions.  Having examples of
> aggregations and window functions would be awesome!
>
> I did test out implementing a distributed convex hull as a
> UserDefinedAggregateFunction, and that seemed to work sensibly.
>
> Cheers,
>
> Jim

Re: Expand the Spark SQL programming guide?

Posted by Jim Hughes <jn...@ccri.com>.
Hi Anton,

I'd like to see this as well.  I've been working on implementing 
geospatial user-defined types and functions.  Having examples of 
aggregations and window functions would be awesome!

I did test out implementing a distributed convex hull as a 
UserDefinedAggregateFunction, and that seemed to work sensibly.
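
A convex hull fits the aggregate pattern because the hull of a union equals
the hull of the partial hulls' points, so each partition can reduce its rows
to a small hull and the partial results can be merged afterwards. The
implementation mentioned above is not shown in this thread; the following is
only a minimal plain-Python sketch of the general idea, using Andrew's
monotone chain algorithm as an assumed choice. The names convex_hull and
merge_hulls are hypothetical.

```python
def cross(o, a, b):
    # Z component of (a - o) x (b - o); positive means a left turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            # Drop points that would make a right turn or lie on the edge.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half(pts)
    upper = half(reversed(pts))
    # The last point of each half is the first point of the other.
    return lower[:-1] + upper[:-1]


def merge_hulls(h1, h2):
    # Merge step of the aggregate: hull of the two partial hulls' vertices.
    return convex_hull(h1 + h2)


# Made-up points split across two "partitions".
part_a = [(0, 0), (2, 0), (1, 1)]
part_b = [(2, 2), (0, 2), (1, 0.5)]
merged = merge_hulls(convex_hull(part_a), convex_hull(part_b))
```

Here merged is the same hull that convex_hull(part_a + part_b) produces,
which is the property a merge step relies on.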

Cheers,

Jim

On 12/15/2016 03:28 AM, Anton Okolnychyi wrote:
> Hi,
>
> I am wondering whether it makes sense to expand the Spark SQL
> programming guide with examples of aggregations (including
> user-defined ones via the Aggregator API) and window functions. For
> instance, there could be a separate subsection under "Getting Started"
> for each feature.
>
> SPARK-16046 seems to be related, but there has been no activity on it
> for more than 4 months.
>
> Best regards,
> Anton