Posted to dev@spark.apache.org by Anton Okolnychyi <an...@gmail.com> on 2016/12/15 08:28:58 UTC
Expand the Spark SQL programming guide?
Hi,
I am wondering whether it makes sense to expand the Spark SQL programming
guide with examples of aggregations (including user-defined ones via the
Aggregator API) and window functions. For instance, there might be a
separate subsection under "Getting Started" for each feature.
SPARK-16046 seems to be related, but there has been no activity on it for
more than 4 months.
Best regards,
Anton
Re: Expand the Spark SQL programming guide?
Posted by Ricardo Almeida <ri...@actnowib.com>.
The examples look great indeed. Seems a good addition to the existing
documentation.
I understand the UDAF examples don't apply to Python, but is there any
particular reason to leave the Python API out of this window functions
documentation altogether?
On 20 December 2016 at 16:56, Jim Hughes <jn...@ccri.com> wrote:
> Hi Anton,
>
> Your example and documentation look great! I left some comments
> suggesting a few additions, but the PR in its current state is a great
> improvement!
>
> Thanks,
>
> Jim
Re: Expand the Spark SQL programming guide?
Posted by Jim Hughes <jn...@ccri.com>.
Hi Anton,
Your example and documentation look great! I left some comments
suggesting a few additions, but the PR in its current state is a great
improvement!
Thanks,
Jim
On 12/18/2016 09:09 AM, Anton Okolnychyi wrote:
> Any comments/suggestions are more than welcome.
>
> Thanks,
> Anton
Re: Expand the Spark SQL programming guide?
Posted by Anton Okolnychyi <an...@gmail.com>.
Any comments/suggestions are more than welcome.
Thanks,
Anton
2016-12-18 15:08 GMT+01:00 Anton Okolnychyi <an...@gmail.com>:
> Here is the pull request: https://github.com/apache/spark/pull/16329
Re: Expand the Spark SQL programming guide?
Posted by Anton Okolnychyi <an...@gmail.com>.
Here is the pull request: https://github.com/apache/spark/pull/16329
2016-12-16 20:54 GMT+01:00 Jim Hughes <jn...@ccri.com>:
> I'd be happy to review a PR. At the minute, I'm still learning Spark SQL,
> so writing documentation might be a bit of a stretch, but reviewing would
> be fine.
>
> Thanks!
Re: Expand the Spark SQL programming guide?
Posted by Jim Hughes <jn...@ccri.com>.
I'd be happy to review a PR. At the minute, I'm still learning Spark
SQL, so writing documentation might be a bit of a stretch, but reviewing
would be fine.
Thanks!
On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:
>
> Yes - that sounds good Anton, I can work on documenting the window
> functions.
Re: Expand the Spark SQL programming guide?
Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
Yes - that sounds good Anton, I can work on documenting the window functions.
From: Anton Okolnychyi <an...@gmail.com>
Date: Thursday, December 15, 2016 at 4:34 PM
To: Conversant <jt...@conversantmedia.com>
Cc: Michael Armbrust <mi...@databricks.com>, Jim Hughes <jn...@ccri.com>, "dev@spark.apache.org" <de...@spark.apache.org>
Subject: Re: Expand the Spark SQL programming guide?
I think it will make sense to show a sample implementation of UserDefinedAggregateFunction for DataFrames, and an example of the Aggregator API for typed Datasets.
Jim, what if I submit a PR and you join the review process? I also do not mind splitting this if you want, but that seems like overkill for this part.
Jayesh, shall I skip the window functions part since you are going to work on that?
Re: Expand the Spark SQL programming guide?
Posted by Anton Okolnychyi <an...@gmail.com>.
I think it will make sense to show a sample implementation of
UserDefinedAggregateFunction for DataFrames, and an example of the
Aggregator API for typed Datasets.
Jim, what if I submit a PR and you join the review process? I also do not
mind splitting this if you want, but that seems like overkill for this
part.
Jayesh, shall I skip the window functions part since you are going to work
on that?
2016-12-15 22:48 GMT+01:00 Thakrar, Jayesh <jt...@conversantmedia.com>:
> I too am interested in expanding the documentation for Spark SQL.
>
> For my work I needed to get some info/examples/guidance on window
> functions and have been using
> https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html .
>
> How about divide and conquer?
Re: Expand the Spark SQL programming guide?
Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
I too am interested in expanding the documentation for Spark SQL.
For my work I needed to get some info/examples/guidance on window functions and have been using https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html .
How about divide and conquer?
From: Michael Armbrust <mi...@databricks.com>
Date: Thursday, December 15, 2016 at 3:21 PM
To: Jim Hughes <jn...@ccri.com>
Cc: "dev@spark.apache.org" <de...@spark.apache.org>
Subject: Re: Expand the Spark SQL programming guide?
Pull requests would be welcome for any major missing features in the guide: https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
Re: Expand the Spark SQL programming guide?
Posted by Michael Armbrust <mi...@databricks.com>.
Pull requests would be welcome for any major missing features in the guide:
https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
On Thu, Dec 15, 2016 at 11:48 AM, Jim Hughes <jn...@ccri.com> wrote:
> Hi Anton,
>
> I'd like to see this as well. I've been working on implementing
> geospatial user-defined types and functions. Having examples of
> aggregations and window functions would be awesome!
>
> I did test out implementing a distributed convex hull as a
> UserDefinedAggregateFunction, and that seemed to work sensibly.
>
> Cheers,
>
> Jim
Re: Expand the Spark SQL programming guide?
Posted by Jim Hughes <jn...@ccri.com>.
Hi Anton,
I'd like to see this as well. I've been working on implementing
geospatial user-defined types and functions. Having examples of
aggregations and window functions would be awesome!
I did test out implementing a distributed convex hull as a
UserDefinedAggregateFunction, and that seemed to work sensibly.
Cheers,
Jim
On 12/15/2016 03:28 AM, Anton Okolnychyi wrote:
> Hi,
>
> I am wondering whether it makes sense to expand the Spark SQL
> programming guide with examples of aggregations (including
> user-defined ones via the Aggregator API) and window functions. For
> instance, there might be a separate subsection under "Getting Started"
> for each feature.
>
> SPARK-16046 seems to be related, but there has been no activity on it
> for more than 4 months.
>
> Best regards,
> Anton