Posted to issues@spark.apache.org by "Ximo Guanter (JIRA)" <ji...@apache.org> on 2019/03/13 08:59:00 UTC

[jira] [Commented] (SPARK-18127) Add hooks and extension points to Spark

    [ https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791464#comment-16791464 ] 

Ximo Guanter commented on SPARK-18127:
--------------------------------------

This article explains some of it: https://developer.ibm.com/code/2017/11/30/learn-extension-points-apache-spark-extend-spark-catalyst-optimizer/
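
Concretely, it walks through the {{SparkSessionExtensions}} API that came out of this issue. As a minimal sketch, registering a custom optimizer rule looks like this (MyRule is a placeholder name that just logs the plan and returns it unchanged, not a real rule):

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Placeholder rule: logs the plan it is given and returns it unchanged.
// A real rule would pattern-match on plan nodes and rewrite them.
case class MyRule(session: SparkSession) extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = {
    logInfo("MyRule saw plan:\n" + plan.treeString)
    plan
  }
}

val spark = SparkSession.builder()
  .master("local[*]")
  .withExtensions { extensions =>
    // Register the rule builder; Spark calls it once per session.
    extensions.injectOptimizerRule { session => MyRule(session) }
  }
  .getOrCreate()
{code}

The same {{withExtensions}} block also accepts {{injectResolutionRule}}, {{injectCheckRule}}, {{injectPlannerStrategy}} and {{injectParser}} builders for the other extension points listed below.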

> Add hooks and extension points to Spark
> ---------------------------------------
>
>                 Key: SPARK-18127
>                 URL: https://issues.apache.org/jira/browse/SPARK-18127
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Srinath
>            Assignee: Sameer Agarwal
>            Priority: Major
>             Fix For: 2.2.0
>
>
> As a Spark user I want to be able to customize my Spark session. I currently want to be able to do the following things:
> # I want to be able to add custom analyzer rules. This allows me to implement my own logical constructs; an example of this could be a recursive operator.
> # I want to be able to add my own analysis checks. This allows me to catch problems with Spark plans early on. An example of this would be some data source-specific checks (wired up in the sketch after this description).
> # I want to be able to add my own optimizations. This allows me to optimize plans in different ways, for instance when I use a very different cluster (for example, a single-node X1 instance). This supersedes the current {{spark.experimental}} methods.
> # I want to be able to add my own planning strategies. This supersedes the current {{spark.experimental}} methods. This allows me to produce my own physical plans; an example of this would be planning for my own heavily integrated data source (CarbonData, for example). This is also covered in the sketch after this description.
> # I want to be able to use my own customized SQL constructs. An example of this would be supporting my own dialect, or being able to add constructs to the current SQL language. I should not have to implement a complete parser, and should be able to delegate to an underlying parser.
> # I want to be able to track modifications and calls to the external catalog. I want this API to be stable. This allows me to synchronize with other systems.
> This API should modify the SparkSession when the session gets started, and it should NOT change the session in flight.
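>
> A rough sketch of how points 2 and 4 above could be wired up at session-build time ({{SparkSessionExtensions}} method names as implemented for this issue; MyStrategy and the check body are placeholders, not a final design):
>
> {code}
> import org.apache.spark.sql.{SparkSession, Strategy}
> import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, LogicalPlan}
> import org.apache.spark.sql.execution.SparkPlan
>
> // Placeholder strategy: matches nothing, so planning always falls
> // through to the built-in strategies. A real strategy would match its
> // own logical nodes and return physical operators for them.
> case class MyStrategy(session: SparkSession) extends Strategy {
>   override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
> }
>
> val spark = SparkSession.builder()
>   .withExtensions { extensions =>
>     // Point 2: an analysis check that fails fast on suspicious plans.
>     extensions.injectCheckRule { session => plan =>
>       val leaves = plan.collect { case l: LeafNode => l }
>       if (leaves.size > 100) {
>         throw new UnsupportedOperationException(
>           s"Plan scans ${leaves.size} relations; refusing to run it")
>       }
>     }
>     // Point 4: a custom planning strategy.
>     extensions.injectPlannerStrategy { session => MyStrategy(session) }
>   }
>   .getOrCreate()
> {code}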


