Posted to issues@spark.apache.org by "Lantao Jin (Jira)" <ji...@apache.org> on 2020/07/07 13:15:00 UTC

[jira] [Comment Edited] (SPARK-29038) SPIP: Support Spark Materialized View

    [ https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152738#comment-17152738 ] 

Lantao Jin edited comment on SPARK-29038 at 7/7/20, 1:14 PM:
-------------------------------------------------------------

Hi [~AidenZhang], our MV work in recent months has focused on two areas. The first is optimizing the rewrite algorithm, for example forbidding post-aggregation over count distinct and avoiding unnecessary rewrites during relation replacement. The second is bug fixes in MV refresh, where we use a Spark listener to deliver metastore events that trigger the refresh. Some parts depend on a third-party system, so perhaps only the interfaces will be available in community Spark. I have not implemented partial/incremental refresh since it is not a blocker for us. I am not sure whether the community is still interested in this feature, but we are now moving the existing implementation to Spark 3.0.
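
To illustrate why post-aggregation over count distinct has to be forbidden in a rewrite, here is a minimal sketch in plain Python. It is not Spark code and not taken from the implementation discussed here; the rows, the grouping keys, and the helper `count_distinct_users` are all hypothetical. It shows that summing the per-group distinct counts stored in a finer-grained view overcounts when the same value appears in more than one group.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, day, user_id).
rows = [
    ("east", "mon", "u1"),
    ("east", "tue", "u1"),  # the same user shows up on a second day
    ("east", "tue", "u2"),
    ("west", "mon", "u3"),
]


def count_distinct_users(rows, key):
    """Emulate SELECT key, COUNT(DISTINCT user_id) ... GROUP BY key."""
    groups = defaultdict(set)
    for row in rows:
        groups[key(row)].add(row[2])
    return {k: len(users) for k, users in groups.items()}


# What a (hypothetical) materialized view grouped by (region, day) would store.
mv = count_distinct_users(rows, key=lambda r: (r[0], r[1]))

# Naive rewrite: answer the per-region query by summing the view's counts.
rolled_up = defaultdict(int)
for (region, _day), n in mv.items():
    rolled_up[region] += n
rolled_up = dict(rolled_up)

# Ground truth computed directly from the detail rows.
exact = count_distinct_users(rows, key=lambda r: r[0])

print(rolled_up)  # east is counted as 3
print(exact)      # east really has 2 distinct users
```

User `u1` is counted once for Monday and once for Tuesday in the view, so the rolled-up total for `east` is too high. A rewrite is therefore only safe when the query groups at the same granularity as the view, or when the view stores something mergeable instead of a plain count.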


was (Author: cltlfcjin):
Hi [~AidenZhang], my MV work in recent months has focused on two areas. The first is optimizing the rewrite algorithm, for example forbidding post-aggregation over count distinct and avoiding unnecessary rewrites during relation replacement. The second is bug fixes in MV refresh, where we use a Spark listener to deliver metastore events that trigger the refresh. Some parts depend on a third-party system, so perhaps only the interfaces will be available in community Spark. I have not implemented partial/incremental refresh since it is not a blocker for us. I am not sure whether the community is still interested in this feature, but we are now moving the existing implementation to Spark 3.0.

> SPIP: Support Spark Materialized View
> -------------------------------------
>
>                 Key: SPARK-29038
>                 URL: https://issues.apache.org/jira/browse/SPARK-29038
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> Materialized views are an important DBMS technique for caching data to accelerate queries. Because a materialized view is created through SQL, the data to be cached is very flexible and can be configured according to specific usage scenarios. The Materialization Manager automatically updates the cached data when the underlying detail (source) tables change, simplifying the user's work. When a user submits a query, the Spark optimizer rewrites the execution plan based on the available materialized views to determine the optimal execution plan.
> Details in [design doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]
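
The lifecycle described above (cache a query result, mark it stale when the source changes, refresh it, and answer queries from it when possible) can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions: `ToyMaterializedView`, `on_source_changed`, and `total_for_region` are hypothetical names, not Spark APIs, and the real proposal rewrites optimizer plans instead of calling a helper function.

```python
class ToyMaterializedView:
    """Caches SUM(amount) per region over a source table (a list of dicts)."""

    def __init__(self, source):
        self.source = source
        self.stale = True   # nothing cached yet
        self.data = {}

    def on_source_changed(self):
        # Stand-in for a change notification about the source table.
        self.stale = True

    def refresh(self):
        # Full refresh: recompute the whole aggregate from the source.
        totals = {}
        for row in self.source:
            totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
        self.data = totals
        self.stale = False


def total_for_region(source, view, region):
    """Answer SUM(amount) WHERE region = ?, using the view only when it is fresh."""
    if not view.stale:
        return view.data.get(region, 0), "view"
    return sum(r["amount"] for r in source if r["region"] == region), "source"


sales = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": 5},
    {"region": "west", "amount": 7},
]
view = ToyMaterializedView(sales)
view.refresh()
fresh_answer = total_for_region(sales, view, "east")      # served from the view

sales.append({"region": "east", "amount": 1})
view.on_source_changed()
stale_answer = total_for_region(sales, view, "east")      # falls back to the source

view.refresh()
refreshed_answer = total_for_region(sales, view, "east")  # served from the view again
```

The fallback to the source table while the view is stale is a design choice of this sketch; a real system could instead block on a refresh or tolerate bounded staleness.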



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
