Posted to issues@spark.apache.org by "Jackey Lee (JIRA)" <ji...@apache.org> on 2018/10/08 00:37:00 UTC

[jira] [Commented] (SPARK-24630) SPIP: Support SQLStreaming in Spark

    [ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641258#comment-16641258 ] 

Jackey Lee commented on SPARK-24630:
------------------------------------

SQLStreaming is another interface to Structured Streaming. Users who are not familiar with streaming, or with the Python and Scala APIs, can use Spark SQL to run Structured Streaming computations.
   *What's new in SQLStreaming:*
   *Structured Streaming is combined with the data warehouse (Hive).*
   SQLStreaming stores the metadata of each Source or Sink in the data warehouse and uses the data warehouse to manage that metadata. An upstream user writes to a Stream Table, and downstream users subscribe to it.
{code:sql}
-- Read from a Kafka stream table and insert into another Kafka table
insert into kafka_sql_test1
select stream
        cast(key as string),
        combine(get_json_object(cast(value as string), '$.name'),
                get_json_object(cast(value as string), '$.phone')) as value,
        topic
from kafka_sql_test;

-- Read from a Kafka stream table and insert into a CSV table
insert into csv_sql_test
select stream get_json_object(cast(value as string), '$.key') as userKey,
              count(*),
              window(timestamp, '10 seconds', '5 seconds') as timestamp
from kafka_sql_test1
group by window(timestamp, '10 seconds', '5 seconds'),
         get_json_object(cast(value as string), '$.key');
{code}
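
For readers unfamiliar with the window(timestamp, '10 seconds', '5 seconds') grouping in the second query, here is a minimal plain-Python sketch of how a sliding window assigns each event to overlapping windows. It is not Spark code and not part of the proposal. The function names sliding_windows and windowed_counts are hypothetical, timestamps are assumed to be integer seconds, and window starts are assumed to be aligned to multiples of the slide interval (my understanding of Spark's default behaviour when no start offset is given).

```python
from collections import Counter


def sliding_windows(ts, window=10, slide=5):
    """Return the [start, end) windows, in seconds, that contain ts.

    Assumption: window starts are aligned to multiples of `slide`.
    """
    # Latest aligned window start that is at or before ts.
    start = ts - (ts % slide)
    windows = []
    # Walk backwards while the window beginning at `start` still covers ts.
    while start > ts - window:
        windows.append((start, start + window))
        start -= slide
    return sorted(windows)


def windowed_counts(events, window=10, slide=5):
    """Count events per (key, window), mimicking the GROUP BY in the query.

    `events` is an iterable of (key, timestamp_in_seconds) pairs.
    """
    counts = Counter()
    for key, ts in events:
        for w in sliding_windows(ts, window, slide):
            counts[(key, w)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample events: (userKey, event time in seconds).
    sample = [("a", 1), ("a", 7), ("b", 12)]
    for (key, (start, end)), n in sorted(windowed_counts(sample).items()):
        print(key, start, end, n)
```

With a 10-second window and a 5-second slide, every event falls into two overlapping windows, so each event contributes to two output rows of the aggregation.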

> SPIP: Support SQLStreaming in Spark
> -----------------------------------
>
>                 Key: SPARK-24630
>                 URL: https://issues.apache.org/jira/browse/SPARK-24630
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.2.0, 2.2.1
>            Reporter: Jackey Lee
>            Priority: Minor
>              Labels: SQLStreaming
>         Attachments: SQLStreaming SPIP.pdf
>
>
> At present, KafkaSQL, Flink SQL (which is actually based on Calcite), SQLStream, and StormSQL all provide a streaming SQL interface, with which users who know little about streaming can easily develop a stream processing model. Spark can likewise support a SQL API based on Structured Streaming.
> To support SQL Streaming, there are two key points:
> 1. Analysis should be able to parse streaming-type SQL.
> 2. The Analyzer should be able to map metadata information to the corresponding Relation.



