Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2020/12/04 07:57:00 UTC
[jira] [Comment Edited] (SPARK-33638) Full support of V2 table creation in Structured Streaming writer path

    [ https://issues.apache.org/jira/browse/SPARK-33638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243803#comment-17243803 ] 

Jungtaek Lim edited comment on SPARK-33638 at 12/4/20, 7:56 AM:
----------------------------------------------------------------

I don't agree with handling this in DataStreamWriter, hence I changed the title. What I am proposing is to design DataStreamWriterV2, nothing else.

I also don't agree that we need to deal with partition column verification in that way. DataFrameWriterV2 handles this nicely by branching the path between appending to/overwriting/truncating a table and creating/replacing a table, and by enforcing the latter whenever configuration for creating a table is provided. I think this is much clearer for end users than leaving them to worry about the impact.
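
To illustrate the branching described above, here is a minimal sketch in plain Python. It is not Spark code: the class and method names (WriterV2, CreateTableWriter, partitioned_by, using and so on) are hypothetical stand-ins that only loosely mirror the shape of DataFrameWriterV2, and the operations return simple tuples instead of touching a catalog. The point is only that once table-creation configuration is supplied, the object handed back no longer offers append or overwrite.

```python
class CreateTableWriter:
    """Hypothetical model: offers only create/replace style operations."""

    def __init__(self, table, config):
        self.table = table
        self.config = dict(config)

    def partitioned_by(self, *cols):
        # Supplying creation config always yields a plain CreateTableWriter,
        # so the append/overwrite path is no longer reachable afterwards.
        return CreateTableWriter(self.table, {**self.config, "partitioning": list(cols)})

    def using(self, provider):
        return CreateTableWriter(self.table, {**self.config, "provider": provider})

    def create(self):
        return ("create", self.table, self.config)

    def create_or_replace(self):
        return ("createOrReplace", self.table, self.config)


class WriterV2(CreateTableWriter):
    """Hypothetical entry point: adds the operations for existing tables."""

    def __init__(self, table):
        super().__init__(table, {})

    def append(self):
        return ("append", self.table, {})

    def overwrite_partitions(self):
        return ("overwritePartitions", self.table, {})


writer = WriterV2("cat.db.events")
print(writer.append())                       # existing-table path
creator = writer.partitioned_by("day")       # switches to the creation path
print(hasattr(creator, "append"))            # append is not offered here
print(creator.using("parquet").create())
```

With this shape, a caller cannot combine partitioning with append by accident, so there is nothing to verify against an existing table's partitioning on that path.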

Of course, even if we address it with DataStreamWriterV2, we still need to deal with the consistency in DataStreamWriter.toTable(). But once DataStreamWriterV2 is in place and recommended for table writes, that would be less important.


> Full support of V2 table creation in Structured Streaming writer path
> ---------------------------------------------------------------------
>
>                 Key: SPARK-33638
>                 URL: https://issues.apache.org/jira/browse/SPARK-33638
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Yuanjian Li
>            Priority: Blocker
>
> Currently, we want to add create-if-not-exists support to the DataStreamWriter.toTable API. Since the file format in streaming doesn't support DSv2 for now, the current implementation mainly focuses on V1 support. More work is needed for full support of V2 table creation.


