You are viewing a plain text version of this content. The canonical link for it is here.

Posted to jira@kafka.apache.org by "Jan Filipiak (JIRA)" <ji...@apache.org> on 2018/11/20 07:44:00 UTC

[jira] [Commented] (KAFKA-7658) Add KStream#toTable to the Streams DSL

    [ https://issues.apache.org/jira/browse/KAFKA-7658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692791#comment-16692791 ] 

Jan Filipiak commented on KAFKA-7658:
-------------------------------------

[~guozhang] I remember our conversations when DSL 1.0 was designed and you were heavily against this. Why the change now?

> Add KStream#toTable to the Streams DSL
> --------------------------------------
>
>                 Key: KAFKA-7658
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7658
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Priority: Major
>              Labels: needs-kip, newbie
>
> We'd like to add a new API to the KStream object of the Streams DSL:
> {code}
> KTable KStream.toTable()
> KTable KStream.toTable(Materialized)
> {code}
> The function re-interpret the event stream {{KStream}} as a changelog stream {{KTable}}. Note that this should NOT be treated as a syntax-sugar as a dummy {{KStream.reduce()}} function which always take the new value, as it has the following difference: 
> 1) an aggregation operator of {{KStream}} is for aggregating a event stream into an evolving table, which will drop null-values from the input event stream; whereas a {{toTable}} function will completely change the semantics of the input stream from event stream to changelog stream, and null-values will still be serialized, and if the resulted bytes are also null they will be interpreted as "deletes" to the materialized KTable (i.e. tombstones in the changelog stream).
> 2) the aggregation result {{KTable}} will always be materialized, whereas {{toTable}} resulted KTable may only be materialized if the overloaded function with Materialized is used (and if optimization is turned on it may still be only logically materialized if the queryable name is not set).
> Therefore, for users who want to take a event stream into a changelog stream (no matter why they cannot read from the source topic as a changelog stream {{KTable}} at the beginning), they should be using this new API instead of the dummy reduction function.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)