Posted to dev@kafka.apache.org by "Matthias J. Sax (JIRA)" <ji...@apache.org> on 2017/01/13 18:22:26 UTC

[jira] [Comment Edited] (KAFKA-4125) Provide low-level Processor API meta data in DSL layer

    [ https://issues.apache.org/jira/browse/KAFKA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822095#comment-15822095 ] 

Matthias J. Sax edited comment on KAFKA-4125 at 1/13/17 6:21 PM:
-----------------------------------------------------------------

Hey [~bbejeck]. I just thought about this, and we should have some API discussion before you get started -- I don't want you to waste your time.

The point is that we already have {{transform}} and {{transformValues}} in the API, and we are considering adding {{flatTransform}} and {{flatTransformValues}}, analogous to {{map}}/{{flatMap}} and {{mapValues}}/{{flatMapValues}}. Those functions give access to all record meta data -- thus adding {{RichFunctions}} might be redundant. Even though "transform" was added to support stateful operators, you can use them in a stateless fashion, too.
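
A minimal sketch of the stateless use described above. The two interfaces are trimmed, hypothetical stand-ins that only mirror the shape of the Processor API types so the example compiles on its own; {{MetadataTagger}} is an illustrative name, and real code would implement the Kafka Streams interfaces instead.

```java
// Trimmed, hypothetical stand-ins (assumptions, not the real Kafka Streams types).
interface ProcessorContext {
    String topic();
    int partition();
    long offset();
    long timestamp();
}

interface Transformer<K, V, R> {
    void init(ProcessorContext context);
    R transform(K key, V value);
    void close();
}

/** Stateless use of a transformer: no state store, only a metadata lookup per record. */
class MetadataTagger implements Transformer<String, String, String> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        // Kept only so that transform() can read per-record metadata.
        this.context = context;
    }

    @Override
    public String transform(String key, String value) {
        // Appends topic, partition and offset of the current record to the value.
        return value + " [" + context.topic() + "-" + context.partition()
                + "@" + context.offset() + "]";
    }

    @Override
    public void close() {
        // Nothing to release: the operator holds no state.
    }
}
```

The operator only keeps a reference to the context, which is the same access a Rich function would provide.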

This also relates to the idea of adding a {{key}} parameter to {{mapValues}} -- not sure if we need/want to add this.
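
To make that idea concrete, here is a rough sketch of what a key-aware value mapper could look like. {{KeyAwareValueMapper}} and {{KeyAwareDemo.ignoringKey}} are hypothetical names chosen for illustration, and {{ValueMapper}} is redeclared locally as a stand-in so the example is self-contained.

```java
/** Local stand-in for the existing shape: the mapper sees only the value. */
interface ValueMapper<V1, V2> {
    V2 apply(V1 value);
}

/** Hypothetical variant that also receives the key; it can still only return a new value. */
interface KeyAwareValueMapper<K, V1, V2> {
    V2 apply(K key, V1 value);
}

class KeyAwareDemo {
    /** Adapts an existing value-only mapper to the key-aware shape by ignoring the key. */
    static <K, V1, V2> KeyAwareValueMapper<K, V1, V2> ignoringKey(ValueMapper<V1, V2> mapper) {
        return (key, value) -> mapper.apply(value);
    }
}
```

Because every value-only mapper can be adapted this way, such an addition would not have to break existing user code; whether it is worth the extra API surface is the open question.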

I think it would be helpful to start a general API design discussion to see what we actually want to add and what not. WDYT? cc [~guozhang] Right now, all those ideas are spread out over multiple JIRAs, and I think we should consolidate them to get a sound API change instead of "fixing" random stuff here and there.



> Provide low-level Processor API meta data in DSL layer
> ------------------------------------------------------
>
>                 Key: KAFKA-4125
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4125
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Guozhang Wang
>            Priority: Minor
>
> For the Processor API, users can get meta data like record offset, timestamp, etc. via the provided {{Context}} object. It might be useful to allow users to access this information in the DSL layer, too.
> The idea would be to do it "the Flink way", i.e., by providing
> RichFunctions; take {{mapValues()}} for example.
> It takes a {{ValueMapper<V1, V2>}} that only has the method
> {noformat}
> V2 apply(V1 value);
> {noformat}
> Thus, you cannot get any meta data within apply (it's completely "blind").
> We would add two more interfaces: {{RichFunction}} with a method
> {{open(Context context)}} and
> {noformat}
> interface RichValueMapper<V1, V2> extends ValueMapper<V1, V2>, RichFunction {}
> {noformat}
> This way, the user can choose to implement a Rich or a Standard function and
> we do not need to change existing APIs. Both can be handed into
> {{KStream.mapValues()}}, for example. Internally, we check if a Rich
> function is provided, and if yes, hand in the {{Context}} object once, to
> make it available to the user, who can now access it within {{apply()}} -- of
> course, the user must set a member variable in {{open()}} to hold the
> reference to the Context object.
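
The proposal in the description can be sketched roughly as follows. All types are declared locally as hypothetical stand-ins, not the real Kafka Streams classes: the methods on {{Context}}, the {{MapValuesOperator}} class standing in for the DSL internals, and the user-side {{OffsetTaggingMapper}} are assumptions made for illustration.

```java
/** Stand-in for the metadata object; the two methods are assumed for this sketch. */
interface Context {
    long offset();
    long timestamp();
}

/** Existing shape: apply() is "blind" to any metadata. */
interface ValueMapper<V1, V2> {
    V2 apply(V1 value);
}

/** Proposed interface: receives the context once, before any record is processed. */
interface RichFunction {
    void open(Context context);
}

/** Proposed combination of both interfaces. */
interface RichValueMapper<V1, V2> extends ValueMapper<V1, V2>, RichFunction {}

/** User code: stores the context in a member variable and reads it in apply(). */
class OffsetTaggingMapper implements RichValueMapper<String, String> {
    private Context context;

    @Override
    public void open(Context context) {
        this.context = context;
    }

    @Override
    public String apply(String value) {
        return value + "@" + context.offset();
    }
}

/** Hypothetical stand-in for the DSL internals behind mapValues(). */
class MapValuesOperator<V1, V2> {
    private final ValueMapper<V1, V2> mapper;

    MapValuesOperator(ValueMapper<V1, V2> mapper, Context context) {
        this.mapper = mapper;
        // Hand in the context once, and only if a Rich function was provided.
        if (mapper instanceof RichFunction) {
            ((RichFunction) mapper).open(context);
        }
    }

    V2 process(V1 value) {
        return mapper.apply(value);
    }
}
```

A plain {{ValueMapper}} passes through the same operator unchanged, which is why the existing API would not need to change under this design.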


