Posted to issues@flink.apache.org by InfinitiesLoop <gi...@git.apache.org> on 2018/03/26 16:59:27 UTC

[GitHub] flink issue #3314: [FLINK-3679] DeserializationSchema should handle zero or ...

Github user InfinitiesLoop commented on the issue:

    https://github.com/apache/flink/pull/3314
  
    Sorry for necro'ing this thread, but where did the community land on the multiple-records-per-Kafka-payload idea this PR originally intended to solve?
    
    I have a scenario where a single Kafka payload can represent hundreds of logical records. It's fine to just flatMap() them out after the deserialization schema, but that doesn't let me handle timestamps and watermarks correctly. The source may be reading from two partitions that are out of sync with each other, yet I can't assign a single timestamp and watermark to a message that contains many records spanning multiple timestamps. So I'm using a timestamp and watermark extractor on the stream, separate from the source, and just hoping the partitions never get out of sync. If a solution is still desired I'd love to contribute; otherwise it looks like I'll end up writing my own custom Kafka source.
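    To make the scenario above concrete, here is a minimal Java sketch of the core problem, independent of Flink's APIs: one payload decodes into many logical records, each with its own event timestamp, and any per-message timestamp the source assigns must summarize all of them. The payload format, the `Record` type, and the method names are all hypothetical, invented for illustration; the minimum-timestamp choice is just one conservative option, not what Flink prescribes.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    public class PayloadSplitter {
        // Hypothetical logical record: each carries its own event timestamp.
        static class Record {
            final String value;
            final long timestampMillis;
            Record(String value, long timestampMillis) {
                this.value = value;
                this.timestampMillis = timestampMillis;
            }
        }

        // Splits one Kafka payload (hypothetical format "ts1:v1;ts2:v2;...")
        // into many logical records. In a deserialization schema this would
        // run once per consumed Kafka message.
        static List<Record> split(String payload) {
            List<Record> out = new ArrayList<>();
            for (String part : payload.split(";")) {
                String[] kv = part.split(":", 2);
                out.add(new Record(kv[1], Long.parseLong(kv[0])));
            }
            return out;
        }

        // The source-level problem: one message spans multiple timestamps, so a
        // single per-message timestamp must be chosen. Taking the minimum is the
        // conservative choice: no record in the payload is ahead of it, so none
        // would be marked late by a watermark derived from it.
        static long messageTimestamp(List<Record> records) {
            long min = Long.MAX_VALUE;
            for (Record r : records) {
                min = Math.min(min, r.timestampMillis);
            }
            return min;
        }

        public static void main(String[] args) {
            List<Record> records = split("100:a;250:b;90:c");
            System.out.println(records.size());            // 3 logical records
            System.out.println(messageTimestamp(records)); // 90
        }
    }
    ```

    With per-partition extraction inside the source (rather than on the merged stream), a summary timestamp like this could keep watermarks correct even when partitions drift apart, which is exactly what an extractor applied after the source cannot guarantee.
    
    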


---