Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2016/04/09 02:16:25 UTC

[jira] [Updated] (KAFKA-3534) Deserialize on demand when default time extractor used

     [ https://issues.apache.org/jira/browse/KAFKA-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-3534:
---------------------------------
    Fix Version/s: 0.10.1.0

> Deserialize on demand when default time extractor used
> ------------------------------------------------------
>
>                 Key: KAFKA-3534
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3534
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.9.0.1
>            Reporter: Michael Coon
>            Assignee: Guozhang Wang
>            Priority: Minor
>             Fix For: 0.10.1.0
>
>
> When records are added to the RecordQueue, they are deserialized immediately so that the timestamp can be extracted. For data flows that consume large messages (particularly compressed ones), this can cause large memory spikes, and potentially out-of-memory errors, because every message must be deserialized before it is processed. A possible optimization is to require deserialization at this stage only if a non-default timestamp extractor is in use.
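
A minimal sketch of the deserialize-on-demand idea described above, not the actual Kafka Streams code. It assumes the timestamp is available from record metadata without reading the payload, which is the case the issue associates with the default extractor. The class LazyRecord and its methods are hypothetical names introduced only for illustration.

```java
import java.util.function.Function;

/**
 * Hypothetical record wrapper (not a Kafka class): holds the raw bytes and
 * defers deserialization until the value is first requested.
 */
final class LazyRecord<V> {
    private final byte[] rawValue;                  // serialized payload as received
    private final long timestamp;                   // assumed to come from record metadata
    private final Function<byte[], V> deserializer; // stand-in for a real deserializer
    private V value;                                // cached result of deserialization
    private boolean deserialized;                   // true once value() has run

    LazyRecord(byte[] rawValue, long timestamp, Function<byte[], V> deserializer) {
        this.rawValue = rawValue;
        this.timestamp = timestamp;
        this.deserializer = deserializer;
    }

    /** Ordering the queue by timestamp does not touch the payload. */
    long timestamp() {
        return timestamp;
    }

    /** Reports whether the payload has been deserialized yet. */
    boolean isDeserialized() {
        return deserialized;
    }

    /** Deserializes on first access and caches the result. */
    V value() {
        if (!deserialized) {
            value = deserializer.apply(rawValue);
            deserialized = true;
        }
        return value;
    }
}
```

With a wrapper along these lines, a queue could buffer many records while holding only their raw bytes, and pay the deserialization cost one record at a time as each is processed. A custom timestamp extractor that inspects the payload would still have to call value() up front.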


