Posted to commits@samza.apache.org by "Shubhanshu Nagar (JIRA)" <ji...@apache.org> on 2015/11/10 16:45:11 UTC

[jira] [Commented] (SAMZA-677) Support changelog for stores with TTL

    [ https://issues.apache.org/jira/browse/SAMZA-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14998774#comment-14998774 ] 

Shubhanshu Nagar commented on SAMZA-677:
----------------------------------------

Approach 1 is good; however, in practice it has two pitfalls:
1. Bootup time after a host-affinity miss can be quite high for update/delete-heavy workloads, since in Kafka TTL (time-based retention) and log compaction are mutually exclusive policies. Without compaction, a restore has to replay every update and delete still retained in the topic.
2. It requires keeping two configs in sync (the RocksDB TTL and the topic retention). I am not sure we can achieve that automatically. Perhaps the RocksDB TTL could act as the master config and set up the topic retention on bootup.
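
To illustrate pitfall 2, here is a rough sketch of how the store TTL could act as the master config from which the changelog topic settings are derived at bootup. The helper name and the store config key are assumptions made for illustration, not a confirmed Samza API; `cleanup.policy` and `retention.ms` are the standard Kafka topic-level settings.

```python
def changelog_topic_settings(config, store_name):
    """Derive changelog topic settings from the store TTL (hypothetical helper)."""
    # Assumed key name for the store's TTL; the real Samza key may differ.
    ttl_key = "stores.%s.rocksdb.ttl.ms" % store_name
    ttl_ms = config.get(ttl_key)
    if ttl_ms is None:
        # No TTL configured: keep a compacted changelog.
        return {"cleanup.policy": "compact"}
    ttl_ms = int(ttl_ms)
    if ttl_ms <= 0:
        raise ValueError("%s must be positive, got %d" % (ttl_key, ttl_ms))
    # TTL configured: switch to time-based retention matching the store TTL.
    return {"cleanup.policy": "delete", "retention.ms": str(ttl_ms)}
```

With this shape there is only one value to configure, but the topic is no longer compacted, so pitfall 1 still applies.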

Some alternative approaches I can think of:
1. Supporting TTL natively in Samza, with Samza doing the heavy lifting of implementing it. In this case, Samza would be responsible for scanning the store and evicting stale entries. With RocksDB's support for pluggable comparators and lazy deserialization, something fairly performant could be achieved. One big advantage of this approach is that we can apply it to the InMemory store as well.
2. Embracing RocksDB's BackupEngine and using that to persist the database. For example, instead of updating the changelog on every write/delete, we could siphon off the backup to a Kafka topic on every checkpoint. Since backups are incremental, we don't have to worry about data amplification. We might also get a performance boost since we would not be contacting the Kafka broker on every update.
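
A minimal in-memory sketch of alternative 1, assuming the store keeps a write timestamp next to each value and a periodic scan evicts stale entries. The class and method names are hypothetical and this is not Samza's store interface; a RocksDB-backed version would need the comparator and lazy-deserialization work mentioned above.

```python
import time


class TtlStore:
    """Toy key-value store where the store itself tracks expiry (illustrative only)."""

    def __init__(self, ttl_ms, clock=None):
        self._ttl_ms = ttl_ms
        # Injectable clock (milliseconds) so the behaviour can be tested.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._data = {}  # key -> (write_time_ms, value)

    def put(self, key, value):
        self._data[key] = (self._clock(), value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        written_ms, value = entry
        if self._clock() - written_ms >= self._ttl_ms:
            return None  # expired but not yet evicted
        return value

    def evict_expired(self):
        """Scan the store, drop stale entries, and return their keys."""
        now = self._clock()
        stale = [k for k, (w, _) in self._data.items() if now - w >= self._ttl_ms]
        for key in stale:
            del self._data[key]
        return stale
```

Because the store knows which keys expired, the keys returned by `evict_expired()` could be written to the changelog as deletes, which would let the topic stay compacted.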

> Support changelog for stores with TTL
> -------------------------------------
>
>                 Key: SAMZA-677
>                 URL: https://issues.apache.org/jira/browse/SAMZA-677
>             Project: Samza
>          Issue Type: Improvement
>    Affects Versions: 0.10.0
>            Reporter: Naveen Somasundaram
>
> We support TTL for RocksDB in Samza, but what we don't support is TTL in the changelog itself. It becomes a little tricky because we don't have a way to know from the underlying store that the record has expired. This will enable durable storage for TTL based stores. 


