Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/02/25 09:34:57 UTC

[GitHub] Myasuka opened a new pull request #7819: [FLINK-11313][checkpoint] Introduce LZ4 compression for keyed state in full checkpoints and savepoints

Myasuka opened a new pull request #7819: [FLINK-11313][checkpoint] Introduce LZ4 compression for keyed state in full checkpoints and savepoints
URL: https://github.com/apache/flink/pull/7819
 
 
   ## What is the purpose of the change
   
   This PR was created based on the discussion in [PR-7515](https://github.com/apache/flink/pull/7515).
   
   LZ4 is a popular lightweight compression algorithm that performs better than Snappy in many cases and is also [recommended by RocksDB](https://github.com/facebook/rocksdb/wiki/Compression#configuration).
   
   Based on this, I introduce LZ4 in addition to the existing Snappy compression for keyed state in full checkpoints and savepoints.
   
   ## Brief change log
   
     - Introduce a new `StreamCompressionDecoratorSnapshot` interface. Its relationship to `StreamCompressionDecorator` is just like that of `TypeSerializerSnapshot` to `TypeSerializer`. We serialize the `StreamCompressionDecoratorSnapshot` within `KeyedBackendSerializationProxy`, so that even user-defined `StreamCompressionDecorator` implementations are supported.
     - Add a new abstract method `setCompressionDecorator` in `AbstractStateBackend`.
     - Bump `KeyedBackendSerializationProxy` to a newer version to support customized compression decorator.
     - Migrate existing tests to use LZ4 compression.
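
   To illustrate the snapshot idea from the first item, here is a minimal sketch, not the code in this PR. All names (`CompressionSnapshotSketch`, `Decorator`, `snapshotId`, `restore`) are hypothetical, and GZIP from the JDK stands in for LZ4 so that the example runs without the lz4-java dependency. It assumes the snapshot is a small identifier written in front of the compressed state, so that the reader can pick the matching decorator on restore.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressionSnapshotSketch {

    /** Hypothetical stand-in for StreamCompressionDecorator. */
    interface Decorator {
        OutputStream wrap(OutputStream out) throws IOException;
        InputStream wrap(InputStream in) throws IOException;
        /** Hypothetical stand-in for the snapshot: an id stored next to the state. */
        String snapshotId();
    }

    /** Assumption: GZIP replaces LZ4 here only to avoid an external dependency. */
    static final Decorator GZIP = new Decorator() {
        @Override public OutputStream wrap(OutputStream out) throws IOException { return new GZIPOutputStream(out); }
        @Override public InputStream wrap(InputStream in) throws IOException { return new GZIPInputStream(in); }
        @Override public String snapshotId() { return "gzip"; }
    };

    /** Pass-through decorator for uncompressed state. */
    static final Decorator NONE = new Decorator() {
        @Override public OutputStream wrap(OutputStream out) { return out; }
        @Override public InputStream wrap(InputStream in) { return in; }
        @Override public String snapshotId() { return "none"; }
    };

    /** Rebuilds the decorator from the id that was written with the state. */
    static Decorator restore(String snapshotId) {
        switch (snapshotId) {
            case "gzip": return GZIP;
            case "none": return NONE;
            default: throw new IllegalArgumentException("Unknown decorator: " + snapshotId);
        }
    }

    /** Writes the snapshot id uncompressed, then the state through the decorator. */
    static byte[] write(Decorator decorator, byte[] state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream header = new DataOutputStream(bytes);
            header.writeUTF(decorator.snapshotId());
            header.flush();
            try (OutputStream body = decorator.wrap(bytes)) {
                body.write(state);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the snapshot id first, restores the decorator, then decodes the state. */
    static byte[] read(byte[] checkpoint) {
        try {
            ByteArrayInputStream bytes = new ByteArrayInputStream(checkpoint);
            Decorator decorator = restore(new DataInputStream(bytes).readUTF());
            ByteArrayOutputStream state = new ByteArrayOutputStream();
            try (InputStream body = decorator.wrap(bytes)) {
                byte[] buffer = new byte[4096];
                for (int n = body.read(buffer); n != -1; n = body.read(buffer)) {
                    state.write(buffer, 0, n);
                }
            }
            return state.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] state = "keyed state of key group 0".getBytes(StandardCharsets.UTF_8);
        byte[] checkpoint = write(GZIP, state);
        // The reader never needs to be told which decorator the writer used.
        System.out.println(Arrays.equals(state, read(checkpoint)));
    }
}
```

   The point of the sketch is that the reader does not need to know in advance which compression the writer used, which is why the proxy version has to be bumped when such a field is added to the serialized format.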
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
     - Extend unit tests `SerializationProxiesTest` and `StateSnapshotCompressionTest` for the newly added compression type.
     - Add unit test `testSetCompressionDecorator` within `StateBackendTestBase` to verify that the different state backends can set a `StreamCompressionDecorator` correctly.
     - Migrate `EventTimeWindowCheckpointingITCase` IT cases to use LZ4 compression.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **yes**, adds the lz4-java dependency.
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **yes**
     - The serializers: **no**, but changed the `KeyedBackendSerializationProxy`
     - The runtime per-record code paths (performance sensitive): **no**, should not affect topology task performance.
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
     - The S3 file system connector: **no**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **yes**
     - If yes, how is the feature documented? **docs**
   
