Posted to dev@atlas.apache.org by Ashutosh Mestry <am...@hortonworks.com> on 2017/08/18 23:03:59 UTC
Review Request 61761: ATLAS-2064: Compressed Messages Posted by Hooks
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61761/
-----------------------------------------------------------
Review request for atlas, Madhan Neethiraj and Nixon Rodrigues.
Bugs: ATLAS-2064
https://issues.apache.org/jira/browse/ATLAS-2064
Repository: atlas
Description
-------
**Analysis**
Kafka is not able to handle messages generated by Atlas' hooks, as they are over 1MB in size. Since the messages produced are JSON strings, compression should yield large size savings.
Compared various compression options. Also referred to this [presentation](https://www.slideshare.net/oom65/file-format-benchmarks-avro-json-orc-parquet).
**Implementation**
- Extended _AtlasType_ to include methods to compress/decompress JSON.
- Messages now produced are GZIP compressed.
- _VersionedMessageDeserializer_ now supports very old (no version), old (uncompressed) and new (compressed) deserialization.
- Added new _CompressedVersionedMessage_ message type.
- Modified message structure to add envelope message that holds messages of different types.
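As an illustration of the compress/decompress step listed above, here is a minimal sketch of GZIP-compressing a JSON string and restoring it. The class name `GzipJsonSketch`, its method names, and the Base64 text encoding of the compressed bytes are assumptions made for this example; they are not taken from the patch to _AtlasType_.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not the AtlasType code from this review.
final class GzipJsonSketch {
    private GzipJsonSketch() {
    }

    // GZIP-compress the UTF-8 bytes of a JSON string, then Base64-encode
    // the result so it can still be carried as text (an assumption here).
    static String compress(String json) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Reverse of compress(): Base64-decode, then inflate back to the JSON string.
    static String decompress(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Repetitive JSON, similar in spirit to a large hook notification.
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < 20000; i++) {
            json.append("{\"typeName\":\"hive_table\",\"guid\":\"0\"},");
        }
        json.append("{}]");

        String packed = compress(json.toString());
        System.out.println("original chars:   " + json.length());
        System.out.println("compressed chars: " + packed.length());
        System.out.println("round trip ok:    " + decompress(packed).equals(json.toString()));
    }
}
```

Encoding and decoding with an explicit UTF-8 charset keeps the round trip independent of the JVM's default locale and charset.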
**Additional Information**
It is also possible to compress Kafka messages at the producer level by configuring the Kafka property: compression.codec=1
See [here](https://www.cloudera.com/documentation/kafka/latest/topics/kafka_performance.html#concept_gqw_rcz_yq) for details.
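For reference, a minimal sketch of how that property could be placed into a producer configuration object. It only builds a `java.util.Properties` instance and does not create a Kafka producer; the class name `ProducerCompressionSketch` is hypothetical. Note that newer Kafka producer clients are configured through `compression.type` rather than `compression.codec`, so the key should be checked against the documentation for the client version in use.

```java
import java.util.Properties;

// Hypothetical illustration; only shows where the property from this review would go.
final class ProducerCompressionSketch {
    private ProducerCompressionSketch() {
    }

    static Properties producerProperties() {
        Properties props = new Properties();
        // Property name and value exactly as quoted in the description above.
        props.setProperty("compression.codec", "1");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProperties());
    }
}
```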
Diffs
-----
intg/src/main/java/org/apache/atlas/type/AtlasType.java c99eb7f
intg/src/test/java/org/apache/atlas/TestAtlasTypeJSONSerialize.java PRE-CREATION
notification/src/main/java/org/apache/atlas/notification/AbstractNotification.java cb44fc6
notification/src/main/java/org/apache/atlas/notification/CompressedVersionedMessage.java PRE-CREATION
notification/src/main/java/org/apache/atlas/notification/MessageVersion.java 6ef407a
notification/src/main/java/org/apache/atlas/notification/VersionedMessage.java 1929eb4
notification/src/main/java/org/apache/atlas/notification/VersionedMessageDeserializer.java cc2099e
notification/src/test/java/org/apache/atlas/notification/hook/HookNotificationTest.java dd3257e
Diff: https://reviews.apache.org/r/61761/diff/1/
Testing
-------
**Unit tests**
- New unit tests added to verify compression and decompression of strings across locales.
- Message de-serialization with backward compatibility.
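The backward-compatibility cases could be pictured as a three-way dispatch like the sketch below. The envelope fields (`version`, `compressed`, `payload`), the enum, and the class name are all hypothetical and are not taken from _VersionedMessageDeserializer_. The decompression step is passed in as a function so the dispatch logic can be checked on its own.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch of version-aware decoding; field and type names are assumptions.
final class BackwardCompatibleReaderSketch {
    enum Format { NO_VERSION, UNCOMPRESSED, COMPRESSED }

    private BackwardCompatibleReaderSketch() {
    }

    // Very old messages carry no version; versioned messages are either
    // plain JSON (old) or flagged as compressed (new).
    static Format classify(String version, boolean compressed) {
        if (version == null) {
            return Format.NO_VERSION;
        }
        return compressed ? Format.COMPRESSED : Format.UNCOMPRESSED;
    }

    // Returns the JSON payload, applying the decompressor only to new-style messages.
    static String payloadJson(String version, boolean compressed, String payload,
                              UnaryOperator<String> decompressor) {
        switch (classify(version, compressed)) {
            case COMPRESSED:
                return decompressor.apply(payload);
            case NO_VERSION:
            case UNCOMPRESSED:
            default:
                return payload;
        }
    }

    public static void main(String[] args) {
        // Stand-in "decompressor" that just reverses the text, to keep the sketch small.
        UnaryOperator<String> reverse = s -> new StringBuilder(s).reverse().toString();
        System.out.println(payloadJson(null, false, "{}", reverse));
        System.out.println(payloadJson("1.0.0", false, "{\"a\":1}", reverse));
        System.out.println(payloadJson("1.0.0", true, "}1:\"a\"{", reverse));
    }
}
```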
**Functional tests**
- Simulated a scenario where Atlas is on the newer version and hooks are on the older (uncompressed) version.
- Updated the Hive hook and Atlas to the new version and verified Hive hook functionality for Atlas.
Thanks,
Ashutosh Mestry