Posted to dev@atlas.apache.org by Ashutosh Mestry <am...@hortonworks.com> on 2017/08/18 23:03:59 UTC

Review Request 61761: ATLAS-2064: Compressed Messages Posted by Hooks

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61761/
-----------------------------------------------------------

Review request for atlas, Madhan Neethiraj and Nixon Rodrigues.


Bugs: ATLAS-2064
    https://issues.apache.org/jira/browse/ATLAS-2064


Repository: atlas


Description
-------

**Analysis**
Kafka is not able to handle messages generated by Atlas hooks when they are over 1MB in size. Since the messages produced are JSON strings, compression should yield large savings.
Compared various compression options. Also referred to this [presentation](https://www.slideshare.net/oom65/file-format-benchmarks-avro-json-orc-parquet).
 
**Implementation**
- Extended _AtlasType_ to include methods to compress/decompress JSON.
- Messages produced are now GZIP compressed.
- _VersionedMessageDeserializer_ now supports deserialization of very old (no version), old (uncompressed) and new (compressed) messages.
- Added new _CompressedVersionedMessage_ message type.
- Modified the message structure to add an envelope message that holds messages of different types.
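
For reviewers who want to see the compress/decompress step in isolation, here is a minimal sketch using only the JDK's `java.util.zip` classes. It is not the code in this patch: the class name `JsonGzipSketch` and both method names are hypothetical, and the actual methods added to _AtlasType_ may differ in signature and encoding. The sketch fixes the charset to UTF-8 so the result does not depend on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, for illustration only; not the AtlasType implementation.
class JsonGzipSketch {

    // GZIP-compress a JSON string, encoding it as UTF-8 first.
    static byte[] compress(String json) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close before reading the bytes.
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Reverse of compress(): gunzip the bytes and decode them as UTF-8.
    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a repetitive JSON array, similar in spirit to a large hook message.
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < 1000; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append("{\"typeName\":\"hive_column\",\"name\":\"col").append(i).append("\"}");
        }
        json.append(']');

        byte[] packed = compress(json.toString());
        System.out.println("original bytes:   " + json.toString().getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + decompress(packed).equals(json.toString()));
    }
}
```

Running `main` prints the sizes before and after compression for a synthetic payload; the ratio for real hook messages will depend on their content.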
 
 
**Additional Information**
It is also possible to compress Kafka messages at the producer level by configuring the Kafka property: compression.codec=1
See [here](https://www.cloudera.com/documentation/kafka/latest/topics/kafka_performance.html#concept_gqw_rcz_yq) for details.
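
As a rough illustration of producer-side compression, the sketch below builds a plain `java.util.Properties` object that could be handed to a Kafka producer. Two assumptions are worth stating: the newer Java producer client names this setting `compression.type` (with values such as `gzip`) rather than the `compression.codec` property mentioned above, and the class name, method name and broker address here are placeholders, not part of this patch or of Atlas configuration.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only; not part of this patch.
class ProducerCompressionSketch {

    // Build producer properties that ask the Kafka client to GZIP-compress batches.
    static Properties gzipProducerProperties(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Assumption: the newer Java producer is in use, which reads "compression.type".
        props.setProperty("compression.type", "gzip");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = gzipProducerProperties("localhost:9092");
        System.out.println(props.getProperty("compression.type"));
    }
}
```

With this approach the compression is handled by the Kafka client rather than by the message format, so the message structure itself is unchanged.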


Diffs
-----

  intg/src/main/java/org/apache/atlas/type/AtlasType.java c99eb7f 
  intg/src/test/java/org/apache/atlas/TestAtlasTypeJSONSerialize.java PRE-CREATION 
  notification/src/main/java/org/apache/atlas/notification/AbstractNotification.java cb44fc6 
  notification/src/main/java/org/apache/atlas/notification/CompressedVersionedMessage.java PRE-CREATION 
  notification/src/main/java/org/apache/atlas/notification/MessageVersion.java 6ef407a 
  notification/src/main/java/org/apache/atlas/notification/VersionedMessage.java 1929eb4 
  notification/src/main/java/org/apache/atlas/notification/VersionedMessageDeserializer.java cc2099e 
  notification/src/test/java/org/apache/atlas/notification/hook/HookNotificationTest.java dd3257e 


Diff: https://reviews.apache.org/r/61761/diff/1/


Testing
-------

**Unit tests**
- New unit tests added to verify compression and decompression of strings across locales.
- Message de-serialization with backward compatibility.

**Functional tests**
- Simulated a scenario where Atlas is on the newer version and hooks are on the older (uncompressed) version.
- Updated the Hive hook and Atlas to the new version and verified Hive hook functionality for Atlas.


Thanks,

Ashutosh Mestry