Posted to dev@hive.apache.org by Adam Szita via Review Board <no...@reviews.apache.org> on 2019/10/02 20:15:13 UTC

Review Request 71575: HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71575/
-----------------------------------------------------------

Review request for hive.


Bugs: HIVE-22284
    https://issues.apache.org/jira/browse/HIVE-22284


Repository: hive-git


Description
-------

To track which cached buffers belong to which Hive objects, CacheContentsTracker relies on cache tags.

Currently a tag is a plain String that ideally holds the DB name, the table name, and a partition spec, joined with '.' and '/'. This information is derived from the Path of the file being cached, so the resulting tag is sometimes wrong, especially for external tables.
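
As an illustration of why a path-derived tag can go wrong, here is a minimal Java sketch. The names PathTagSketch and tagFromPath and the example paths are hypothetical and are not taken from the patch; the derivation rule is an assumption made for this example, not Hive's actual logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: guess a "db.table/partspec" tag purely from a file path. */
final class PathTagSketch {

  /**
   * Assumed rule: directories containing '=' directly above the file are the
   * partition spec, the directory above them is the table, and the one above
   * that is the database (with an optional ".db" suffix stripped).
   */
  static String tagFromPath(String path) {
    String[] parts = path.split("/");
    int i = parts.length - 2; // skip the file name itself

    // Collect trailing key=value directories as the partition spec.
    List<String> partSpec = new ArrayList<>();
    while (i >= 0 && parts[i].contains("=")) {
      partSpec.add(0, parts[i]);
      i--;
    }

    String table = i >= 0 ? parts[i] : "";
    String db = i >= 1 ? parts[i - 1] : "";
    if (db.endsWith(".db")) {
      db = db.substring(0, db.length() - 3);
    }

    StringBuilder tag = new StringBuilder(db).append('.').append(table);
    for (String p : partSpec) {
      tag.append('/').append(p);
    }
    return tag.toString();
  }
}
```

For a warehouse-style path such as /warehouse/sales.db/orders/ds=2019-10-01/000000_0 this yields sales.orders/ds=2019-10-01. For an external location such as /data/landing/2019/ds=2019-10-01/part-0.orc it yields landing.2019/ds=2019-10-01, which names neither the real database nor the real table.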

There is also a bug in the aggregated stats for a 'parent' tag (the table that a partition belongs to): the overall maxCount and maxSize do not add up to the sum of the corresponding values in the partitions. This happens when buffers are removed from the cache.
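
One way such a mismatch can arise is sketched below in Java. This is an assumption about the mechanism, since the description does not spell it out; TagStatsSketch and its members are hypothetical names, not the CacheContentsTracker code. The parent keeps its own running peak, so when one partition's buffers are evicted before another partition fills up, the parent's peak ends up lower than the sum of the partitions' peaks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of per-tag buffer counters with a table-level parent. */
final class TagStatsSketch {

  static final class Stats {
    long count;
    long maxCount;

    void add() {
      count++;
      maxCount = Math.max(maxCount, count);
    }

    void remove() {
      count--; // maxCount intentionally keeps the historical peak
    }
  }

  /** Table-level stats, updated independently on every add and remove. */
  final Stats parent = new Stats();

  /** Partition-level stats, keyed by a partition spec string. */
  final Map<String, Stats> partitions = new LinkedHashMap<>();

  void bufferAdded(String partition) {
    partitions.computeIfAbsent(partition, k -> new Stats()).add();
    parent.add();
  }

  void bufferRemoved(String partition) {
    partitions.get(partition).remove();
    parent.remove();
  }

  /** Aggregate computed from the children instead of the parent's own peak. */
  long sumOfPartitionMaxCounts() {
    long sum = 0;
    for (Stats s : partitions.values()) {
      sum += s.maxCount;
    }
    return sum;
  }
}
```

With three buffers added to one partition, all three removed again, and three buffers then added to a second partition, parent.maxCount is 3 while sumOfPartitionMaxCounts() is 6.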


Diffs
-----

  llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java a351a193c6bc558bb420049c54b7657cd7d04b7c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/CacheContentsTracker.java 64c0125833af100fd7012b9751d075ab536ad1b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapCacheableBuffer.java f91a5d91a5b739dcbee98a1485ad4c59f6a9057b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapDataBuffer.java 405fca2d4fae9fe0e3fd6d6d1345d55255d6df78 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCache.java 4dd3826a67dfff66ce9c90027d61a9012c0a15e8 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 62d7e5534486b53634de332875c5fd5d336c29b4 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SerDeLowLevelCacheImpl.java 2a39d2d32807a51346baad28b04d87670381b6d5 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SimpleBufferManager.java 41855e171eaa5bf8da638bc62bce3d0d49dc4bae 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java c63ee5f79b4f9fc356f033960e0af1a7b0058038 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 1378a01f44ef774a15f769460833064c6305b2d6 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 2a0c5ca92f3c7431f3c399f309a538f47eb27597 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 85a42f945624c3ca468790772f52363b4064d8fc 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java d414b1405b7672767196b3eaad02baa516169288 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/MetadataCache.java 8400fe98411ed07bd525a51a223fc35423136efb 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java 30dc1b9da2002689b8b1917f46ae3ca24194f3be 
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java af04a51b5536550b2d2f7d3e008cf2b2dea607d4 
  ql/src/java/org/apache/hadoop/hive/llap/LlapHiveUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 241a3001e6e0002377736d6d0e820fde004b0bac 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 210c987b7f580dacda5bdb487af9cf234a738b79 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java a9a9f101948a970e0dbf2f77eeb6f688a88d1cbd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 61e2556b08fe4247f35673f24378505ada20a605 
  storage-api/src/java/org/apache/hadoop/hive/common/io/CacheTag.java PRE-CREATION 
  storage-api/src/java/org/apache/hadoop/hive/common/io/DataCache.java 2ac0a18a5026e76e65c3c3a8b81d5a844c472ed2 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java d7de3619380d24a1aeea2bac9a66485d7d468517 


Diff: https://reviews.apache.org/r/71575/diff/1/


Testing
-------


Thanks,

Adam Szita


Re: Review Request 71575: HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics

Posted by Adam Szita via Review Board <no...@reviews.apache.org>.

(Updated Oct. 8, 2019, 7:53 a.m.)


Review request for hive.


Bugs: HIVE-22284
    https://issues.apache.org/jira/browse/HIVE-22284


Repository: hive-git


Diffs (updated)
-----

  llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java a351a193c6bc558bb420049c54b7657cd7d04b7c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/CacheContentsTracker.java 64c0125833af100fd7012b9751d075ab536ad1b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapCacheableBuffer.java f91a5d91a5b739dcbee98a1485ad4c59f6a9057b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapDataBuffer.java 405fca2d4fae9fe0e3fd6d6d1345d55255d6df78 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCache.java 4dd3826a67dfff66ce9c90027d61a9012c0a15e8 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 62d7e5534486b53634de332875c5fd5d336c29b4 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SerDeLowLevelCacheImpl.java 2a39d2d32807a51346baad28b04d87670381b6d5 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SimpleBufferManager.java 41855e171eaa5bf8da638bc62bce3d0d49dc4bae 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java c63ee5f79b4f9fc356f033960e0af1a7b0058038 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 1378a01f44ef774a15f769460833064c6305b2d6 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 2a0c5ca92f3c7431f3c399f309a538f47eb27597 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 85a42f945624c3ca468790772f52363b4064d8fc 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java d414b1405b7672767196b3eaad02baa516169288 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/MetadataCache.java 8400fe98411ed07bd525a51a223fc35423136efb 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java 30dc1b9da2002689b8b1917f46ae3ca24194f3be 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestCacheContentsTracker.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java af04a51b5536550b2d2f7d3e008cf2b2dea607d4 
  ql/src/java/org/apache/hadoop/hive/llap/LlapHiveUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 241a3001e6e0002377736d6d0e820fde004b0bac 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 210c987b7f580dacda5bdb487af9cf234a738b79 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java a9a9f101948a970e0dbf2f77eeb6f688a88d1cbd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 61e2556b08fe4247f35673f24378505ada20a605 
  storage-api/src/java/org/apache/hadoop/hive/common/io/CacheTag.java PRE-CREATION 
  storage-api/src/java/org/apache/hadoop/hive/common/io/DataCache.java 2ac0a18a5026e76e65c3c3a8b81d5a844c472ed2 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java d7de3619380d24a1aeea2bac9a66485d7d468517 


Diff: https://reviews.apache.org/r/71575/diff/4/

Changes: https://reviews.apache.org/r/71575/diff/3-4/


Testing
-------


Thanks,

Adam Szita


Re: Review Request 71575: HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics

Posted by Adam Szita via Review Board <no...@reviews.apache.org>.

(Updated Oct. 7, 2019, 2:26 p.m.)


Review request for hive.


Changes
-------

Reworked CacheTag to minimize its memory footprint


Bugs: HIVE-22284
    https://issues.apache.org/jira/browse/HIVE-22284


Repository: hive-git


Diffs (updated)
-----

  llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java a351a193c6bc558bb420049c54b7657cd7d04b7c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/CacheContentsTracker.java 64c0125833af100fd7012b9751d075ab536ad1b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapCacheableBuffer.java f91a5d91a5b739dcbee98a1485ad4c59f6a9057b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapDataBuffer.java 405fca2d4fae9fe0e3fd6d6d1345d55255d6df78 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCache.java 4dd3826a67dfff66ce9c90027d61a9012c0a15e8 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 62d7e5534486b53634de332875c5fd5d336c29b4 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SerDeLowLevelCacheImpl.java 2a39d2d32807a51346baad28b04d87670381b6d5 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SimpleBufferManager.java 41855e171eaa5bf8da638bc62bce3d0d49dc4bae 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java c63ee5f79b4f9fc356f033960e0af1a7b0058038 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 1378a01f44ef774a15f769460833064c6305b2d6 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 2a0c5ca92f3c7431f3c399f309a538f47eb27597 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 85a42f945624c3ca468790772f52363b4064d8fc 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java d414b1405b7672767196b3eaad02baa516169288 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/MetadataCache.java 8400fe98411ed07bd525a51a223fc35423136efb 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java 30dc1b9da2002689b8b1917f46ae3ca24194f3be 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestCacheContentsTracker.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java af04a51b5536550b2d2f7d3e008cf2b2dea607d4 
  ql/src/java/org/apache/hadoop/hive/llap/LlapHiveUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 241a3001e6e0002377736d6d0e820fde004b0bac 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 210c987b7f580dacda5bdb487af9cf234a738b79 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java a9a9f101948a970e0dbf2f77eeb6f688a88d1cbd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 61e2556b08fe4247f35673f24378505ada20a605 
  storage-api/src/java/org/apache/hadoop/hive/common/io/CacheTag.java PRE-CREATION 
  storage-api/src/java/org/apache/hadoop/hive/common/io/DataCache.java 2ac0a18a5026e76e65c3c3a8b81d5a844c472ed2 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java d7de3619380d24a1aeea2bac9a66485d7d468517 


Diff: https://reviews.apache.org/r/71575/diff/3/

Changes: https://reviews.apache.org/r/71575/diff/2-3/


Testing
-------


Thanks,

Adam Szita


Re: Review Request 71575: HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics

Posted by Adam Szita via Review Board <no...@reviews.apache.org>.

(Updated Oct. 3, 2019, 3:20 p.m.)


Review request for hive.


Changes
-------

Added a test; fixed FindBugs and Checkstyle errors


Bugs: HIVE-22284
    https://issues.apache.org/jira/browse/HIVE-22284


Repository: hive-git


Diffs (updated)
-----

  llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java a351a193c6bc558bb420049c54b7657cd7d04b7c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/CacheContentsTracker.java 64c0125833af100fd7012b9751d075ab536ad1b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapCacheableBuffer.java f91a5d91a5b739dcbee98a1485ad4c59f6a9057b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapDataBuffer.java 405fca2d4fae9fe0e3fd6d6d1345d55255d6df78 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCache.java 4dd3826a67dfff66ce9c90027d61a9012c0a15e8 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 62d7e5534486b53634de332875c5fd5d336c29b4 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SerDeLowLevelCacheImpl.java 2a39d2d32807a51346baad28b04d87670381b6d5 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SimpleBufferManager.java 41855e171eaa5bf8da638bc62bce3d0d49dc4bae 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java c63ee5f79b4f9fc356f033960e0af1a7b0058038 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 1378a01f44ef774a15f769460833064c6305b2d6 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 2a0c5ca92f3c7431f3c399f309a538f47eb27597 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 85a42f945624c3ca468790772f52363b4064d8fc 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java d414b1405b7672767196b3eaad02baa516169288 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/MetadataCache.java 8400fe98411ed07bd525a51a223fc35423136efb 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java 30dc1b9da2002689b8b1917f46ae3ca24194f3be 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestCacheContentsTracker.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java af04a51b5536550b2d2f7d3e008cf2b2dea607d4 
  ql/src/java/org/apache/hadoop/hive/llap/LlapHiveUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 241a3001e6e0002377736d6d0e820fde004b0bac 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 210c987b7f580dacda5bdb487af9cf234a738b79 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java a9a9f101948a970e0dbf2f77eeb6f688a88d1cbd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 61e2556b08fe4247f35673f24378505ada20a605 
  storage-api/src/java/org/apache/hadoop/hive/common/io/CacheTag.java PRE-CREATION 
  storage-api/src/java/org/apache/hadoop/hive/common/io/DataCache.java 2ac0a18a5026e76e65c3c3a8b81d5a844c472ed2 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java d7de3619380d24a1aeea2bac9a66485d7d468517 


Diff: https://reviews.apache.org/r/71575/diff/2/

Changes: https://reviews.apache.org/r/71575/diff/1-2/


Testing
-------


Thanks,

Adam Szita