You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Sergey Shelukhin <se...@hortonworks.com> on 2017/06/21 20:31:40 UTC

Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/
-----------------------------------------------------------

Review request for hive, Gopal V and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c73f1a1a7d 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
  ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
  ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 


Diff: https://reviews.apache.org/r/60289/diff/1/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On Sept. 8, 2017, 6:24 p.m., Prasanth_J wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
> > Lines 52 (patched)
> > <https://reviews.apache.org/r/60289/diff/4/?file=1809787#file1809787line54>
> >
> >     This file looks renamed from Parquet to generic MetadataCache but contains OrcSpecific objects. If it is generic remove orc related stuff or rename the class if it is orc specific?

It's a generic class for buffers, and contains some format-specific sub-parts


> On Sept. 8, 2017, 6:24 p.m., Prasanth_J wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
> > Lines 106 (patched)
> > <https://reviews.apache.org/r/60289/diff/4/?file=1809787#file1809787line108>
> >
> >     why lock and unlock notification back to back?

To "use" the item as far as eviction policy is concerned.


> On Sept. 8, 2017, 6:24 p.m., Prasanth_J wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
> > Lines 134 (patched)
> > <https://reviews.apache.org/r/60289/diff/4/?file=1809787#file1809787line136>
> >
> >     can you create follow up? will be useful for debugging. or this could be jmx info. something that can be looked easily instead of logs.

HIVE-17524


> On Sept. 8, 2017, 6:24 p.m., Prasanth_J wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
> > Lines 235 (patched)
> > <https://reviews.apache.org/r/60289/diff/4/?file=1809787#file1809787line237>
> >
> >     is readFully fixed in hadoop 2.8? if so, now that hive moved to 2.8.0 can that be used here and other places?

We may be running against 2.7; also it anyway does the same thing and requires a cast, so there's no point


> On Sept. 8, 2017, 6:24 p.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
> > Lines 148 (patched)
> > <https://reviews.apache.org/r/60289/diff/4/?file=1809792#file1809792line148>
> >
> >     May be we should start using TypeDescription everywhere. OrcProto.Type can be huge object when compared to TypeDescription.

added HIVE-17525


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review184992
-----------------------------------------------------------


On Sept. 1, 2017, 12:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated Sept. 1, 2017, 12:41 a.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e4b09a2cdd 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c5248ceb5f 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java f42622b892 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 6edd84b8b0 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java b5db3029d1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java b61a8ca022 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69a9f9f35e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 690cce798e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java cdd58df370 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java d47ba6b31a 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 77b7f5a2f7 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
>   storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java 403c3ada61 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review184992
-----------------------------------------------------------




llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 52 (patched)
<https://reviews.apache.org/r/60289/#comment261216>

    This file looks renamed from Parquet to generic MetadataCache but contains OrcSpecific objects. If it is generic remove orc related stuff or rename the class if it is orc specific?



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 106 (patched)
<https://reviews.apache.org/r/60289/#comment261217>

    why lock and unlock notification back to back?



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 134 (patched)
<https://reviews.apache.org/r/60289/#comment261218>

    can you create follow up? will be useful for debugging. or this could be jmx info. something that can be looked easily instead of logs.



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 225 (patched)
<https://reviews.apache.org/r/60289/#comment261221>

    smallBuffer is not added LlapMetadataBuffers?



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 235 (patched)
<https://reviews.apache.org/r/60289/#comment261222>

    is readFully fixed in hadoop 2.8? if so, now that hive moved to 2.8.0 can that be used here and other places?



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java
Lines 244 (patched)
<https://reviews.apache.org/r/60289/#comment261223>

    remove of change to debug/trace log



ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
Lines 148 (patched)
<https://reviews.apache.org/r/60289/#comment261224>

    May be we should start using TypeDescription everywhere. OrcProto.Type can be huge object when compared to TypeDescription.


- Prasanth_J


On Sept. 1, 2017, 12:41 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated Sept. 1, 2017, 12:41 a.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e4b09a2cdd 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c5248ceb5f 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java f42622b892 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 6edd84b8b0 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java b5db3029d1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java b61a8ca022 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69a9f9f35e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 690cce798e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java cdd58df370 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java d47ba6b31a 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 77b7f5a2f7 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
>   storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java 403c3ada61 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/
-----------------------------------------------------------

(Updated Sept. 13, 2017, 1:58 a.m.)


Review request for hive, Gopal V and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 24c5db0e47 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c5248ceb5f 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java f42622b892 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 6edd84b8b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java b5db3029d1 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java b61a8ca022 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69a9f9f35e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 690cce798e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java cdd58df370 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java d47ba6b31a 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
  ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 77b7f5a2f7 
  ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java 403c3ada61 


Diff: https://reviews.apache.org/r/60289/diff/5/

Changes: https://reviews.apache.org/r/60289/diff/4-5/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/
-----------------------------------------------------------

(Updated Sept. 1, 2017, 12:41 a.m.)


Review request for hive, Gopal V and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e4b09a2cdd 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c5248ceb5f 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java f42622b892 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 6edd84b8b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java b5db3029d1 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/ParquetMetadataCacheImpl.java b61a8ca022 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69a9f9f35e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 690cce798e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java cdd58df370 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java d47ba6b31a 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
  ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 77b7f5a2f7 
  ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java 403c3ada61 


Diff: https://reviews.apache.org/r/60289/diff/4/

Changes: https://reviews.apache.org/r/60289/diff/3-4/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/
-----------------------------------------------------------

(Updated July 25, 2017, 9:32 p.m.)


Review request for hive, Gopal V and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java dd9ad71b5a 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java 0cbc8f6f4c 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 35b9d1f942 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java a2eb82947f 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java de49fc84bb 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 3f5f99ce35 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java cdd58df370 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java d47ba6b31a 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
  ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 77b7f5a2f7 
  ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 


Diff: https://reviews.apache.org/r/60289/diff/3/

Changes: https://reviews.apache.org/r/60289/diff/2-3/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/
-----------------------------------------------------------

(Updated June 29, 2017, 9:21 p.m.)


Review request for hive, Gopal V and Prasanth_J.


Repository: hive-git


Description
-------

see jira


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 34a663d45b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java 0cbc8f6f4c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 82ea0c0e5e 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
  ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
  ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
  ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 


Diff: https://reviews.apache.org/r/60289/diff/2/

Changes: https://reviews.apache.org/r/60289/diff/1-2/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On June 29, 2017, 6:38 a.m., Prasanth_J wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Line 2974 (original)
> > <https://reviews.apache.org/r/60289/diff/1/?file=1757644#file1757644line2974>
> >
> >     Do we need this for offheap cache? For smaller cache, we don't want metadata objects taking up 100% of cache. As long as we are storing only the serialized footers, this shouldn't be a problem.

I think it's ok to have metadata taking up all of the cache. If anything it might be better to have all metadata and less of the data...


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review179152
-----------------------------------------------------------


On June 21, 2017, 8:31 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated June 21, 2017, 8:31 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c73f1a1a7d 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On June 29, 2017, 6:38 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
> > Lines 1685 (patched)
> > <https://reviews.apache.org/r/60289/diff/1/?file=1757656#file1757656line1779>
> >
> >     This looks complicated. It will be better if ORC can provide API that returns stripe footer and indexes as ByteBuffer which can be directly cached. Stripe footers and Indexes could be stored with medium priority. 
> >     
> >     Priorities could be:
> >     Serialized file footer - HIGH (this is required to not choke NN, with config change this is already part of split)
> >     Index + Stripe footer - MEDIUM (with locality re-reading these will not be a problem)
> >     Data - LOW (same as reading index, stripe footer)
> >     
> >     Since backward seeks no longer close connections for cloud storage, reading index and stripe could be done faster. 
> >     
> >     I think it would be easier,
> >     if ORC and parquet readers can provide 2 high level interfaces
> >     - Interface to read footers, index as ByteBuffers which LLAP will cache
> >     - Reader interface to accept ByteBuffer from which footers and index can be read which LLAP or file will provide

I filed some ORC JIRAs to improve APIs... for now a no-op


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review179152
-----------------------------------------------------------


On June 21, 2017, 8:31 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated June 21, 2017, 8:31 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c73f1a1a7d 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On June 29, 2017, 6:38 a.m., Prasanth_J wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
> > Lines 614 (patched)
> > <https://reviews.apache.org/r/60289/diff/1/?file=1757647#file1757647line665>
> >
> >     can this all be baked in OrcFileMetadata class? since stats, stripes are derived from tail, will be useful if we can just store OrcTail in OrcFileMetadata and lazily construct stats and stripes when getter is invoked.

why? I'm assuming the reader would always need those and they are already in the object


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review179152
-----------------------------------------------------------


On June 21, 2017, 8:31 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated June 21, 2017, 8:31 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c73f1a1a7d 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 60289: HIVE-15665 LLAP: OrcFileMetadata objects in cache can impact heap usage

Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60289/#review179152
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Line 2974 (original)
<https://reviews.apache.org/r/60289/#comment253877>

    Do we need this for offheap cache? For smaller cache, we don't want metadata objects taking up 100% of cache. As long as we are storing only the serialized footers, this shouldn't be a problem.



llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
Lines 614 (patched)
<https://reviews.apache.org/r/60289/#comment253691>

    can this all be baked in OrcFileMetadata class? since stats, stripes are derived from tail, will be useful if we can just store OrcTail in OrcFileMetadata and lazily construct stats and stripes when getter is invoked.



llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
Lines 715 (patched)
<https://reviews.apache.org/r/60289/#comment253764>

    nit: remove or add to trace log?



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java
Line 142 (original), 116 (patched)
<https://reviews.apache.org/r/60289/#comment253878>

    nit: can metadata and stripeMetadata be merged to single CHM? buffer needs to implement a getKey() interface method. notifyEvicted could be merged then.



llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java
Lines 143 (patched)
<https://reviews.apache.org/r/60289/#comment253879>

    same as below. get/put methods looks duplicated which could be merged if Object and OrcBatchKey can implement common interface.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
Lines 133 (patched)
<https://reviews.apache.org/r/60289/#comment253880>

    nit: rename to reader or file schema?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
Lines 212 (patched)
<https://reviews.apache.org/r/60289/#comment253881>

    get enum length instead? adding new streams to ORC will break this.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java
Lines 1685 (patched)
<https://reviews.apache.org/r/60289/#comment253882>

    This looks complicated. It will be better if ORC can provide API that returns stripe footer and indexes as ByteBuffer which can be directly cached. Stripe footers and Indexes could be stored with medium priority. 
    
    Priorities could be:
    Serialized file footer - HIGH (this is required to not choke NN, with config change this is already part of split)
    Index + Stripe footer - MEDIUM (with locality re-reading these will not be a problem)
    Data - LOW (same as reading index, stripe footer)
    
    Since backward seeks no longer close connections for cloud storage, reading index and stripe could be done faster. 
    
    I think it would be easier,
    if ORC and parquet readers can provide 2 high level interfaces
    - Interface to read footers, index as ByteBuffers which LLAP will cache
    - Reader interface to accept ByteBuffer from which footers and index can be read which LLAP or file will provide



ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java
Lines 52 (patched)
<https://reviews.apache.org/r/60289/#comment253883>

    These interfaces are already in org.apache.orc.Reader


- Prasanth_J


On June 21, 2017, 8:31 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60289/
> -----------------------------------------------------------
> 
> (Updated June 21, 2017, 8:31 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java c73f1a1a7d 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 53c9bae5c1 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 2a76f5c4da 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java dc053ee7cf 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java b9d7a77d5b 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java 601b622b49 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java 4565d11988 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java 13c7767a3b 
>   llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java 03a955c6f7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 7540e72b53 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java d5807b77e2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3ceb 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 
>   ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 
>   ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 
>   ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out b799527e30 
> 
> 
> Diff: https://reviews.apache.org/r/60289/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>