You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Jalpan Randeri <ja...@gmail.com> on 2019/05/18 07:04:12 UTC

Review Request 70674: Thread Safety and Memory Leaks in HCatRecordObjectInspectorFactory

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70674/
-----------------------------------------------------------

Review request for hive.


Bugs: https://issues.apache.org/jira/browse/HIVE-21752
    https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-21752


Repository: hive-git


Description
-------

Thread Safety and Memory Leaks in HCatRecordObjectInspectorFactory


Summary
======= 
There are a couple of issues in HCatRecordObjectInspectorFactory[1] because it uses a static Java HashMap to cache objects:

1. Java HashMap is not thread safe. This can lead to data corruptions and race conditions in multithreaded servers when two threads update the ObjectInspector.
2. There is no eviction policy and as a result, this can result in memory leaks. If user reads a lot of different schemas, Hive server will start seeing memory pressure, once it start going to have a lot of cached record and object inspectors.

This patch propose to replace the cache using a Guava cache which enables cache evictions and thread safety. Guava cache is already used in Hive ObjectInspectorFactory [2], so this change is consistent with the rest of Hive.


References:
=======
1.  https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/HCatRecordObjectInspectorFactory.java#L44-L47
2.  https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java#L68-L87


Diffs
-----

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/HCatRecordObjectInspectorFactory.java 18bf3a4058973bce3f5239d585c713fe256ac78a 


Diff: https://reviews.apache.org/r/70674/diff/1/


Testing
-------

Unit tests


```
$ mvn test -Dtest=TestJsonSerDe
...
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hive.hcatalog.data.TestJsonSerDe
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.169 s - in org.apache.hive.hcatalog.data.TestJsonSerDe
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  28.030 s
[INFO] Finished at: 2019-05-17T23:50:40-07:00
[INFO] ------------------------------------------------------------------------
```

```
$ mvn test -Dtest=TestLazyHCatRecord
...
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hive.hcatalog.data.TestLazyHCatRecord
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.174 s - in org.apache.hive.hcatalog.data.TestLazyHCatRecord
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  24.033 s
[INFO] Finished at: 2019-05-17T23:51:58-07:00
[INFO] ------------------------------------------------------------------------
randerij:core/ (master) $
```


Thanks,

Jalpan Randeri