Posted to dev@ranger.apache.org by Madhan Neethiraj <ma...@apache.org> on 2022/09/28 05:44:56 UTC

Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Sept. 28, 2022, 5:44 a.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Summary (updated)
-----------------

RANGER-3934: optimization in loading of tags into cache


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- updated several retrievals of the XXService JPA object, replacing them with retrieval of serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query
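
The Map<Long, Long> to Set<Long> item can be illustrated with a minimal Java sketch. The class and method names below are hypothetical and are not taken from the Ranger code; the sketch only shows that a map whose key and value are always the same ID carries no more information than a set of those IDs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; names are not from the Ranger code base.
public class IdCollectionSketch {
    // Before: the ID is stored as both key and value, so the value is redundant.
    static Map<Long, Long> collectAsMap(List<Long> ids) {
        Map<Long, Long> ret = new LinkedHashMap<>();
        for (Long id : ids) {
            ret.put(id, id);
        }
        return ret;
    }

    // After: a set records the same distinct IDs without the redundant value.
    static Set<Long> collectAsSet(List<Long> ids) {
        return new LinkedHashSet<>(ids);
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(7L, 3L, 7L, 9L);
        // Compares the distinct IDs held by the map with those held by the set.
        System.out.println(collectAsMap(ids).keySet().equals(collectAsSet(ids)));
    }
}
```

Both forms de-duplicate repeated IDs, so code that only iterates over the keys should behave the same with either collection.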


Diffs
-----

  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/1/


Testing
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resources & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
- there is scope for further optimization in RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time in updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds


Thanks,

Madhan Neethiraj


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Abhay Kulkarni <ak...@hortonworks.com>.

> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 102 (original), 105 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270242#file2270242line108>
> >
> >     This needs a review to reduce the M x N complexity. If there is a map of <resource-id, service-resource> maintained (in memory) for existing service-tags, then the loop at line 115 may reduce to a map lookup by resource-id.
> 
> Madhan Neethiraj wrote:
>     Yes. Updating this block to replace the inner loop with a map lookup will result in significant improvement. I was planning to address that in a subsequent patch. Without the current fix, delta processing didn't complete even after 1h35m; with the fix, delta processing completes within a few minutes.
>     
>     Also, I think we should review in-place updating of serviceTags instance passed as argument. This would be an issue if that instance is used in another thread while it is being updated.

I don't think the serviceTags instance passed as an argument will be used in another thread, given that it is called within a lock.tryLock block from the getLatestOrCached() function. (Caveat: Of course, there is a case where the tryLock times out while the updates to cached service-tags are going on. It is also called from RangerTagRefresher.populateTag() in the plugins. If the tag download interval is too small, then there is a possibility of two threads accessing service-tags concurrently.)


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Line 1251 (original), 1250 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1255>
> >
> >     Although this change reduces dependency on JPA cache behavior, it also removes one consistency check on the sanity of deltas and the state of the database. Perhaps performing all of these computations/database interactions in a dedicated, new read-only transaction will obviate the need for such peephole optimizations. This also applies to changes around line #1269.
> 
> Madhan Neethiraj wrote:
>     Yes. Use of read-only transactions can help avoid JPA overheads. However, wouldn't the optimizations introduced in this patch still be helpful to avoid overheads?

Yes, the current optimizations will help, but for the sake of maintainability in the face of future changes, the real solution may be to explore using a separate, read-only transaction for delta creation.


- Abhay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224714
-----------------------------------------------------------




Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.

> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 102 (original), 105 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270242#file2270242line108>
> >
> >     This needs a review to reduce the M x N complexity. If there is a map of <resource-id, service-resource> maintained (in memory) for existing service-tags, then the loop at line 115 may reduce to a map lookup by resource-id.
> 
> Madhan Neethiraj wrote:
>     Yes. Updating this block to replace the inner loop with a map lookup will result in significant improvement. I was planning to address that in a subsequent patch. Without the current fix, delta processing didn't complete even after 1h35m; with the fix, delta processing completes within a few minutes.
>     
>     Also, I think we should review in-place updating of serviceTags instance passed as argument. This would be an issue if that instance is used in another thread while it is being updated.
> 
> Abhay Kulkarni wrote:
>     I don't think the serviceTags instance passed as an argument will be used in another thread, given that it is called within a lock.tryLock block from the getLatestOrCached() function. (Caveat: Of course, there is a case where the tryLock times out while the updates to cached service-tags are going on. It is also called from RangerTagRefresher.populateTag() in the plugins. If the tag download interval is too small, then there is a possibility of two threads accessing service-tags concurrently.)

Consider the following case:
- thread-1 called getLatestOrCached() and obtained ServiceTags instance; and is using this instance
- thread-2 calls getLatestOrCached() within a very short period, and there were few tag details to be applied. In this case, ServiceTags instance being used by thread-1 will be updated by thread-2 - right?
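
The case above can be sketched in a single thread. The `Cache` class below is a hypothetical stand-in, not Ranger's getLatestOrCached() or ServiceTags; it only shows that a caller holding the cached instance observes a delta applied in place, whereas applying the delta to a copy and swapping the reference leaves the caller's instance unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cache of service tags; not Ranger code.
public class SharedInstanceSketch {
    static class Cache {
        private Map<Long, String> tags = new HashMap<>();

        // Returns the cached instance itself, not a copy.
        Map<Long, String> get() {
            return tags;
        }

        // In-place update: mutates the instance earlier callers may still hold.
        void applyDeltaInPlace(Map<Long, String> delta) {
            tags.putAll(delta);
        }

        // Copy-then-swap: earlier callers keep an unchanged snapshot.
        void applyDeltaOnCopy(Map<Long, String> delta) {
            Map<Long, String> copy = new HashMap<>(tags);
            copy.putAll(delta);
            tags = copy;
        }
    }

    public static void main(String[] args) {
        Map<Long, String> delta = new HashMap<>();
        delta.put(1L, "PII");

        Cache inPlace = new Cache();
        Map<Long, String> heldByFirstCaller = inPlace.get();
        inPlace.applyDeltaInPlace(delta); // what a second caller would trigger
        System.out.println("in-place, entries seen by first caller: " + heldByFirstCaller.size());

        Cache copying = new Cache();
        Map<Long, String> snapshot = copying.get();
        copying.applyDeltaOnCopy(delta);
        System.out.println("copy-then-swap, entries seen by first caller: " + snapshot.size());
    }
}
```

A real concurrent version would also need safe publication of the swapped reference (for example a volatile field), which this single-threaded sketch leaves out.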


- Madhan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224714
-----------------------------------------------------------




Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.

> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 102 (original), 105 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270242#file2270242line108>
> >
> >     This needs a review to reduce the M x N complexity. If there is a map of <resource-id, service-resource> maintained (in memory) for existing service-tags, then the loop at line 115 may reduce to a map lookup by resource-id.

Yes. Updating this block to replace the inner loop with a map lookup will result in significant improvement. I was planning to address that in a subsequent patch. Without the current fix, delta processing didn't complete even after 1h35m; with the fix, delta processing completes within a few minutes.

Also, I think we should review in-place updating of serviceTags instance passed as argument. This would be an issue if that instance is used in another thread while it is being updated.


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Lines 1217 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1219>
> >
> >     This change may result in processing the same tag or resource id multiple times if the same object appears multiple times in the list of deltas. Does the change cause significant performance improvement? If not, please consider either reverting it or replacing the Map with a HashSet to ensure that each object appears at most once in the subsequent loop.

This change replaces use of Map<Long, Long> with Set<Long> - since the map was populated with the same value for key and value. This change doesn't introduce any change in behavior.


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Line 1251 (original), 1250 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1255>
> >
> >     Although this change reduces dependency on JPA cache behavior, it also removes one consistency check on the sanity of deltas and the state of the database. Perhaps performing all of these computations/database interactions in a dedicated, new read-only transaction will obviate the need for such peephole optimizations. This also applies to changes around line #1269.

Yes. Use of read-only transactions can help avoid JPA overheads. However, wouldn't the optimizations introduced in this patch still be helpful to avoid overheads?


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Lines 1282 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1290>
> >
> >     Please review if this code fragment is required after reviewing the following API and its callers. The changes to each of tags, resources and tag-resource-mappings may already be completely tracked in the change-records created in the delta-tables.
> >     
> >     ---
> >     
> >     XXServiceVersionInfoDao.updateTagVersionAndTagUpdateTime().

The earlier version used rangerTagResourceMapService.getTagIdsForResourceId() to get IDs of tags associated with a given resource. This patch replaces that logic and obtains tag IDs from the XXServiceResource.tags field.

It is not clear how updateTagVersionAndTagUpdateTime() can help here. Can you please add details? Thanks.


- Madhan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224714
-----------------------------------------------------------




Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Abhay Kulkarni <ak...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224714
-----------------------------------------------------------




agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
Line 102 (original), 105 (patched)
<https://reviews.apache.org/r/74144/#comment313494>

    This needs a review to reduce the M x N complexity. If there is a map of <resource-id, service-resource> maintained (in memory) for existing service-tags, then the loop at line 115 may reduce to a map lookup by resource-id.
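
    A minimal sketch of this suggestion, assuming a simplified Resource type with only an id and a value; neither it nor the method names come from RangerServiceTagsDeltaUtil. The nested-loop version scans the existing resources for each delta entry, while the map version builds the <resource-id, service-resource> index once and then does one lookup per delta entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; not the actual Ranger classes.
public class DeltaLookupSketch {
    static class Resource {
        final long id;
        final String value;

        Resource(long id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    // O(M x N): for each delta entry, scan the list for a matching id.
    static List<Resource> applyWithNestedLoop(List<Resource> existing, List<Resource> delta) {
        List<Resource> ret = new ArrayList<>(existing);
        for (Resource changed : delta) {
            boolean found = false;
            for (int i = 0; i < ret.size(); i++) {
                if (ret.get(i).id == changed.id) {
                    ret.set(i, changed); // replace the existing entry
                    found = true;
                    break;
                }
            }
            if (!found) {
                ret.add(changed); // new resource
            }
        }
        return ret;
    }

    // O(M + N): index existing resources by id once, then one lookup per delta entry.
    // Assumes ids in 'existing' are unique.
    static List<Resource> applyWithMap(List<Resource> existing, List<Resource> delta) {
        Map<Long, Resource> byId = new LinkedHashMap<>();
        for (Resource r : existing) {
            byId.put(r.id, r);
        }
        for (Resource changed : delta) {
            byId.put(changed.id, changed); // replaces a match or appends a new entry
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Resource> existing = new ArrayList<>();
        existing.add(new Resource(1, "a"));
        existing.add(new Resource(2, "b"));
        List<Resource> delta = new ArrayList<>();
        delta.add(new Resource(2, "b2"));
        delta.add(new Resource(3, "c"));
        for (Resource r : applyWithMap(existing, delta)) {
            System.out.println(r.id + "=" + r.value);
        }
    }
}
```

    LinkedHashMap is used here only so that the result keeps the same order as the nested-loop version; a plain HashMap would give the same lookup cost.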



security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
Lines 1217 (patched)
<https://reviews.apache.org/r/74144/#comment313490>

    This change may result in processing the same tag or resource id multiple times if the same object appears multiple times in the list of deltas. Does the change cause significant performance improvement? If not, please consider either reverting it or replacing the Map with a HashSet to ensure that each object appears at most once in the subsequent loop.



security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
Line 1251 (original), 1250 (patched)
<https://reviews.apache.org/r/74144/#comment313491>

    Although this change reduces dependency on JPA cache behavior, it also removes one consistency check on the sanity of deltas and the state of the database. Perhaps performing all of these computations/database interactions in a dedicated, new read-only transaction will obviate the need for such peephole optimizations. This also applies to changes around line #1269.



security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
Lines 1282 (patched)
<https://reviews.apache.org/r/74144/#comment313492>

    Please review if this code fragment is required after reviewing the following API and its callers. The changes to each of tags, resources and tag-resource-mappings may already be completely tracked in the change-records created in the delta-tables.
    
    ---
    
    XXServiceVersionInfoDao.updateTagVersionAndTagUpdateTime().


- Abhay Kulkarni




Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Abhay Kulkarni <ak...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224734
-----------------------------------------------------------


Ship it!




Please update the tests run and the improvements seen after optimizing applyDelta() function.

- Abhay Kulkarni




Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Oct. 3, 2022, 6:34 p.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Changes
-------

updated test stats


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- updated several retrievals of the XXService JPA object, replacing them with retrieval of serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query


Diffs
-----

  agents-common/src/main/java/org/apache/ranger/plugin/model/RangerTagDef.java c787beca5 
  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/6/


Testing (updated)
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resources & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 15 seconds


Thanks,

Madhan Neethiraj


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Oct. 1, 2022, 12:38 a.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Changes
-------

addressed review comments


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- updated several retrievals of the XXService JPA object, replacing them with retrieval of serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query


Diffs (updated)
-----

  agents-common/src/main/java/org/apache/ranger/plugin/model/RangerTagDef.java c787beca5 
  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/5/

Changes: https://reviews.apache.org/r/74144/diff/4-5/


Testing
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
- there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds


Thanks,

Madhan Neethiraj


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Sept. 29, 2022, 8:07 p.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Changes
-------

- updated to capture tagDef changes as well in tag-delta
- earlier update missed adding new resources to the result list; fixed


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query


Diffs (updated)
-----

  agents-common/src/main/java/org/apache/ranger/plugin/model/RangerTagDef.java c787beca5 
  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/4/

Changes: https://reviews.apache.org/r/74144/diff/3-4/


Testing
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
- there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds


Thanks,

Madhan Neethiraj


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Abhay Kulkarni <ak...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224722
-----------------------------------------------------------




agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
Line 122 (original), 115 (patched)
<https://reviews.apache.org/r/74144/#comment313508>

    If the updated resource was added in this set of deltas, then it would exist in the resourcesToAdd map. It may be more efficient to check if it exists in the resourcesToAdd map, and if it does, then do not add it to resourcesToRemove map. Please review.



agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
Lines 123 (patched)
<https://reviews.apache.org/r/74144/#comment313507>

    If the resource is added and removed in the same set of deltas, then it may be more efficient to check its existence in the resourcesToAdd map before removing it. If it existed in resourcesToAdd then there is no need to add it to resourcesToRemove map. That can potentially save the execution of loop at line 138.
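
The check suggested in the two comments above could look roughly like the following sketch. ResourceChange and DeltaMergeSketch are hypothetical simplifications rather than the classes in the patch, and the sketch assumes that an id present in resourcesToAdd belongs to a resource created within the same set of deltas, so that it is not yet in the cache.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of one delta entry: a resource id plus a "deleted" flag.
final class ResourceChange {
    final long    id;
    final boolean deleted;

    ResourceChange(long id, boolean deleted) {
        this.id      = id;
        this.deleted = deleted;
    }
}

final class DeltaMergeSketch {
    final Map<Long, ResourceChange> resourcesToAdd    = new LinkedHashMap<>();
    final Set<Long>                 resourcesToRemove = new HashSet<>();

    // Assumption: an id found in resourcesToAdd was created within this set of
    // deltas, so the cache does not hold it yet and nothing needs to be removed.
    void collect(List<ResourceChange> delta) {
        for (ResourceChange change : delta) {
            if (!change.deleted) {
                resourcesToAdd.put(change.id, change);
            } else if (resourcesToAdd.remove(change.id) == null) {
                // not added earlier in this delta: it must be removed from the cache
                resourcesToRemove.add(change.id);
            }
        }
    }

    public static void main(String[] args) {
        DeltaMergeSketch sketch = new DeltaMergeSketch();

        sketch.collect(Arrays.asList(
                new ResourceChange(1L, false),   // added in this delta
                new ResourceChange(1L, true),    // removed again in the same delta
                new ResourceChange(2L, true)));  // removal of a previously cached resource

        System.out.println("toAdd=" + sketch.resourcesToAdd.keySet() + ", toRemove=" + sketch.resourcesToRemove);
    }
}
```

If resourcesToRemove ends up empty, a later pass over the cached resources that only serves removals can be skipped altogether.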


- Abhay Kulkarni


On Sept. 29, 2022, 6:46 a.m., Madhan Neethiraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74144/
> -----------------------------------------------------------
> 
> (Updated Sept. 29, 2022, 6:46 a.m.)
> 
> 
> Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.
> 
> 
> Bugs: RANGER-3934
>     https://issues.apache.org/jira/browse/RANGER-3934
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> - replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
> - replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
> - avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
> - updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query
> 
> 
> Diffs
> -----
> 
>   agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
>   security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
>   security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
>   security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
>   security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
>   security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
>   security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
>   security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 
> 
> 
> Diff: https://reviews.apache.org/r/74144/diff/3/
> 
> 
> Testing
> -------
> 
> - with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
>   -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
>   -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
> - there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds
> 
> 
> Thanks,
> 
> Madhan Neethiraj
> 
>


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Sept. 29, 2022, 6:46 a.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query


Diffs (updated)
-----

  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/3/

Changes: https://reviews.apache.org/r/74144/diff/2-3/


Testing
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
- there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds


Thanks,

Madhan Neethiraj


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.

> On Sept. 29, 2022, 12:31 a.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 107 (original), 101 (patched)
> > <https://reviews.apache.org/r/74144/diff/1-2/?file=2270242#file2270242line109>
> >
> >     This conversion will be done every time applyDeltas() is called. This may be an expensive operation. Is it possible to cache this map in ServiceTags but ignore it when serializing ServiceTags object?

Holding this map in ServiceTags will result in the memory not being released until the ServiceTags instance goes away. Currently this map is destroyed after the applyDeltas() method returns.

With a standalone program, found the following stats to create and populate maps:
 - 1m entries: memory=74mb, time=90ms
 - 2m entries: memory=141mb, time=130ms
 - 3m entries: memory=197mb, time=168ms

I think it will help to release the memory after completion of applyDeltas().
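
A standalone measurement of this kind could be sketched as below. This is not the program used for the numbers above: the key and value types, the class name MapCostSketch and the way memory is sampled are all assumptions, and the figures will vary with JVM, heap settings and hardware.

```java
import java.util.HashMap;
import java.util.Map;

final class MapCostSketch {
    // Populates a map keyed by id; the value type is a placeholder for the
    // cached object (an assumption, not the type used in the patch).
    static Map<Long, Object> populate(int count) {
        Map<Long, Object> ret = new HashMap<>();

        for (long id = 0; id < count; id++) {
            ret.put(id, new Object());
        }

        return ret;
    }

    // Rough estimate of heap in use; good enough for a coarse comparison only.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();

        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        for (int count : new int[] { 1_000_000, 2_000_000, 3_000_000 }) {
            System.gc(); // only a hint to the JVM; the readings stay approximate

            long memBefore = usedMemory();
            long start     = System.nanoTime();

            Map<Long, Object> map = populate(count);

            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long usedMb    = (usedMemory() - memBefore) / (1024 * 1024);

            System.out.println(map.size() + " entries: memory~" + usedMb + "mb, time~" + elapsedMs + "ms");
        }
    }
}
```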


> On Sept. 29, 2022, 12:31 a.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 142 (original), 130 (patched)
> > <https://reviews.apache.org/r/74144/diff/1-2/?file=2270242#file2270242line151>
> >
> >     If any resources are deleted or updated, this code iterates over all service-resources. Will this be expensive, as before?

This is a single iteration over serviceResources, not a nested one; hence it will not be as expensive as before.
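
To illustrate the difference, here is a minimal sketch of a single pass that filters cached resources against a set of ids. SinglePassSketch and its method are hypothetical, and plain Long ids stand in for the service-resource objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SinglePassSketch {
    // One pass over the cached ids; each HashSet lookup takes constant time on
    // average, so the cost grows with cached + removed rather than cached * removed.
    static List<Long> withoutRemoved(List<Long> cachedResourceIds, Set<Long> idsToRemove) {
        List<Long> ret = new ArrayList<>(cachedResourceIds.size());

        for (Long id : cachedResourceIds) {
            if (!idsToRemove.contains(id)) {
                ret.add(id);
            }
        }

        return ret;
    }

    public static void main(String[] args) {
        List<Long> cached   = Arrays.asList(1L, 2L, 3L, 4L);
        Set<Long>  toRemove = new HashSet<>(Arrays.asList(2L, 4L));

        System.out.println(withoutRemoved(cached, toRemove));
    }
}
```

A nested version would scan the cached list once per removed id, which is where the earlier cost came from when both collections were large.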


- Madhan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224719
-----------------------------------------------------------


On Sept. 29, 2022, 6:46 a.m., Madhan Neethiraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74144/
> -----------------------------------------------------------
> 
> (Updated Sept. 29, 2022, 6:46 a.m.)
> 
> 
> Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.
> 
> 
> Bugs: RANGER-3934
>     https://issues.apache.org/jira/browse/RANGER-3934
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> - replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
> - replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
> - avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
> - updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query
> 
> 
> Diffs
> -----
> 
>   agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
>   security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
>   security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
>   security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
>   security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
>   security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
>   security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
>   security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 
> 
> 
> Diff: https://reviews.apache.org/r/74144/diff/3/
> 
> 
> Testing
> -------
> 
> - with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
>   -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
>   -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
> - there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds
> 
> 
> Thanks,
> 
> Madhan Neethiraj
> 
>


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Abhay Kulkarni <ak...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224719
-----------------------------------------------------------




agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
Line 107 (original), 101 (patched)
<https://reviews.apache.org/r/74144/#comment313502>

    This conversion will be done every time applyDeltas() is called. This may be an expensive operation. Is it possible to cache this map in ServiceTags but ignore it when serializing ServiceTags object?
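
One way such a cached, non-serialized map could be held is sketched below. CachedTagsSketch is a hypothetical class, not ServiceTags, and Java's built-in serialization is used only to keep the example self-contained; a JSON serializer would need its own mechanism for skipping the field. The trade-off is that the map stays in memory for as long as the owning object does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder, not the real ServiceTags class.
final class CachedTagsSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final List<Long> resourceIds = new ArrayList<>();

    // 'transient' keeps the derived index out of Java serialization.
    private transient Map<Long, Integer> indexById;

    // Builds the index lazily and reuses it on later calls.
    Map<Long, Integer> indexById() {
        if (indexById == null) {
            Map<Long, Integer> index = new HashMap<>();

            for (int i = 0; i < resourceIds.size(); i++) {
                index.put(resourceIds.get(i), i);
            }

            indexById = index;
        }

        return indexById;
    }

    boolean isIndexBuilt() {
        return indexById != null;
    }

    // Serializes and deserializes the object in memory.
    static CachedTagsSketch roundTrip(CachedTagsSketch tags) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();

            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tags);
            }

            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CachedTagsSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CachedTagsSketch tags = new CachedTagsSketch();

        tags.resourceIds.add(7L);
        tags.indexById();

        CachedTagsSketch copy = roundTrip(tags);

        System.out.println("index built after round trip: " + copy.isIndexBuilt());
    }
}
```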



agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
Line 142 (original), 130 (patched)
<https://reviews.apache.org/r/74144/#comment313503>

    If any resources are deleted or updated, this code iterates over all service-resources. Will this be expensive, as before?


- Abhay Kulkarni


On Sept. 28, 2022, 10:58 p.m., Madhan Neethiraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74144/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2022, 10:58 p.m.)
> 
> 
> Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.
> 
> 
> Bugs: RANGER-3934
>     https://issues.apache.org/jira/browse/RANGER-3934
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> - replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
> - replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
> - avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
> - updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query
> 
> 
> Diffs
> -----
> 
>   agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
>   security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
>   security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
>   security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
>   security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
>   security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
>   security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
>   security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 
> 
> 
> Diff: https://reviews.apache.org/r/74144/diff/2/
> 
> 
> Testing
> -------
> 
> - with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
>   -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
>   -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
> - there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds
> 
> 
> Thanks,
> 
> Madhan Neethiraj
> 
>


Re: Review Request 74144: RANGER-3934: optimization in loading of tags into cache

Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/
-----------------------------------------------------------

(Updated Sept. 28, 2022, 10:58 p.m.)


Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, Subhrat Chaudhary, and Velmurugan Periasamy.


Changes
-------

addressed review comment to further optimize RangerServiceTagsDeltaUtil.applyDelta()


Bugs: RANGER-3934
    https://issues.apache.org/jira/browse/RANGER-3934


Repository: ranger


Description
-------

- replaced several retrievals of the XXService JPA object with retrieval of just the serviceId
- replaced several instances of Map<Long, Long> with Set<Long>, as these instances had the same value for key and value
- avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
- updated several methods in XXTagResourceMapDao to avoid loading the XXTagResourceMap JPA object; instead, the XXTagResourceMap object is created from individual fields retrieved by the query


Diffs (updated)
-----

  agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java 8d9241c1c 
  security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java d8154b7de 
  security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java afa754ba2 
  security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 3cc3d9cef 
  security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 3f8b5b718 
  security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java b0f71e138 
  security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
  security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 


Diff: https://reviews.apache.org/r/74144/diff/2/

Changes: https://reviews.apache.org/r/74144/diff/1-2/


Testing
-------

- with Ranger database having ~1m service-resources, ~2m tags and delta of 20k resource & 40k tags:
  -- before these optimizations, tag-cache update from delta didn't complete even after a long time (1h35m)
  -- with these optimizations, tag-cache update from the same delta completed within 10 minutes
- there is scope for further optimization of RangerServiceTagsDeltaUtil.applyDelta(), as this took almost 97% of the time spent updating the cache (588 seconds); compare this to TagDBStore.getServiceTagsDelta(), which completed in 13 seconds


Thanks,

Madhan Neethiraj