Posted to dev@atlas.apache.org by Sarath Subramanian <sa...@apache.org> on 2021/07/27 18:32:24 UTC

Re: Review Request 73430: ATLAS-4340: Set Solr wait-searcher property to false by default to make Solr commits async

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73430/
-----------------------------------------------------------

(Updated July 27, 2021, 11:32 a.m.)


Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, and Pinal Shah.


Bugs: ATLAS-4340
    https://issues.apache.org/jira/browse/ATLAS-4340


Repository: atlas


Description
-------

In Atlas, when a transaction is committed, the entries are committed to HBase (primary storage) and Solr (index storage). If the primary storage commit fails, the transaction is rolled back. If the secondary (Solr) commit fails, the transaction is not rolled back; the failure is logged as a warning, and a reindex is recommended to repair the missing index documents. This is because the primary storage is the source of truth and indexes can be rebuilt.
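
The commit behavior described above can be sketched roughly as follows. This is a minimal illustration only, not Atlas or JanusGraph code: `FlakyStore`, `commit_transaction`, and the returned status strings are hypothetical names introduced for this example.

```python
import logging

logger = logging.getLogger("commit_sketch")


class FlakyStore:
    """Hypothetical stand-in for a storage or index backend."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.committed = []

    def commit(self, entries):
        if self.fail:
            raise RuntimeError(f"{self.name} commit failed")
        self.committed.extend(entries)


def commit_transaction(entries, primary, index):
    """Commit to the primary store first, then to the index.

    Returns "rolled_back", "committed_index_stale", or "committed".
    """
    try:
        primary.commit(entries)
    except RuntimeError:
        # Primary storage is the source of truth: a failure here aborts the transaction.
        return "rolled_back"
    try:
        index.commit(entries)
    except RuntimeError as exc:
        # An index failure is only logged; the data is safe and a reindex can repair it.
        logger.warning("index commit failed, reindex may be needed: %s", exc)
        return "committed_index_stale"
    return "committed"
```

The point of the sketch is the asymmetry: only the primary commit can abort the transaction, which is why waiting on the index commit is not needed for correctness.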

JanusGraph has a Solr property, wait-searcher, that controls whether a mutation waits for the index to reflect it before returning. Atlas currently sets it to true, so every commit waits until the Solr commit completes. This hurts performance, so the default should be false, which makes Solr commits asynchronous.

Property: index.[X].solr.wait-searcher

When mutating - wait for the index to reflect new mutations before returning. This can have a negative impact on performance.
 

This Jira sets the default value of the above property to FALSE; it can be overridden if the need arises.
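
As a rough sketch of "default to FALSE unless overridden", the logic could look like the snippet below. The full key name is an assumption (JanusGraph's `index.[X].solr.wait-searcher` with an `atlas.graph.` prefix and `X` = `search`), and `apply_wait_searcher_default` is a hypothetical helper, not the code in this patch.

```python
# Assumed full key: JanusGraph's index.[X].solr.wait-searcher with an
# "atlas.graph." prefix and X = "search". Verify against your deployment.
SOLR_WAIT_SEARCHER_KEY = "atlas.graph.index.search.solr.wait-searcher"


def apply_wait_searcher_default(props):
    """Return a copy of props with wait-searcher defaulted to "false".

    An explicitly configured value is left untouched, so operators can
    still opt back in to synchronous index commits.
    """
    result = dict(props)
    result.setdefault(SOLR_WAIT_SEARCHER_KEY, "false")
    return result
```

With this shape, an empty configuration yields "false", while a configuration that already sets the key to "true" keeps its value.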


Diffs (updated)
-----

  addons/falcon-bridge/src/test/resources/atlas-application.properties 898b69c99 
  addons/hbase-bridge/src/test/resources/atlas-application.properties 898b69c99 
  addons/kafka-bridge/src/test/resources/atlas-application.properties 91fd8b092 
  authorization/src/test/resources/atlas-application.properties 2e02678a6 
  distro/pom.xml d84f5e7b1 
  distro/src/bin/atlas_config.py 493a34ad8 
  distro/src/bin/atlas_start.py 7cf35a92a 
  distro/src/test/python/scripts/TestMetadata.py 662fbddba 
  graphdb/janus/src/test/resources/atlas-application.properties a355234e9 
  intg/src/main/java/org/apache/atlas/ApplicationProperties.java bf97ab146 
  intg/src/test/resources/atlas-application.properties 50ce01e70 
  webapp/src/test/resources/atlas-application.properties 1d45e78f3 


Diff: https://reviews.apache.org/r/73430/diff/2/

Changes: https://reviews.apache.org/r/73430/diff/1-2/


Testing
-------

1. Precommit Test: https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/688/
2. Built Atlas with the embedded HBase/Solr profile and validated basic sanity tests: quick start, basic search, tag propagation
3. Performance Test details:


Run with default settings - Solr wait-searcher property enabled - true (without patch)
----------------------------------------------------------------------
Start Time         : Tue Jun 15 22:26:58 PDT 2021
End Time           : Fri Jun 18 02:32:34 PDT 2021
Messages Processed : 91,225
Time Taken         : 52 hours 5 mins
Rate               : ~ 29.2 messages/minute


Run with disabled Solr wait-searcher property (will improve solr commit time making it async) - with patch
-----------------------------------------------------------------------------------------------------------
Start Time         : Mon Jun 14 13:30:04 PDT 2021
End Time           : Tue Jun 15 17:23:56 PDT 2021
Messages Processed : 91,225
Time Taken         : 27 hours 54 mins
Rate               : ~ 54.5 messages/minute


With this change, elapsed time drops by about 46% (52 hours 5 mins to 27 hours 54 mins), and throughput rises from ~29.2 to ~54.5 messages/minute.
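
The figures from the two runs can be cross-checked with a few lines of arithmetic. The helper names below (`minutes`, `summarize`) are made up for this check; the inputs are the message count and durations reported above.

```python
def minutes(hours, mins):
    """Convert an hours/minutes duration to total minutes."""
    return hours * 60 + mins


def summarize(messages, before_min, after_min):
    """Compute per-minute rates and the relative change between two runs."""
    rate_before = messages / before_min
    rate_after = messages / after_min
    return {
        "rate_before": round(rate_before, 1),
        "rate_after": round(rate_after, 1),
        "time_reduction_pct": round(100 * (1 - after_min / before_min), 1),
        "throughput_gain_pct": round(100 * (rate_after / rate_before - 1), 1),
    }


result = summarize(91225, minutes(52, 5), minutes(27, 54))
print(result)
```

This gives rates of 29.2 and 54.5 messages/minute, a 46.4% reduction in elapsed time, and an 86.7% gain in throughput, so the size of the improvement depends on which of the two measures is quoted.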


Thanks,

Sarath Subramanian


Re: Review Request 73430: ATLAS-4340: Set Solr wait-searcher property to false by default to make Solr commits async

Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73430/#review223274
-----------------------------------------------------------




distro/src/bin/atlas_start.py
Line 113 (original), 113 (patched)
<https://reviews.apache.org/r/73430/#comment312352>

    nit: What is the reason for CRLF?



intg/src/main/java/org/apache/atlas/ApplicationProperties.java
Line 346 (original), 346 (patched)
<https://reviews.apache.org/r/73430/#comment312351>

    This comment is no longer relevant.


- Ashutosh Mestry


On July 27, 2021, 6:32 p.m., Sarath Subramanian wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73430/
> -----------------------------------------------------------
> 
> (Updated July 27, 2021, 6:32 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, and Pinal Shah.
> 
> 
> Bugs: ATLAS-4340
>     https://issues.apache.org/jira/browse/ATLAS-4340
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> In Atlas, when a transaction is committed, the entries are committed to HBase (primary storage) and Solr (index storage). If the primary storage commit fails, the transaction is rolled back. If the secondary (Solr) commit fails, the transaction is not rolled back; the failure is logged as a warning, and a reindex is recommended to repair the missing index documents. This is because the primary storage is the source of truth and indexes can be rebuilt.
> 
> JanusGraph has a Solr property, wait-searcher, that controls whether a mutation waits for the index to reflect it before returning. Atlas currently sets it to true, so every commit waits until the Solr commit completes. This hurts performance, so the default should be false, which makes Solr commits asynchronous.
> 
> Property: index.[X].solr.wait-searcher
> 
> When mutating - wait for the index to reflect new mutations before returning. This can have a negative impact on performance.
>  
> 
> This Jira sets the default value of the above property to FALSE; it can be overridden if the need arises.
> 
> 
> Diffs
> -----
> 
>   addons/falcon-bridge/src/test/resources/atlas-application.properties 898b69c99 
>   addons/hbase-bridge/src/test/resources/atlas-application.properties 898b69c99 
>   addons/kafka-bridge/src/test/resources/atlas-application.properties 91fd8b092 
>   authorization/src/test/resources/atlas-application.properties 2e02678a6 
>   distro/pom.xml d84f5e7b1 
>   distro/src/bin/atlas_config.py 493a34ad8 
>   distro/src/bin/atlas_start.py 7cf35a92a 
>   distro/src/test/python/scripts/TestMetadata.py 662fbddba 
>   graphdb/janus/src/test/resources/atlas-application.properties a355234e9 
>   intg/src/main/java/org/apache/atlas/ApplicationProperties.java bf97ab146 
>   intg/src/test/resources/atlas-application.properties 50ce01e70 
>   webapp/src/test/resources/atlas-application.properties 1d45e78f3 
> 
> 
> Diff: https://reviews.apache.org/r/73430/diff/2/
> 
> 
> Testing
> -------
> 
> 1. Precommit Test: https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/688/
> 2. Built Atlas with the embedded HBase/Solr profile and validated basic sanity tests: quick start, basic search, tag propagation
> 3. Performance Test details:
> 
> 
> Run with default settings - Solr wait-searcher property enabled - true (without patch)
> ----------------------------------------------------------------------
> Start Time         : Tue Jun 15 22:26:58 PDT 2021
> End Time           : Fri Jun 18 02:32:34 PDT 2021
> Messages Processed : 91,225
> Time Taken         : 52 hours 5 mins
> Rate               : ~ 29.2 messages/minute
> 
> 
> Run with disabled Solr wait-searcher property (will improve solr commit time making it async) - with patch
> -----------------------------------------------------------------------------------------------------------
> Start Time         : Mon Jun 14 13:30:04 PDT 2021
> End Time           : Tue Jun 15 17:23:56 PDT 2021
> Messages Processed : 91,225
> Time Taken         : 27 hours 54 mins
> Rate               : ~ 54.5 messages/minute
> 
> 
> With this change, elapsed time drops by about 46% (52 hours 5 mins to 27 hours 54 mins), and throughput rises from ~29.2 to ~54.5 messages/minute.
> 
> Precommit (updated): https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/772/console
> 
> 
> Thanks,
> 
> Sarath Subramanian
> 
>


Re: Review Request 73430: ATLAS-4340: Set Solr wait-searcher property to false by default to make Solr commits async

Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73430/#review223275
-----------------------------------------------------------


Ship it!





- Ashutosh Mestry

