Posted to dev@unomi.apache.org by "David Griffon (Jira)" <ji...@apache.org> on 2023/01/31 17:04:00 UTC

[jira] [Comment Edited] (UNOMI-724) Session/Event index rollover: Implement ElasticSearch rollover recommendation and system to manage current monthly indices

    [ https://issues.apache.org/jira/browse/UNOMI-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682636#comment-17682636 ] 

David Griffon edited comment on UNOMI-724 at 1/31/23 5:03 PM:
--------------------------------------------------------------

The POC has been cleaned up by:
- adding new configuration parameters for rollover. Note that {{max_primary_shard_size}} is not supported in our current client version {{7.4.2}}. Note also that the default setting is now 365 days (one year); this will be improved by the purge task.
- adding new configuration parameters to replace all monthly index parameters. Note that all new parameters are empty by default; when set, they replace the corresponding old parameter.
- deprecating and removing usage of the DateHint parameter.
- disabling the purge test with date.

A dedicated story, UNOMI-735, has been created to remove the deprecated code.

Testing part:
- Global non-regression test
- Test the rollover configuration. The rollover configuration can be queried using the following endpoint on Elasticsearch:
{code}

{code}
- Test that the configuration is resolved in the following order =>
{code}
org.apache.unomi.elasticsearch.rollover.nbShards=${env:UNOMI_ELASTICSEARCH_ROLLOVER_SHARDS}
org.apache.unomi.elasticsearch.rollover.nbReplicas=${env:UNOMI_ELASTICSEARCH_ROLLOVER_REPLICAS}
org.apache.unomi.elasticsearch.rollover.indexMappingTotalFieldsLimit=${env:UNOMI_ELASTICSEARCH_ROLLOVER_MAPPINGTOTALFIELDSLIMIT}
org.apache.unomi.elasticsearch.rollover.indexMaxDocValueFieldsSearch=${env:UNOMI_ELASTICSEARCH_ROLLOVER_MAXDOCVALUEFIELDSSEARCH}
{code}
Then
{code}
org.apache.unomi.elasticsearch.monthlyIndex.nbShards=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_SHARDS:-2}
org.apache.unomi.elasticsearch.monthlyIndex.nbReplicas=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_REPLICAS:-0}
org.apache.unomi.elasticsearch.monthlyIndex.indexMappingTotalFieldsLimit=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_MAPPINGTOTALFIELDSLIMIT:-1000}
org.apache.unomi.elasticsearch.monthlyIndex.indexMaxDocValueFieldsSearch=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_MAXDOCVALUEFIELDSSEARCH:-1000}
{code}
Meaning that if a value is set in the first list, it should be used to create the rollover indices; otherwise the value from the second list applies.
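
The precedence described above can be sketched as follows. This is only an illustration in Python, not Unomi's actual implementation: the function name {{resolve_index_setting}} and the dictionary of already-resolved property values are hypothetical, and an empty or missing rollover value is assumed to mean "not set".

```python
from typing import Mapping, Optional

# Property prefixes taken from the two lists above.
ROLLOVER_PREFIX = "org.apache.unomi.elasticsearch.rollover."
MONTHLY_PREFIX = "org.apache.unomi.elasticsearch.monthlyIndex."


def resolve_index_setting(name: str, props: Mapping[str, str]) -> Optional[str]:
    """Return the rollover value for `name` if it is set, else the monthlyIndex value.

    Hypothetical helper: an empty or missing value is treated as "not set".
    """
    rollover_value = props.get(ROLLOVER_PREFIX + name, "").strip()
    if rollover_value:
        return rollover_value
    monthly_value = props.get(MONTHLY_PREFIX + name, "").strip()
    return monthly_value or None


# Example: only nbShards is overridden by a rollover property.
props = {
    ROLLOVER_PREFIX + "nbShards": "3",
    ROLLOVER_PREFIX + "nbReplicas": "",
    MONTHLY_PREFIX + "nbShards": "2",
    MONTHLY_PREFIX + "nbReplicas": "0",
}
shards = resolve_index_setting("nbShards", props)      # rollover value wins
replicas = resolve_index_setting("nbReplicas", props)  # falls back to monthlyIndex
```

In this sketch a setting that is absent from both lists resolves to {{None}}, leaving the decision about a final default to the caller.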


> Session/Event index rollover: Implement ElasticSearch rollover recommendation and system to manage current monthly indices
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: UNOMI-724
>                 URL: https://issues.apache.org/jira/browse/UNOMI-724
>             Project: Apache Unomi
>          Issue Type: Improvement
>    Affects Versions: unomi-2.1.0
>            Reporter: Kevan Jahanshahi
>            Assignee: David Griffon
>            Priority: Major
>             Fix For: unomi-2.2.0
>
>
> A POC has been done to validate the usage of the Rollover ES API, which combines a lifecycle policy and aliases, to replace the current monthly indices rotation.
> The goal is to apply the current Elasticsearch recommendations on index/shard sizing, reducing the costs of monthly indices that are not sized correctly because the number of events/sessions per month is unpredictable.
> POC PR: [https://github.com/apache/unomi/pull/567]
> h2. Prerequisite before starting this ticket:
> Carefully read the ES documentation regarding:
>  * Rollover API: [https://www.elastic.co/guide/en/elasticsearch/reference/7.17/ilm-rollover.html] 
> h2. What remains to be done in the current ticket?
>  * Clean up the POC
>  ** clean up old functions left over from the previous implementation
>  ** deprecate old parameters that are no longer used, such as *dateHint*
>  ** global code review and cleanup of the new implementation (if necessary, contact me directly to discuss improvement points)
>  * Make the rollover configurable through the configuration file ({*}max_age, max_size, max_primary_shard_size, max_docs{*}). See the documentation on the rollover capabilities: [https://www.elastic.co/guide/en/elasticsearch/reference/7.17/ilm-rollover.html]
>  * Disable/quarantine the tests associated with the purge, since these tests will be broken as a result of this story. Create a ticket to re-enable the tests once the epic is complete and before releasing the next version of Unomi.
> Out of current ticket scope: 
>  - check performance for session loaded using query: https://issues.apache.org/jira/browse/UNOMI-725
>  - fixing the purge system: https://issues.apache.org/jira/browse/UNOMI-726
>  - fixing the merge system: https://issues.apache.org/jira/browse/UNOMI-727
>  - handle migration: https://issues.apache.org/jira/browse/UNOMI-728
> h2. Automated test
> Query Elasticsearch to verify the lifecycle policy once the configuration has been applied in Unomi (this way we do not test the Elasticsearch rollover mechanism itself, but ensure that the configuration is properly applied).
> h2. Manual Test Case: 
>  - Pick one of the configuration options (such as max_docs) and configure Unomi with a very low value
>  - Create anonymous sessions
>  - Verify that rotation happens once enough sessions have been created to exceed the configured threshold.
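
For the automated test described in the issue, one possible check is sketched below in Python. It assumes the lifecycle policy is fetched with Elasticsearch's {{GET _ilm/policy/<policy_name>}} API and that the rollover action sits in the {{hot}} phase. The policy name {{unomi-rollover-policy}}, the helper names, and the sample values are hypothetical and are not taken from the Unomi code base. The sketch works on an already-parsed JSON response, so it does not need a running cluster.

```python
from typing import Any, Dict, Tuple


def rollover_conditions(response: Dict[str, Any], policy_name: str) -> Dict[str, Any]:
    """Extract the rollover conditions from a parsed GET _ilm/policy response."""
    policy = response[policy_name]["policy"]
    hot_actions = policy["phases"]["hot"]["actions"]
    return dict(hot_actions.get("rollover", {}))


def mismatches(actual: Dict[str, Any], expected: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing condition to an (expected, actual) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }


# Hypothetical response body; the policy name and values are made up for illustration.
sample_response = {
    "unomi-rollover-policy": {
        "version": 1,
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {"rollover": {"max_age": "365d", "max_docs": 5}},
                }
            }
        },
    }
}

actual = rollover_conditions(sample_response, "unomi-rollover-policy")
diff = mismatches(actual, {"max_age": "365d", "max_docs": 5})
```

An empty {{diff}} would indicate that the configured conditions reached the policy; it says nothing about whether Elasticsearch actually rolls the index over, which matches the intent stated in the automated test section.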



--
This message was sent by Atlassian Jira
(v8.20.10#820010)