Posted to dev@unomi.apache.org by "David Griffon (Jira)" <ji...@apache.org> on 2023/01/31 17:04:00 UTC
[jira] [Comment Edited] (UNOMI-724) Session/Event index rollover: Implement ElasticSearch rollover recommendation and system to manage current monthly indices
[ https://issues.apache.org/jira/browse/UNOMI-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682636#comment-17682636 ]
David Griffon edited comment on UNOMI-724 at 1/31/23 5:03 PM:
--------------------------------------------------------------
The POC has been cleaned up by:
- adding new configuration parameters for rollover. Note that {{max_primary_shard_size}} is not supported by our current client version, {{7.4.2}}. The default setting is now 365 days (one year); this will be improved by the purge task.
- adding new configuration parameters to replace all monthly index parameters. Note that all new parameters are empty by default; when set, they replace the corresponding old parameter.
- deprecating the DateHint parameter and removing its usages.
- disabling the date-based purge test.
A dedicated story, UNOMI-735, has been created to remove the deprecated code.
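To illustrate the rollover parameters mentioned above, here is a minimal Python sketch of how such conditions could be assembled into an ILM policy body. It is not the Unomi implementation (which is Java). The function name {{build_rollover_policy}} is hypothetical, and mapping the 365-day default to {{max_age}} is an assumption. {{max_primary_shard_size}} is left out because the 7.4.2 client does not support it.

```python
import json
from typing import Optional


def build_rollover_policy(max_age: Optional[str] = "365d",
                          max_size: Optional[str] = None,
                          max_docs: Optional[int] = None) -> dict:
    """Build an ILM policy body whose hot phase rolls over on the given conditions.

    Hypothetical helper: the name and the defaults are assumptions made for
    illustration. Unset (None or empty) conditions are omitted from the body.
    """
    conditions = {"max_age": max_age, "max_size": max_size, "max_docs": max_docs}
    rollover = {key: value for key, value in conditions.items()
                if value not in (None, "")}
    if not rollover:
        # Assumption: a rollover action without any condition is rejected,
        # so fail early instead of sending an empty action.
        raise ValueError("at least one rollover condition is required")
    return {"policy": {"phases": {"hot": {"actions": {"rollover": rollover}}}}}


if __name__ == "__main__":
    # Example: a very low max_docs threshold, as suggested for manual testing.
    print(json.dumps(build_rollover_policy(max_docs=1000), indent=2))
```

A body like this is what would be sent when creating the lifecycle policy. Dropping unset conditions keeps the policy limited to the thresholds that were actually configured.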
Testing part:
- Global non-regression test
- Test the rollover configuration. The rollover configuration can be queried using the following endpoint on Elasticsearch:
{code}
{code}
- Test that the configuration is resolved in the following order:
{code}
org.apache.unomi.elasticsearch.rollover.nbShards=${env:UNOMI_ELASTICSEARCH_ROLLOVER_SHARDS}
org.apache.unomi.elasticsearch.rollover.nbReplicas=${env:UNOMI_ELASTICSEARCH_ROLLOVER_REPLICAS}
org.apache.unomi.elasticsearch.rollover.indexMappingTotalFieldsLimit=${env:UNOMI_ELASTICSEARCH_ROLLOVER_MAPPINGTOTALFIELDSLIMIT}
org.apache.unomi.elasticsearch.rollover.indexMaxDocValueFieldsSearch=${env:UNOMI_ELASTICSEARCH_ROLLOVER_MAXDOCVALUEFIELDSSEARCH}
{code}
Then
{code}
org.apache.unomi.elasticsearch.monthlyIndex.nbShards=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_SHARDS:-2}
org.apache.unomi.elasticsearch.monthlyIndex.nbReplicas=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_REPLICAS:-0}
org.apache.unomi.elasticsearch.monthlyIndex.indexMappingTotalFieldsLimit=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_MAPPINGTOTALFIELDSLIMIT:-1000}
org.apache.unomi.elasticsearch.monthlyIndex.indexMaxDocValueFieldsSearch=${env:UNOMI_ELASTICSEARCH_MONTHLYINDEX_MAXDOCVALUEFIELDSSEARCH:-1000}
{code}
This means that if a value is set in the first list, it should be used to create the rollover indices; otherwise the value from the second list applies.
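The resolution order described above can be sketched as follows. This is a minimal Python illustration of the assumed fallback logic, not the actual Unomi code. The function name {{resolve_rollover_setting}} is hypothetical, and treating an empty or whitespace-only rollover value as "not set" is an assumption.

```python
from typing import Mapping, Optional

PREFIX = "org.apache.unomi.elasticsearch"


def resolve_rollover_setting(config: Mapping[str, str], name: str) -> Optional[str]:
    """Return the value used to create rollover indices for the setting `name`.

    The rollover.* property wins when it is set; otherwise the legacy
    monthlyIndex.* property is used. Hypothetical helper for illustration.
    """
    rollover = (config.get(f"{PREFIX}.rollover.{name}") or "").strip()
    if rollover:
        return rollover
    monthly = (config.get(f"{PREFIX}.monthlyIndex.{name}") or "").strip()
    return monthly or None


if __name__ == "__main__":
    config = {
        f"{PREFIX}.rollover.nbShards": "5",      # new parameter set: it wins
        f"{PREFIX}.monthlyIndex.nbShards": "2",
        f"{PREFIX}.rollover.nbReplicas": "",     # new parameter empty: fall back
        f"{PREFIX}.monthlyIndex.nbReplicas": "0",
    }
    for name in ("nbShards", "nbReplicas"):
        print(name, "=", resolve_rollover_setting(config, name))
```

Values are kept as strings here because they come from a properties file; converting them to integers is left to the caller.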
> Session/Event index rollover: Implement ElasticSearch rollover recommendation and system to manage current monthly indices
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: UNOMI-724
> URL: https://issues.apache.org/jira/browse/UNOMI-724
> Project: Apache Unomi
> Issue Type: Improvement
> Affects Versions: unomi-2.1.0
> Reporter: Kevan Jahanshahi
> Assignee: David Griffon
> Priority: Major
> Fix For: unomi-2.2.0
>
>
> A POC has been done to validate the use of the ES Rollover API, which combines a lifecycle policy and aliases, to replace the current monthly index rotation.
> The goal is to apply the current Elasticsearch recommendations on index/shard sizing, in order to reduce the costs caused by monthly indices being incorrectly sized due to the unpredictable number of events/sessions per month.
> POC PR: [https://github.com/apache/unomi/pull/567]
> h2. Prerequisite before starting this ticket:
> Carefully read the ES documentation regarding:
> * Rollover API: [https://www.elastic.co/guide/en/elasticsearch/reference/7.17/ilm-rollover.html]
> h2. What remains to be done in the current ticket?
> * Cleanup the POC
> ** clean up the old functions from the previous implementation that are still in use
> ** deprecate old parameters that are no longer used, such as *dateHint*
> ** global code review and cleanup of the new implementation (if necessary, contact me directly to discuss points for improvement)
> * make the rollover configurable through the configuration file ({*}max_age, max_size, max_primary_shard_size, max_docs{*}; see the documentation on rollover capabilities: [https://www.elastic.co/guide/en/elasticsearch/reference/7.17/ilm-rollover.html])
> * Disable/quarantine the tests associated with the purge, since these tests will be broken as a result of this story. Create a ticket to re-enable the tests once the epic is complete and before releasing the next version of Unomi.
> Out of current ticket scope:
> - check performance for session loaded using query: https://issues.apache.org/jira/browse/UNOMI-725
> - fixing the purge system: https://issues.apache.org/jira/browse/UNOMI-726
> - fixing the merge system: https://issues.apache.org/jira/browse/UNOMI-727
> - handle migration: https://issues.apache.org/jira/browse/UNOMI-728
> h2. Automated test
> Query Elasticsearch to verify the lifecycle policy once the configuration has been applied in Unomi (this way we do not test the Elasticsearch rollover mechanism itself, but we ensure the configuration is properly applied).
> h2. Manual Test Case:
> - Pick one of the configuration parameters (such as max_docs) and configure Unomi with a very low value
> - Create anonymous sessions
> - Verify that rotation happens once the created sessions exceed the configured threshold.