Posted to server-dev@james.apache.org by bt...@apache.org on 2019/11/20 07:31:44 UTC

[james-project] 01/41: Changelog for recent ElasticSearch changes

This is an automated email from the ASF dual-hosted git repository.

btellier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/james-project.git

commit 016ef250d083960354d3137640a3692285dc2863
Author: Benoit Tellier <bt...@linagora.com>
AuthorDate: Fri Nov 1 17:59:07 2019 +0700

    Changelog for recent ElasticSearch changes
---
 CHANGELOG.md            | 3 +++
 upgrade-instructions.md | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e10d720..97d3e99 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,6 +8,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
 ### Changed
 - Multiple changes have been made to enhance ElasticSearch performance:
   - Use of routing keys to collocate documents per mailbox
+  - Under some configurations, HTML text was not extracted before document indexing
+  - Removed unnecessary fields from mailbox mapping
+  - Disabled dynamic mapping thanks to a change of the header structure
   - Read related [upgrade instructions](upgrade-instructions.md)
 
 ### Removed
diff --git a/upgrade-instructions.md b/upgrade-instructions.md
index 53e745c..0566952 100644
--- a/upgrade-instructions.md
+++ b/upgrade-instructions.md
@@ -40,6 +40,9 @@ SHA-1 0d72783ff4
 
 JIRAS:
  - https://issues.apache.org/jira/browse/JAMES-2917
+ - https://issues.apache.org/jira/browse/JAMES-2078
+ - https://issues.apache.org/jira/browse/JAMES-2079
+ - https://issues.apache.org/jira/browse/JAMES-2910
 
 Concerned product: Guice product relying on ElasticSearch
 
@@ -47,6 +50,10 @@ We significantly improved our usage of ElasticSearch. Underlying changes include
 
  - The use of routing to collocate emails of a same mailbox within a same shard. This enables search queries to avoid cluster
  level synchronisation, and thus enhance throughput, latencies and scalability.
+ - Disabling dynamic mapping. We now represent headers as nested objects.
+ - Removing unneeded fields from the mapping.
+ - No longer indexing raw HTML. This could happen under some configuration combinations, and caused the data stored in ElasticSearch
+ to be significantly larger than required.
 
 The downside of these changes is that a reindex is needed, implying a downtime on search:
  - Delete the indexes used by James

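As an illustration of the changes described in the patch above, here is a minimal Python sketch of how routing by mailbox and a strict mapping with nested headers could fit together. It is not taken from the James code base: the field names (`mailboxId`, `headers`, `name`, `value`), the document id scheme and the helper functions are all assumptions made for this example. Only the `"dynamic": "strict"` setting, the `nested` field type and the `routing` request parameter are standard ElasticSearch features. The likely reason nested headers permit a strict mapping is that header names become values of a fixed `name` field rather than new field names.

```python
import json


def build_message_mapping():
    """Return a hypothetical mapping body with dynamic mapping disabled."""
    return {
        # "strict" makes ElasticSearch reject documents with unmapped fields.
        "dynamic": "strict",
        "properties": {
            "mailboxId": {"type": "keyword"},
            # Headers stored as a list of {name, value} objects, so an
            # unknown header name never creates a new field in the mapping.
            "headers": {
                "type": "nested",
                "properties": {
                    "name": {"type": "keyword"},
                    "value": {"type": "text"},
                },
            },
        },
    }


def headers_as_nested(headers):
    """Convert a {header name: value} dict into a list of nested objects."""
    return [{"name": name.lower(), "value": value} for name, value in headers.items()]


def build_index_request(mailbox_id, message_id, headers):
    """Describe an index request routed by mailbox (hypothetical id scheme)."""
    return {
        "id": f"{mailbox_id}:{message_id}",
        # Same routing value => same shard for all messages of a mailbox.
        "routing": mailbox_id,
        "document": {
            "mailboxId": mailbox_id,
            "headers": headers_as_nested(headers),
        },
    }


if __name__ == "__main__":
    request = build_index_request("mailbox-1", "42", {"Subject": "Hello", "X-Custom": "abc"})
    print(json.dumps(build_message_mapping(), indent=2))
    print(json.dumps(request, indent=2))
```

With this shape, a search restricted to one mailbox can pass the same routing value and only touch the shard holding that mailbox, which is the benefit the upgrade instructions attribute to routing.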
