You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@james.apache.org by GitBox <gi...@apache.org> on 2022/05/10 05:43:47 UTC

[GitHub] [james-project] chibenwa opened a new pull request, #993: WIP Allow defining search overrides

chibenwa opened a new pull request, #993:
URL: https://github.com/apache/james-project/pull/993

    ## Motivation
    
    IMAP SEARCH command is used by some clients to resynchronize data and are often performed queries.
    
    The use of ElasticSearch pose several issues:
    
     - Long lasting queries: it's not uncommon to see some search hanged for 60+ seconds
     - Many results returned kicking in ES driver paging
     - ElasticSearch is not the primary data store and can get out of date, reindexed. Indexing is a background process. Acceptable for user search, not really for resynchronisation.
     
   The scope of this ticket is to allow execution of well chosen search requests against existing Cassandra datastructures. Optional and extensible.
    
    ## TODO
    
    - [ ] Unit tests for search overrides
    - [ ] Documentation
    - [ ] Unit test for search override loading
    - [ ] Gatling tests for performance results
    - [ ] JIRA ticket and ADR


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] chibenwa commented on pull request #993: JAMES-3769 Allow defining search overrides

Posted by GitBox <gi...@apache.org>.
chibenwa commented on PR #993:
URL: https://github.com/apache/james-project/pull/993#issuecomment-1132705471

   > What is this empty file server/container/guice/elasticsearch/src/test/resources/recursive/.keep for btw?
   
   Adding the folder into the git structure which would otherwise not be kept.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] quantranhong1999 commented on a diff in pull request #993: JAMES-3769 Allow defining search overrides

Posted by GitBox <gi...@apache.org>.
quantranhong1999 commented on code in PR #993:
URL: https://github.com/apache/james-project/pull/993#discussion_r877820910


##########
server/container/guice/elasticsearch/src/main/java/org/apache/james/modules/mailbox/ElasticSearchMailboxModule.java:
##########
@@ -111,6 +118,16 @@ protected void configure() {
             .to(ElasticSearchStartUpCheck.class);
     }
 
+    @Provides
+    Set<SearchOverride> provideSearchOverrides(GuiceGenericLoader loader, ElasticSearchConfiguration configuration) {
+        return configuration.getSearchOverrides()
+            .stream()
+            .map(ClassName::new)
+            .map(Throwing.function(loader.<SearchOverride>withNamingSheme(NamingScheme.IDENTITY)::instantiate))
+            .peek(routes -> LOGGER.info("Loading WebAdmin route extension {}", routes.getClass().getCanonicalName()))

Review Comment:
   Seem unrelated.



##########
server/apps/cassandra-app/sample-configuration/elasticsearch.properties:
##########
@@ -82,3 +82,24 @@ elasticsearch.http.port=9200
 elasticsearch.metrics.reports.enabled=true
 elasticsearch.metrics.reports.period=30
 elasticsearch.metrics.reports.index=james-metrics
+
+# Search overrides allow resolution of predefined search queries against alternative sources of data
+# and allow bypassing ElasticSearch. This is useful to handle most resynchronisation queries that
+# are simple enough to be resolved against Cassandra.
+#
+# Possible values are:
+#  - `org.apache.james.mailbox.cassandra.search.AllSearchOverride` Some IMAP clients uses SEARCH ALL to fully list messages in
+# a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from am IMAP perspective

Review Comment:
   ```suggestion
   # a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from an IMAP perspective
   ```



##########
server/apps/distributed-app/docs/modules/ROOT/pages/configure/elasticsearch.adoc:
##########
@@ -208,3 +208,31 @@ You can configure to use which HostNameVerifier in the client.
 accept_any_hostname: accept any host (not recommended).
 
 |===
+
+== Search overrides
+
+*Search overrides* allow resolution of predefined search queries against alternative sources of data
+and allow bypassing ElasticSearch. This is useful to handle most resynchronisation queries that
+are simple enough to be resolved against Cassandra.
+
+Possible values are:
+  - `org.apache.james.mailbox.cassandra.search.AllSearchOverride` Some IMAP clients uses SEARCH ALL to fully list messages in
+ a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from am IMAP perspective

Review Comment:
   ```suggestion
    a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from an IMAP perspective
   ```



##########
server/apps/distributed-app/sample-configuration/elasticsearch.properties:
##########
@@ -81,3 +81,24 @@ elasticsearch.http.port=9200
 elasticsearch.metrics.reports.enabled=true
 elasticsearch.metrics.reports.period=30
 elasticsearch.metrics.reports.index=james-metrics
+
+# Search overrides allow resolution of predefined search queries against alternative sources of data
+# and allow bypassing ElasticSearch. This is useful to handle most resynchronisation queries that
+# are simple enough to be resolved against Cassandra.
+#
+# Possible values are:
+#  - `org.apache.james.mailbox.cassandra.search.AllSearchOverride` Some IMAP clients uses SEARCH ALL to fully list messages in
+# a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from am IMAP perspective

Review Comment:
   ```suggestion
   # a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from an IMAP perspective
   ```



##########
src/adr/0054-elasticsearch-search-overrides.md:
##########
@@ -0,0 +1,76 @@
+# 54. ElasticSearch search overrides
+
+Date: 2022-05-16
+
+## Status
+
+Accepted (lazy consensus).
+
+Not yet implemented.
+
+## Context
+
+IMAP SEARCH requests can of course be used for user searches. But many applications (eg Samsung email client) also uses
+IMAP SEARCH for resynchronization.
+
+The use of ElasticSearch in the context of resynchronization is a problem as:
+
+ - Resynchronization incurs many requests that ElasticSearch is not made to cope with.
+ - This have a performance impact and some 60+ seconds can be spotted.
+ - ElasticSearch indexing is asynchronous wich is not ideal for resynchronization

Review Comment:
   ```suggestion
    - ElasticSearch indexing is asynchronous which is not ideal for resynchronization
   ```



##########
src/site/xdoc/server/config-elasticsearch.xml:
##########
@@ -289,6 +289,35 @@
             </dd>
         </dl>
     </section>
+    <section name="Search overrides">
+
+        <p><b>Search overrides</b> allow resolution of predefined search queries against alternative sources of data
+            and allow bypassing ElasticSearch. This is useful to handle most resynchronisation queries that
+            are simple enough to be resolved against Cassandra.</p>
+
+        <ul>Possible values are:<br/>
+            <li><pre>org.apache.james.mailbox.cassandra.search.AllSearchOverride</pre> Some IMAP clients uses SEARCH ALL to fully list messages in
+                a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from am IMAP perspective

Review Comment:
   ```suggestion
                   a mailbox and detect deletions. This is typically done by clients not supporting QRESYNC and from an IMAP perspective
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] chibenwa commented on pull request #993: JAMES-3769 Allow defining search overrides

Posted by GitBox <gi...@apache.org>.
chibenwa commented on PR #993:
URL: https://github.com/apache/james-project/pull/993#issuecomment-1131286095

   ![Screenshot from 2022-05-19 13-31-01](https://user-images.githubusercontent.com/6928740/169225520-72ddcd36-152c-45ba-a03e-c1c056457103.png)
   
   Waaaay less ElasticSearch business going on without introducing too much new Cassandra business.
   
   I observed a 10% resource consumption on a 18.000 users gatling run.
   
   No performance degradation in the gatling runs I saw so far.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] Arsnael commented on a diff in pull request #993: JAMES-3769 Allow defining search overrides

Posted by GitBox <gi...@apache.org>.
Arsnael commented on code in PR #993:
URL: https://github.com/apache/james-project/pull/993#discussion_r877892648


##########
src/adr/0054-elasticsearch-search-overrides.md:
##########
@@ -0,0 +1,76 @@
+# 54. ElasticSearch search overrides
+
+Date: 2022-05-16
+
+## Status
+
+Accepted (lazy consensus).
+
+Not yet implemented.

Review Comment:
   Isn't that PR implementing it? :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] chibenwa merged pull request #993: JAMES-3769 Allow defining search overrides

Posted by GitBox <gi...@apache.org>.
chibenwa merged PR #993:
URL: https://github.com/apache/james-project/pull/993


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org