Posted to notifications@james.apache.org by "chibenwa (via GitHub)" <gi...@apache.org> on 2023/04/19 05:09:14 UTC

[GitHub] [james-project] chibenwa commented on pull request #1530: JAMES-3777 Incremental events for Filters

chibenwa commented on PR #1530:
URL: https://github.com/apache/james-project/pull/1530#issuecomment-1514138889

   ## Benchmarks
   
   ### The bench
   
   Create 1,000 rules sequentially, one by one:
   
   ```
       @Test
       void test(EventStore eventStore) {
           FilteringManagement testee = instantiateFilteringManagement(eventStore);
   
           ImmutableList.Builder<Rule> rules = ImmutableList.builder();
   
           final Stopwatch gstopwatch = Stopwatch.createStarted();
           for (int i = 0; i < 1000; i++) {
               final Stopwatch stopwatch = Stopwatch.createStarted();
               rules.add(Rule.builder()
                   .id(Rule.Id.of("id-subject" + i))
                   .name("name")
                   .action(Rule.Action.of(Rule.Action.AppendInMailboxes.withMailboxIds("mbx1")))
                   .condition(Rule.Condition.of(
                       Rule.Condition.Field.SUBJECT,
                       Rule.Condition.Comparator.NOT_CONTAINS,
                       "A value to match 2"))
                   .build());
   
               // Each step resubmits the whole list built so far (i + 1 rules), not just the new rule.
               Mono.from(testee.defineRulesForUser(USERNAME, rules.build(), Optional.empty())).block();
   
               System.out.println("Step " + i + ": " + stopwatch.elapsed(TimeUnit.MILLISECONDS));
           }
           System.out.println("Total: " + gstopwatch.elapsed(TimeUnit.MILLISECONDS));
       }
   ```
   
   ### Results: before
   
   Failed at step 964 after 36 minutes (!!!) -> Cassandra timeout (2s)
   
   Step 963 took 14.7s (for storing a single rule!)
   
   Cassandra storage is very well compressed (there is a lot of redundancy in the data), yet the aggregate is still big: 7 MB on disk, 210 MB uncompressed.
   
   As such, long histories are clearly not viable.
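   
   A rough sanity check on that growth, assuming (this is not stated in the results, only suggested by the PR title) that before the change every event stored the complete rule list. The class and method names below are hypothetical and are not James APIs.
   
   ```java
   public class RuleCopiesEstimate {
       // Assumption: with full-list events, event k carries k rules,
       // so n events hold 1 + 2 + ... + n = n(n+1)/2 rule copies.
       public static long copiesWithFullListEvents(long n) {
           return n * (n + 1) / 2;
       }
   
       // Assumption: with incremental events, each event carries only the added rule.
       public static long copiesWithIncrementalEvents(long n) {
           return n;
       }
   
       public static void main(String[] args) {
           long n = 1000;
           System.out.println("Full-list events:   " + copiesWithFullListEvents(n) + " rule copies");
           System.out.println("Incremental events: " + copiesWithIncrementalEvents(n) + " rule copies");
       }
   }
   ```
   
   If the assumption holds, 1,000 steps store 500,500 rule copies, and 210 MB spread over them is roughly 420 bytes per copy, which is a plausible size for one serialized rule.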
   
   ### Results: after
   
   Storing the 1,000 rules sequentially took roughly 1 minute, with the last step taking only `95 ms`.
   
   Cassandra storage is still well compressed (about half the previous compression ratio), and the aggregate is far smaller: 26 KB on disk, 400 KB uncompressed.
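   
   The compression ratios implied by the reported sizes can be checked with simple arithmetic. This is only a sketch over the figures quoted in this comment; the class name is hypothetical.
   
   ```java
   import java.util.Locale;
   
   public class CompressionRatio {
       // Ratio of uncompressed size to on-disk size (both in the same unit).
       public static double ratio(double uncompressed, double onDisk) {
           return uncompressed / onDisk;
       }
   
       public static void main(String[] args) {
           double before = ratio(210.0, 7.0);  // MB, figures from "Results: before"
           double after = ratio(400.0, 26.0);  // KB, figures from "Results: after"
           // Locale.ROOT keeps the decimal separator stable across machines.
           System.out.println(String.format(Locale.ROOT, "before: %.1fx, after: %.1fx", before, after));
       }
   }
   ```
   
   That gives about 30x before and about 15x after: less redundancy to compress away, but far less data overall.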
   
   Even after running 5,000+ updates, the latest step is done in ~500 ms.
   
   Long histories for filters are thus more viable.
   
   Further improvements in long histories can be unlocked via complementary approaches like **event sourcing snapshotting**.
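   
   As a rough illustration of the snapshotting idea (not the James event sourcing API; every name below is hypothetical), a snapshot records the aggregate state at a given version so that loading only replays the events recorded after it:
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   public class SnapshotSketch {
       // Hypothetical snapshot: the rule ids known after `version` events were applied.
       public static final class Snapshot {
           final int version;
           final List<String> ruleIds;
   
           public Snapshot(int version, List<String> ruleIds) {
               this.version = version;
               this.ruleIds = ruleIds;
           }
       }
   
       // Number of events that still need replaying on top of the snapshot.
       public static int eventsToReplay(Snapshot snapshot, List<String> allEvents) {
           return allEvents.size() - snapshot.version;
       }
   
       // Rebuild the state from the snapshot plus the events recorded after it.
       // Each event is modelled here as the id of one added rule.
       public static List<String> load(Snapshot snapshot, List<String> allEvents) {
           List<String> state = new ArrayList<>(snapshot.ruleIds);
           for (String addedRuleId : allEvents.subList(snapshot.version, allEvents.size())) {
               state.add(addedRuleId);
           }
           return state;
       }
   }
   ```
   
   With a snapshot taken every N events, a load replays at most N events regardless of how long the history is.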


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

