Posted to dev@unomi.apache.org by "Kevan Jahanshahi (Jira)" <ji...@apache.org> on 2023/05/12 08:01:00 UTC

[jira] [Assigned] (UNOMI-430) Make Unomi batchProfilesUpdate use ES scroll query

     [ https://issues.apache.org/jira/browse/UNOMI-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevan Jahanshahi reassigned UNOMI-430:
--------------------------------------

    Assignee: Kevan Jahanshahi

> Make Unomi batchProfilesUpdate use ES scroll query
> --------------------------------------------------
>
>                 Key: UNOMI-430
>                 URL: https://issues.apache.org/jira/browse/UNOMI-430
>             Project: Apache Unomi
>          Issue Type: Improvement
>            Reporter: romain.gauthier
>            Assignee: Kevan Jahanshahi
>            Priority: Major
>             Fix For: unomi-2.3.0, unomi-1.9.0
>
>
> *As a developer* 
> *I want to ensure good performance when calling batchProfilesUpdate, described here:* https://unomi.incubator.apache.org/rest-api-doc/#-244007327
> h3. Acceptance criteria
> When I call batchProfilesUpdate
> Then an Elasticsearch scroll query should be used to ensure good performance
> When I call batchProfilesUpdate 
> Then I should be able to configure the window size (1000) and the duration of the scroll validity
> h3. Designer notes
>  
>  
> h3. Developer notes
> This method
> {code:java}
>     public void batchProfilesUpdate(BatchUpdate update) {
>         ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>         List<Profile> profiles = persistenceService.query(update.getCondition(), null, Profile.class);
>         for (Profile profile : profiles) {
>             if (PropertyHelper.setProperty(profile, update.getPropertyName(), update.getPropertyValue(), update.getStrategy())) {
>                 save(profile);
>             }
>         }
>     }
> {code}
> should be updated to something like:
> {code:java}
>     public void batchProfilesUpdate(BatchUpdate update) {
>         ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>         // Fetch the first window of 1000 profiles, with a scroll validity of 10 minutes.
>         PartialList<Profile> profiles = persistenceService.query(update.getCondition(), null, Profile.class, 0, 1000, "10m");
>         while (profiles != null && !profiles.getList().isEmpty()) {
>             for (Profile profile : profiles.getList()) {
>                 if (PropertyHelper.setProperty(profile, update.getPropertyName(), update.getPropertyValue(), update.getStrategy())) {
>                     save(profile);
>                 }
>             }
>             // Fetch the next window; the loop ends once the scroll returns nothing more.
>             profiles = persistenceService.continueScrollQuery(Profile.class, profiles.getScrollIdentifier(), profiles.getScrollTimeValidity());
>         }
>     }
> {code}
> The change is needed because the existing version of this method loads every profile matching the condition into memory at once, which can be a serious problem when the condition matches a large number of profiles. For example, if we request all the profiles in a set of 20 million, all of them will be loaded into memory. With scroll queries, only the current "window" of profiles is held in memory at any time.
> Integration tests to validate this change should also be added.
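
To illustrate the memory argument above, here is a minimal, self-contained sketch of window-by-window processing. It does not use the Unomi or Elasticsearch APIs: ScrollBatchSketch, FakeStore, Page and processAll are hypothetical names invented for this example, the store is an in-memory list, and an integer offset stands in for a scroll identifier. The scroll validity duration is not modelled. The window size is passed as a parameter, as the acceptance criteria suggest it should be configurable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ScrollBatchSketch {

    /** One window of results plus a cursor (hypothetical stand-in for a scroll identifier). */
    static final class Page<T> {
        final List<T> items;
        final int nextOffset;

        Page(List<T> items, int nextOffset) {
            this.items = items;
            this.nextOffset = nextOffset;
        }
    }

    /** In-memory stand-in for the persistence layer; it records the largest window it returned. */
    static final class FakeStore {
        private final List<Integer> data = new ArrayList<>();
        int maxWindowSeen = 0;

        FakeStore(int total) {
            for (int i = 0; i < total; i++) {
                data.add(i);
            }
        }

        Page<Integer> fetch(int offset, int windowSize) {
            int start = Math.min(offset, data.size());
            int end = Math.min(start + windowSize, data.size());
            List<Integer> window = new ArrayList<>(data.subList(start, end));
            maxWindowSeen = Math.max(maxWindowSeen, window.size());
            return new Page<>(window, end);
        }
    }

    /** Applies the action to every item, holding at most one window in memory at a time. */
    static int processAll(FakeStore store, int windowSize, Consumer<Integer> action) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        int processed = 0;
        Page<Integer> page = store.fetch(0, windowSize);
        while (!page.items.isEmpty()) {
            for (Integer item : page.items) {
                action.accept(item);
                processed++;
            }
            // Ask for the next window; an empty window ends the loop.
            page = store.fetch(page.nextOffset, windowSize);
        }
        return processed;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore(2500);
        int processed = processAll(store, 1000, item -> { });
        System.out.println(processed + " items processed, largest window held: " + store.maxWindowSeen);
    }
}
```

With 2500 items and a window of 1000, the loop fetches windows of 1000, 1000 and 500 items and then an empty one, so no more than 1000 items are ever held at once. An integration test for the real change could check the same two properties: every matching profile is updated, and no single fetch exceeds the configured window size.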



--
This message was sent by Atlassian Jira
(v8.20.10#820010)