Posted to dev@unomi.apache.org by "Kevan Jahanshahi (Jira)" <ji...@apache.org> on 2023/05/12 08:01:00 UTC
[jira] [Assigned] (UNOMI-430) Make Unomi batchProfilesUpdate use ES scroll query
[ https://issues.apache.org/jira/browse/UNOMI-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevan Jahanshahi reassigned UNOMI-430:
--------------------------------------
Assignee: Kevan Jahanshahi
> Make Unomi batchProfilesUpdate use ES scroll query
> --------------------------------------------------
>
> Key: UNOMI-430
> URL: https://issues.apache.org/jira/browse/UNOMI-430
> Project: Apache Unomi
> Issue Type: Improvement
> Reporter: romain.gauthier
> Assignee: Kevan Jahanshahi
> Priority: Major
> Fix For: unomi-2.3.0, unomi-1.9.0
>
>
> *As a developer*
> *I want to ensure good performance when calling batchProfilesUpdate, described here:* https://unomi.incubator.apache.org/rest-api-doc/#-244007327
> h3. Acceptance criteria
> When I call batchProfilesUpdate
> Then an Elasticsearch scroll query should be used to ensure good performance
> When I call batchProfilesUpdate
> Then I should be able to configure the window size (1000) and the duration of the scroll validity
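One way the two configurable values above could be carried is sketched below. This is only an illustration: the class `ScrollSettings` and both property keys are hypothetical and are not existing Unomi configuration names. The defaults (1000 and "10m") are taken from the values that appear in this issue.

```java
import java.util.Properties;

// Hypothetical holder for the window size and scroll validity named in the
// acceptance criteria. Neither the class nor the keys exist in Unomi.
final class ScrollSettings {
    static final String WINDOW_SIZE_KEY = "batchUpdate.scrollWindowSize";     // hypothetical key
    static final String TIME_VALIDITY_KEY = "batchUpdate.scrollTimeValidity"; // hypothetical key

    final int windowSize;      // number of profiles fetched per scroll window
    final String timeValidity; // how long the scroll stays valid, e.g. "10m"

    ScrollSettings(int windowSize, String timeValidity) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive: " + windowSize);
        }
        if (timeValidity == null || timeValidity.isEmpty()) {
            throw new IllegalArgumentException("timeValidity must not be empty");
        }
        this.windowSize = windowSize;
        this.timeValidity = timeValidity;
    }

    // Reads both settings, falling back to the values quoted in this issue.
    static ScrollSettings from(Properties props) {
        int size = Integer.parseInt(props.getProperty(WINDOW_SIZE_KEY, "1000").trim());
        String validity = props.getProperty(TIME_VALIDITY_KEY, "10m").trim();
        return new ScrollSettings(size, validity);
    }
}
```

With something like this, the hard-coded `1000` and `"10m"` in the proposed method could be replaced by `settings.windowSize` and `settings.timeValidity`.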
> h3. Designer notes
>
>
> h3. Developer notes
> This method
> {code:java}
> public void batchProfilesUpdate(BatchUpdate update) {
>     ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>     List<Profile> profiles = persistenceService.query(update.getCondition(), null, Profile.class);
>     for (Profile profile : profiles) {
>         if (PropertyHelper.setProperty(profile, update.getPropertyName(), update.getPropertyValue(), update.getStrategy())) {
>             save(profile);
>         }
>     }
> }
> {code}
> should be updated to something like:
> {code:java}
> public void batchProfilesUpdate(BatchUpdate update) {
>     ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>     // Open a scroll: fetch the first window of 1000 profiles, valid for 10 minutes.
>     PartialList<Profile> profiles = persistenceService.query(update.getCondition(), null, Profile.class, 0, 1000, "10m");
>     // Stop as soon as the scroll returns nothing (null or an empty window).
>     while (profiles != null && !profiles.getList().isEmpty()) {
>         for (Profile profile : profiles.getList()) {
>             if (PropertyHelper.setProperty(profile, update.getPropertyName(), update.getPropertyValue(), update.getStrategy())) {
>                 save(profile);
>             }
>         }
>         // Fetch the next window; only one window is held in memory at a time.
>         profiles = persistenceService.continueScrollQuery(Profile.class, profiles.getScrollIdentifier(), profiles.getScrollTimeValidity());
>     }
> }
> {code}
> The change is needed because the existing version of this method loads every profile matching the condition into memory at once, which can be a serious problem when the condition matches a large number of profiles. For example, a request matching all the profiles in a set of 20 million loads all 20 million into memory. With scroll queries, only the current "window" of profiles is held in memory at a time.
> Integration tests validating this change should also be added.
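The memory argument above can be checked with a small self-contained simulation, which may also suggest a shape for the integration test. This is a sketch under stated assumptions: `FakeScrollSource` is a hypothetical stand-in for the persistence layer (it is not Unomi code and does not call Elasticsearch), and profiles are reduced to integer ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Simulation of a scroll loop. Nothing here is part of Unomi.
final class ScrollSketch {

    // Hypothetical stand-in for a scroll query: returns at most windowSize
    // ids per call, and an empty list once every id has been handed out.
    static final class FakeScrollSource {
        private final int total;
        private final int windowSize;
        private int cursor = 0;
        int largestWindow = 0; // largest number of items held at once

        FakeScrollSource(int total, int windowSize) {
            this.total = total;
            this.windowSize = windowSize;
        }

        List<Integer> nextWindow() {
            int end = Math.min(cursor + windowSize, total);
            List<Integer> window = new ArrayList<>(end - cursor);
            for (int i = cursor; i < end; i++) {
                window.add(i);
            }
            cursor = end;
            largestWindow = Math.max(largestWindow, window.size());
            return window;
        }
    }

    // Walks the source window by window and counts the items for which the
    // (simulated) property update reports a change, i.e. the ones "saved".
    static int updateAll(FakeScrollSource source, IntPredicate changed) {
        int saved = 0;
        List<Integer> window = source.nextWindow();
        while (!window.isEmpty()) {
            for (int id : window) {
                if (changed.test(id)) {
                    saved++;
                }
            }
            window = source.nextWindow();
        }
        return saved;
    }
}
```

For instance, running `updateAll` over a source of 2500 ids with a window of 1000 visits every id, while `largestWindow` never exceeds 1000, which is the property the scroll-based version is meant to guarantee.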
--
This message was sent by Atlassian Jira
(v8.20.10#820010)