Posted to dev@geode.apache.org by "Christian Tzolov (JIRA)" <ji...@apache.org> on 2017/03/03 14:49:45 UTC

[jira] [Comment Edited] (GEODE-2588) OQL's ORDER BY takes 13x (1300%) more time compared to plain java sort for the same amount of data and same resources

    [ https://issues.apache.org/jira/browse/GEODE-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893786#comment-15893786 ] 

Christian Tzolov edited comment on GEODE-2588 at 3/3/17 2:48 PM:
-----------------------------------------------------------------

Demo project [^gemfire-oql-orderby-vs-on-client-sort-test-cases.zip] to illustrate the issue.
1. Run with ORDER BY:
{code}
java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmx4G -Dgemfire.jmx-manager-port=1199 -Dgemfire.jmx-manager=true -Dgemfire.jmx-manager-start=true  -Dgemfire.locators=localhost[10334] -Dgemfire.start-locator=localhost[10334] -Dgemfire.enable-cluster-configuration=false -Dgemfire.statistic-sampling-enabled=true -Dgemfire.statistic-archive-file=myStats.gfs -Dgemfire.enable-time-statistics=true -Dgemfire.jmx-manager-update-rate=2000 -jar ./target/gemfire-tests-0.0.1-SNAPSHOT.jar
{code}
2. Run with client-side sort:
{code}
java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmx4G -Dgemfire.jmx-manager-port=1199 -Dgemfire.jmx-manager=true -Dgemfire.jmx-manager-start=true  -Dgemfire.locators=localhost[10334] -Dgemfire.start-locator=localhost[10334] -Dgemfire.enable-cluster-configuration=false -Dgemfire.statistic-sampling-enabled=true -Dgemfire.statistic-archive-file=myStats.gfs -Dgemfire.enable-time-statistics=true -Dgemfire.jmx-manager-update-rate=2000 -jar ./target/gemfire-tests-0.0.1-SNAPSHOT.jar on-client-sort
{code}



> OQL's ORDER BY takes 13x (1300%) more time compared to plain java sort for the same amount of data and same resources
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-2588
>                 URL: https://issues.apache.org/jira/browse/GEODE-2588
>             Project: Geode
>          Issue Type: Bug
>          Components: querying
>            Reporter: Christian Tzolov
>         Attachments: flight_recording_OQL_ORDER_BY.jfr, gemfire_OQL_ORDER_BY.log, gemfire-oql-orderby-vs-on-client-sort-test-cases.zip, myStats_OQL_ORDER_BY.gfs, oql_with_order_by_hot_methods.png
>
>
> For a Partition Region with 1 500 000 entries running on a single Geode member, the OQL query *SELECT DISTINCT a, b FROM /region ORDER BY b* takes about *13x* (*1300%*) the time of the OQL query *SELECT a, b FROM /region* plus a manual Java sort of the result for the same dataset.
> Setup: Geode 1.0.0 with Partition region with 1 500 000 objects, 4GB memory
> 1. OQL with DISTINCT/ORDER BY
> {code}SELECT DISTINCT e.key,e.day FROM /partitionRegion e ORDER BY e.day{code}
> OQL execution time: 64899 ms = *~65 sec*
> 2. OQL with manual sort
> {code}SELECT e.key,e.day FROM /partitionRegion e{code}
> and then
> {code}
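> // Presumably relies on: import static java.util.stream.Collectors.toList;
> // OQL query without DISTINCT/ORDER BY -> 3058 ms
> SelectResults result = (SelectResults) query.execute(bindings);
> // Client-side sort by "day" -> 1830 ms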
> List<?> result2 = (List<?>) result.asList().parallelStream().sorted((o1, o2) -> {
>     Struct st1 = (Struct) o1;
>     Struct st2 = (Struct) o2;
>     return ((Date) st1.get("day")).compareTo((Date) st2.get("day"));
> }).collect(toList());
> {code}
> OQL execution time: 3058 ms,
> Client-side sort time: 1830 ms
> Total time: 4888 ms = *~5 sec*
> The attached [^gemfire-oql-orderby-vs-on-client-sort-test-cases.zip] demonstrates the problem (see the comments below).
> Also attached are the JMC profiler recording [^flight_recording_OQL_ORDER_BY.jfr], the logs, and the VSD stats.
> The profiler suggests that most of the CPU time goes to the *OrderByComparator#evaluateSortCriteria* method:    !oql_with_order_by_hot_methods.png!
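
For anyone who wants to get a feel for the client-side sort step without a running Geode member, the following is a minimal, standalone sketch. It is not the code from the attached demo project: the `Row` class is a hypothetical stand-in for the `Struct` rows returned by the query, the data is randomly generated, and the class and method names are made up for illustration. Timings from it will depend on the machine and are not comparable to the numbers reported above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class ClientSideSortSketch {

    // Hypothetical stand-in for one (key, day) row of the OQL result.
    static final class Row {
        final int key;
        final Date day;

        Row(int key, Date day) {
            this.key = key;
            this.day = day;
        }
    }

    // Builds n rows with pseudo-random days; the fixed seed makes runs repeatable.
    static List<Row> generateRows(int n, long seed) {
        Random random = new Random(seed);
        long dayMillis = 24L * 60 * 60 * 1000;
        List<Row> rows = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            rows.add(new Row(i, new Date(random.nextInt(3650) * dayMillis)));
        }
        return rows;
    }

    // Returns a new list sorted by day; the input list is left unchanged.
    static List<Row> sortByDay(List<Row> rows) {
        return rows.parallelStream()
                .sorted(Comparator.comparing((Row r) -> r.day))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same row count as in the report, but synthetic data.
        List<Row> rows = generateRows(1_500_000, 42L);
        long start = System.nanoTime();
        List<Row> sorted = sortByDay(rows);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Sorted " + sorted.size() + " rows in " + elapsedMs + " ms");
    }
}
```

The sketch uses `Comparator.comparing` on the date field instead of the casting lambda from the description; both order rows by `day`, and the choice here is only for readability.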



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)