You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by HanumathRao <gi...@git.apache.org> on 2018/02/09 23:52:31 UTC

[GitHub] drill pull request #1120: DILL-6148: TestSortSpillWithException is sometimes...

GitHub user HanumathRao opened a pull request:

    https://github.com/apache/drill/pull/1120

    DILL-6148: TestSortSpillWithException is sometimes failing.

    I have changed the test case to use the EXTERNAL_SORT_MAX_MEMORY to configure the sort operator to use less memory instead of relying on MAX_QUERY_MEMORY_PER_NODE_KEY.
    
    This way the test case passes all the time (it is failing intermittently in my branch).
    
    @paul-rogers  can you please review this change.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HanumathRao/drill DRILL-6148

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1120.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1120
    
----
commit 1d7798848c114c3ee1e50aef8a62ac37e4f5e596
Author: Hanumath Rao Maduri <ha...@...>
Date:   2018-02-09T22:13:04Z

    DILL-6148: TestSortSpillWithException is sometimes failing.

----


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167768794
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    @paul-rogers  I am sorry for not being clear in my previous reply. This query has only one sort and also one minor fragment. If we set 60MB as MAX_QUERY_MEMORY_PER_NODE_KEY, then the result of the computeOperatorMemory() is 60MB.
    
    However, as I said this memory is sufficient for the query to run successfully without spilling.
    
    Then I tried out setting MAX_QUERY_MEMORY_PER_NODE_KEY to as low as 10MB so that this query should spill and fail with an exception (for this testcase). But even after setting so low the output of the computeOperatorMemory is 40MB. This is because of the default value of MIN_MEMORY_PER_BUFFERED_OP is 40MB. Hence, 40MB is used by the sort to sort the rows. This memory is enough for the sort to successfully run without an exception.
    
    Thats when I used the setting EXTERNAL_SORT_MAX_MEMORY to set low values so that Sort for sure can spill and throw the exception which is being tested by this testcase.


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167743599
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    The need to set this property suggest something is wrong with the next one. We set max query memory per node to 60 MB. Then, the `MemoryAllocationUtilitites` is supposed to divide that up. If you have to override that, then something is amiss. We should track down that problem.
    
    A good place to start is to turn on logging for this test (using the log fixture.) The sort will dump its configured memory. If it is still 2 GB, then debug the memory allocation utilities.
    
    Perhaps it is as simple as using the wrong option below...


---

[GitHub] drill issue #1120: DILL-6148: TestSortSpillWithException is sometimes failin...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on the issue:

    https://github.com/apache/drill/pull/1120
  
    @Ben-Zvi  Can you please review this change.


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1120


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167957209
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    @paul-rogers  Thank you. I have added the comment for the testcase. Please let me know if any other changes are required.


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by HanumathRao <gi...@git.apache.org>.
Github user HanumathRao commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167760816
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    @paul-rogers  Thank you for the review. In this case, the memory configured per node is 60MB and the maxParallelization is set to 1. The min buffered op memory is 40MB by default.
       * Minimum memory allocated to each buffered operator instance.
       * <p/>
       * DEFAULT: 40 MB
    
    In my branch (which has got other changes also) 40MB is being correctly set for the sort to use. However, this memory is sufficient for the concerned query to not to spill the data to disk and it is executing fine without triggering the exception mechanism.
    
    I tried to set max_per_node memory to 10MB but even after setting the memorycomputation utilities will only set sort memory to 40MB. Hence it is of no use in this case.
    
    Therefore, I have used this option so that this test will consistently pass. Please let me know if I am missing anything here.


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167929421
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    Please insert a comment to this effect in the code. This is a regression introduced when we added the memory floor feature. Explain that we need to turn off the floor (or lower it) for this test case. Thanks.


---

[GitHub] drill pull request #1120: DRILL-6148: TestSortSpillWithException is sometime...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1120#discussion_r167766107
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestSortSpillWithException.java ---
    @@ -59,6 +59,7 @@ public static void setup() throws Exception {
         ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher)
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_THRESHOLD, 1) // Unmanaged
             .configProperty(ExecConstants.EXTERNAL_SORT_SPILL_GROUP_SIZE, 1) // Unmanaged
    +        .configProperty(ExecConstants.EXTERNAL_SORT_MAX_MEMORY, 10 * 1024 * 1024) //use less memory for sorting.
    --- End diff --
    
    There is something very wrong here. This means that the memory calcs are not working and users will try to sort in unusually small amounts of memory. The query probably has one sort. It should have just a few threads. (We might want to control this if we don't now.) Take the 60 MB of memory, divide by the number of sorts (1) and you get 60 MB for the sort in this case. If we are getting 40, then something is wrong. Let's find out where things went off the rails.
    
    The minimum buffer op memory is supposed to set a floor for those cases in which a query has, say, 5 sorts on a machine with 30 cores and we try to divide the 2 GB by 150 and end up trying to sort in, say, 13 MB of memory. 


---