You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "cryptoe (via GitHub)" <gi...@apache.org> on 2023/03/24 12:22:00 UTC

[GitHub] [druid] cryptoe opened a new pull request, #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

cryptoe opened a new pull request, #13974:
URL: https://github.com/apache/druid/pull/13974

   While using `intermediateSuperSorterStorageMaxLocalBytes` the super sorter was retaining references of the memory allocator. 
   ![Screenshot 2023-03-24 at 5 50 43 PM](https://user-images.githubusercontent.com/2260045/227519789-acff9bb3-6e81-4f20-a5e2-682dba52e8a4.png)
   
   The fix clears the current `outputChannel` when close() is called on the `[ComposingWritableFrameChannel.java](https://github.com/apache/druid/compare/master...cryptoe:ComposingOutputChannelFIx?expand=1#diff-7b8f1d92788e97dfa04896856c99acc08bb2b62b2bc56ec556967c8d61973e04)`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1147781573


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -105,6 +100,16 @@ public void write(FrameWithPartition frameWithPartition) throws IOException
     }
   }
 
+  private void convertChannelSuppliersToReadOnly(int index)

Review Comment:
   could create a common interface for `PartitionedOutputChannel` and `OutputChannel` which encapsulates `convertToReadOnly` - that way we can only give a single instance of the common interface to this object. 
   Also, is there a need to have `convertToReadOnly` in the `ComposingOutputChannelFactory` now that we close the channels as soon as the writes for them are done? cc @LakshSingla 
   These things could be followups.



##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -118,14 +123,17 @@ public void close() throws IOException
   {
     if (currentIndex < writableChannelSuppliers.size()) {
       writableChannelSuppliers.get(currentIndex).get().close();
+      convertChannelSuppliersToReadOnly(currentIndex);
       currentIndex = writableChannelSuppliers.size();
     }
   }
 
   @Override
   public boolean isClosed()
   {
-    return currentIndex == writableChannelSuppliers.size();
+    return currentIndex == writableChannelSuppliers.size() || writableChannelSuppliers.get(currentIndex)

Review Comment:
   Can you please explain the reason for adding the extra check?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "LakshSingla (via GitHub)" <gi...@apache.org>.
LakshSingla commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1147820956


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -118,14 +123,17 @@ public void close() throws IOException
   {
     if (currentIndex < writableChannelSuppliers.size()) {
       writableChannelSuppliers.get(currentIndex).get().close();
+      convertChannelSuppliersToReadOnly(currentIndex);
       currentIndex = writableChannelSuppliers.size();
     }
   }
 
   @Override
   public boolean isClosed()
   {
-    return currentIndex == writableChannelSuppliers.size();
+    return currentIndex == writableChannelSuppliers.size() || writableChannelSuppliers.get(currentIndex)

Review Comment:
   While discussing with @cryptoe, we had the same assumption, however on revisiting the implementation of close(), we are manually setting the index to `writableChannelSuppliers.size()` on calling it, so this extra check isn't reached / needed. 



##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -105,6 +100,16 @@ public void write(FrameWithPartition frameWithPartition) throws IOException
     }
   }
 
+  private void convertChannelSuppliersToReadOnly(int index)

Review Comment:
   Regarding the common interface, I assume that there can be a lot more common code paths. OutputChannel is essentially a PartitionedOutputChannel where the partition is fixed. This would require refactoring of ReadableFrameChannel as well.
   
   We can do without storing the readOnly() version in the `ComposingOutputChannelFactory`, though I think we can leave it as is, since its immutable and conveys the intent that we on fetching the readable channel, we don't want other references to be stored (if they are). WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1149157434


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -105,6 +100,16 @@ public void write(FrameWithPartition frameWithPartition) throws IOException
     }
   }
 
+  private void convertChannelSuppliersToReadOnly(int index)

Review Comment:
   The reason for extracting out `convertToReadOnly` to an interface was that currently we supply the whole output channel objects to a writable frame channel, which seems like an overkill. Since the only thing we want is the ability to release resources of the output channel, only that part can be exposed to the writable channel.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe merged pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe merged PR #13974:
URL: https://github.com/apache/druid/pull/13974


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1147790176


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -105,6 +100,16 @@ public void write(FrameWithPartition frameWithPartition) throws IOException
     }
   }
 
+  private void convertChannelSuppliersToReadOnly(int index)

Review Comment:
   Yes that is part of a refactor that can be done as a follow up. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1149157434


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -105,6 +100,16 @@ public void write(FrameWithPartition frameWithPartition) throws IOException
     }
   }
 
+  private void convertChannelSuppliersToReadOnly(int index)

Review Comment:
   The reason for suggesting extracting out `convertToReadOnly` to an interface was that currently we supply the whole output channel objects to a writable frame channel, which seems like an overkill. Since the only thing we want is the ability to release resources of the output channel, only that part can be exposed to the writable channel.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1150650059


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -118,14 +123,17 @@ public void close() throws IOException
   {
     if (currentIndex < writableChannelSuppliers.size()) {
       writableChannelSuppliers.get(currentIndex).get().close();
+      convertChannelSuppliersToReadOnly(currentIndex);
       currentIndex = writableChannelSuppliers.size();
     }
   }
 
   @Override
   public boolean isClosed()
   {
-    return currentIndex == writableChannelSuppliers.size();
+    return currentIndex == writableChannelSuppliers.size() || writableChannelSuppliers.get(currentIndex)

Review Comment:
   I have removed the additional check for now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "cryptoe (via GitHub)" <gi...@apache.org>.
cryptoe commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1147789654


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -118,14 +123,17 @@ public void close() throws IOException
   {
     if (currentIndex < writableChannelSuppliers.size()) {
       writableChannelSuppliers.get(currentIndex).get().close();
+      convertChannelSuppliersToReadOnly(currentIndex);
       currentIndex = writableChannelSuppliers.size();
     }
   }
 
   @Override
   public boolean isClosed()
   {
-    return currentIndex == writableChannelSuppliers.size();
+    return currentIndex == writableChannelSuppliers.size() || writableChannelSuppliers.get(currentIndex)

Review Comment:
   Since this class does not own the `WritableFrameChannel` which are passed by the suppliers, the passed channel might be closed, externally and hence this check . 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on a diff in pull request #13974: OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set

Posted by "rohangarg (via GitHub)" <gi...@apache.org>.
rohangarg commented on code in PR #13974:
URL: https://github.com/apache/druid/pull/13974#discussion_r1149154930


##########
processing/src/main/java/org/apache/druid/frame/channel/ComposingWritableFrameChannel.java:
##########
@@ -118,14 +123,17 @@ public void close() throws IOException
   {
     if (currentIndex < writableChannelSuppliers.size()) {
       writableChannelSuppliers.get(currentIndex).get().close();
+      convertChannelSuppliersToReadOnly(currentIndex);
       currentIndex = writableChannelSuppliers.size();
     }
   }
 
   @Override
   public boolean isClosed()
   {
-    return currentIndex == writableChannelSuppliers.size();
+    return currentIndex == writableChannelSuppliers.size() || writableChannelSuppliers.get(currentIndex)

Review Comment:
   I think if we want to check for passed channel being closed, we could check that before initialising them for writes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org