You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/28 18:44:11 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4406: Add integration test for erroring when memory limits are hit

alamb commented on code in PR #4406:
URL: https://github.com/apache/arrow-datafusion/pull/4406#discussion_r1033924657


##########
datafusion/core/src/physical_plan/sorts/sort.rs:
##########
@@ -272,6 +273,13 @@ impl MemoryConsumer for ExternalSorter {
     }
 
     async fn spill(&self) -> Result<usize> {
+        let partition = self.partition_id();
+        let mut in_mem_batches = self.in_mem_batches.lock().await;
+        // we could always get a chance to free some memory as long as we are holding some
+        if in_mem_batches.len() == 0 {
+            return Ok(0);
+        }
+
         debug!(

Review Comment:
   I just moved the debug message down as it was confusing that spill didn't actually happen if `in_mem_batches` was empty



##########
datafusion/core/src/execution/memory_manager/proxy.rs:
##########
@@ -88,9 +88,10 @@ impl MemoryConsumer for MemoryConsumerProxy {
     }
 
     async fn spill(&self) -> Result<usize, DataFusionError> {
-        Err(DataFusionError::ResourcesExhausted(
-            "Cannot spill AggregationState".to_owned(),
-        ))
+        Err(DataFusionError::ResourcesExhausted(format!(
+            "Cannot spill {}",

Review Comment:
   Improved error message here



##########
datafusion/core/tests/join_fuzz.rs:
##########
@@ -200,24 +199,14 @@ fn make_staggered_batches(len: usize) -> Vec<RecordBatch> {
     let input4 = Int32Array::from_iter_values(input4.into_iter());
 
     // split into several record batches
-    let mut remainder = RecordBatch::try_from_iter(vec![
+    let batch = RecordBatch::try_from_iter(vec![
         ("a", Arc::new(input1) as ArrayRef),
         ("b", Arc::new(input2) as ArrayRef),
         ("x", Arc::new(input3) as ArrayRef),
         ("y", Arc::new(input4) as ArrayRef),
     ])
     .unwrap();
 
-    let mut batches = vec![];
-
     // use a random number generator to pick a random sized output
-    let mut rng = StdRng::seed_from_u64(42);

Review Comment:
   This piece of code was copied in several places so I refactored it into `stagger_batch`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org