Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/02/06 18:19:11 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5197: fix(MemTable): make it cancel-safe and fix parallelism

alamb commented on code in PR #5197:
URL: https://github.com/apache/arrow-datafusion/pull/5197#discussion_r1097757843


##########
datafusion/core/src/datasource/memory.rs:
##########
@@ -76,20 +77,26 @@ impl MemTable {
             .map(|part_i| {
                 let task = state.task_ctx();
                 let exec = exec.clone();
-                tokio::spawn(async move {
+                let task = tokio::spawn(async move {
                     let stream = exec.execute(part_i, task)?;
                     common::collect(stream).await
-                })
+                });
+
+                AbortOnDropSingle::new(task)
             })
             // this collect *is needed* so that the join below can
             // switch between tasks
             .collect::<Vec<_>>();
 
         let mut data: Vec<Vec<RecordBatch>> =
             Vec::with_capacity(exec.output_partitioning().partition_count());
-        for task in tasks {
-            let result = task.await.expect("MemTable::load could not join task")?;
-            data.push(result);
+
+        for result in futures::future::join_all(tasks).await {
+            data.push(result.map_err(|_| {
+                DataFusionError::Internal(
+                    "MemTable::load could not join task".to_string(),
+                )

Review Comment:
   I think it would help to preserve the original error here rather than replacing it with an internal error, so the underlying cause of the join failure is not lost.
   
   Perhaps use `DataFusionError::Context`:
   ```suggestion
               data.push(result.map_err(|e| {
                   DataFusionError::Context(
                       "MemTable::load could not join task".to_string(), Box::new(e)
                   )
   ```
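
To illustrate the difference, here is a minimal self-contained sketch of discarding an error versus wrapping it with context. `SketchError`, `JoinFailure`, `lossy`, and `preserving` are hypothetical stand-ins written for this example; they are not the DataFusion or tokio types. The sketch also assumes the join error is not already the project's own error type, so it is first boxed into an `External`-style variant before the context is attached.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for a project error enum (NOT the real DataFusionError).
#[derive(Debug)]
enum SketchError {
    Internal(String),
    External(Box<dyn Error + Send + Sync>),
    Context(String, Box<SketchError>),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Internal(msg) => write!(f, "Internal error: {}", msg),
            SketchError::External(err) => write!(f, "External error: {}", err),
            SketchError::Context(ctx, err) => write!(f, "{}\ncaused by\n{}", ctx, err),
        }
    }
}

impl Error for SketchError {}

// Hypothetical stand-in for a task join failure (e.g. a panicked or cancelled task).
#[derive(Debug)]
struct JoinFailure(String);

impl fmt::Display for JoinFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "join failure: {}", self.0)
    }
}

impl Error for JoinFailure {}

// Discards the original error: only the fixed message survives.
fn lossy(result: Result<u32, JoinFailure>) -> Result<u32, SketchError> {
    result.map_err(|_| SketchError::Internal("load could not join task".to_string()))
}

// Keeps the original error as the cause and adds a context message on top.
fn preserving(result: Result<u32, JoinFailure>) -> Result<u32, SketchError> {
    result.map_err(|e| {
        SketchError::Context(
            "load could not join task".to_string(),
            Box::new(SketchError::External(Box::new(e))),
        )
    })
}
```

With the `preserving` variant, the rendered message contains both the context string and the text of the original failure, which is the information the `|_|` closure throws away.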
   
   
   