You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/09/29 09:29:51 UTC

[GitHub] [iceberg] Heltman opened a new pull request, #5887: Core: Clear queue and future task when close ParallelIterable

Heltman opened a new pull request, #5887:
URL: https://github.com/apache/iceberg/pull/5887

   Fixes #5886 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1276821865

   Thanks, @Heltman! And thanks for reviewing, @nastra!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1276377042

   Looks good to me. Will merge when tests are passing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991088413


##########
core/src/test/java/org/apache/iceberg/TestManifestGroupPlanFiles.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.iceberg.io.CloseableIterator;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+@RunWith(Parameterized.class)
+public class TestManifestGroupPlanFiles extends TestManifestReader {
+  @Parameterized.Parameters(name = "formatVersion = {0}")
+  public static Object[] parameters() {
+    return new Object[] {1, 2};
+  }
+
+  public TestManifestGroupPlanFiles(int formatVersion) {
+    super(formatVersion);
+  }
+
+  @Test
+  public void testCloseParallelIteratorWithoutCompleteIteration()

Review Comment:
   @nastra I push new commit, just like your suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991019523


##########
core/src/test/java/org/apache/iceberg/TestManifestGroupPlanFiles.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.iceberg.io.CloseableIterator;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+@RunWith(Parameterized.class)
+public class TestManifestGroupPlanFiles extends TestManifestReader {
+  @Parameterized.Parameters(name = "formatVersion = {0}")
+  public static Object[] parameters() {
+    return new Object[] {1, 2};
+  }
+
+  public TestManifestGroupPlanFiles(int formatVersion) {
+    super(formatVersion);
+  }
+
+  @Test
+  public void testCloseParallelIteratorWithoutCompleteIteration()

Review Comment:
   I think it would be better to create a `TestParallelIterable` class (similar to how we have `TestCloseableIterable`) and add the test there without any dependencoes to `ManiFestFile` / `ManifestGroup`.
   It would then look like this:
   
   ```
   @Test
     public void closeParallelIteratorWithoutCompleteIteration()
         throws IOException, IllegalAccessException, NoSuchFieldException {
       ExecutorService executor = Executors.newFixedThreadPool(1);
   
       Iterable<CloseableIterable<Integer>> transform =
           Iterables.transform(
               Lists.newArrayList(1, 2, 3, 4, 5),
               item ->
                   new CloseableIterable<Integer>() {
                     @Override
                     public void close() {}
   
                     @Override
                     public CloseableIterator<Integer> iterator() {
                       return CloseableIterator.withClose(Collections.singletonList(item).iterator());
                     }
                   });
   
       ParallelIterable<Integer> parallelIterable = new ParallelIterable<>(transform, executor);
       CloseableIterator<Integer> iterator = parallelIterable.iterator();
       Field queueField = iterator.getClass().getDeclaredField("queue");
       queueField.setAccessible(true);
       ConcurrentLinkedQueue<FileScanTask> queue = (ConcurrentLinkedQueue<FileScanTask>) queueField.get(iterator);
   
       assertThat(iterator.hasNext()).isTrue();
       assertThat(iterator.next()).isNotNull();
       assertThat(queue).isNotEmpty();
   
       iterator.close();
       assertThat(queue).isEmpty();
     }
   ```
   
   Note that this uses assertJ assertions (`assertThat(queue).isEmpty()`) as this will print the content of the queue in case the assertion ever fails (which makes debugging much easier)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1262066354

   > @Heltman thanks for your contribution. Could you please add some unit tests?
   Sure, I will try.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1272751881

   @nastra I add a unit test for clear queue result, do I need more test or any change?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lirui-apache commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by "lirui-apache (via GitHub)" <gi...@apache.org>.
lirui-apache commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1693101384

   Hey @Heltman , I wonder whether the changes made here are still just best effort to cancel the future tasks. Suppose `close()` and `submitNextTask()` are called concurrently, it's possible that `close()` is invoked after `submitNextTask()` checks the `closed` flag, but before  it can really submit a new task. Then the newly submitted task can run and add elements to the queue, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991035716


##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -81,13 +81,18 @@ private ParallelIterator(
 
     @Override
     public void close() {
+      // close first, avoid new task submit
+      this.closed = true;
+
       // cancel background tasks
       for (int i = 0; i < taskFutures.length; i += 1) {
-        if (taskFutures[i] != null && !taskFutures[i].isDone()) {
-          taskFutures[i].cancel(true);
+        Future<?> taskFuture = taskFutures[i];
+        if (taskFuture != null && !taskFuture.isDone()) {
+          taskFuture.cancel(true);
         }
       }
-      this.closed = true;
+      // clean queue
+      this.queue.clear();

Review Comment:
   @rdblue do you remember maybe why the queue wasn't cleared originally when close was called?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991088413


##########
core/src/test/java/org/apache/iceberg/TestManifestGroupPlanFiles.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.iceberg.io.CloseableIterator;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+@RunWith(Parameterized.class)
+public class TestManifestGroupPlanFiles extends TestManifestReader {
+  @Parameterized.Parameters(name = "formatVersion = {0}")
+  public static Object[] parameters() {
+    return new Object[] {1, 2};
+  }
+
+  public TestManifestGroupPlanFiles(int formatVersion) {
+    super(formatVersion);
+  }
+
+  @Test
+  public void testCloseParallelIteratorWithoutCompleteIteration()

Review Comment:
   @nastra I push new commit, just like your suggestion. But I put it in core `org.apache.iceberg.io` dir like `TestCloseableIterable`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991100201


##########
core/src/test/java/org/apache/iceberg/io/TestParallelIterable.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.io;

Review Comment:
   OK



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991082129


##########
core/src/test/java/org/apache/iceberg/TestManifestGroupPlanFiles.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.iceberg.io.CloseableIterator;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+@RunWith(Parameterized.class)
+public class TestManifestGroupPlanFiles extends TestManifestReader {
+  @Parameterized.Parameters(name = "formatVersion = {0}")
+  public static Object[] parameters() {
+    return new Object[] {1, 2};
+  }
+
+  public TestManifestGroupPlanFiles(int formatVersion) {
+    super(formatVersion);
+  }
+
+  @Test
+  public void testCloseParallelIteratorWithoutCompleteIteration()

Review Comment:
   it's here: https://github.com/apache/iceberg/blob/master/api/src/test/java/org/apache/iceberg/io/TestCloseableIterable.java#L38. Given that `ParallelIterable` is in the `iceberg-core` module under `org.apache.iceberg.util` you can just add `TestParallelIterable` under `org.apache.iceberg.util` in the `iceberg-core` test module



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991100685


##########
core/src/test/java/org/apache/iceberg/util/TestParallelIterable.java:
##########
@@ -0,0 +1,72 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   can you run `./gradlew spotlessApply` to get the license header fixed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991058325


##########
core/src/test/java/org/apache/iceberg/TestManifestGroupPlanFiles.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.iceberg.io.CloseableIterator;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+@RunWith(Parameterized.class)
+public class TestManifestGroupPlanFiles extends TestManifestReader {
+  @Parameterized.Parameters(name = "formatVersion = {0}")
+  public static Object[] parameters() {
+    return new Object[] {1, 2};
+  }
+
+  public TestManifestGroupPlanFiles(int formatVersion) {
+    super(formatVersion);
+  }
+
+  @Test
+  public void testCloseParallelIteratorWithoutCompleteIteration()

Review Comment:
   This is a good idea. At first, I thought about `TestParallelIterable `, but I couldn't find `TestCloseableIterable` as reference. 
   Because I don't know much about the iceberg test specification, so I just wrote a test referring to `TestManifestReader`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#issuecomment-1262055257

   @Heltman thanks for your contribution. Could you please add some unit tests?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] nastra commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
nastra commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991095816


##########
core/src/test/java/org/apache/iceberg/io/TestParallelIterable.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.io;

Review Comment:
   should be under `org.apache.iceberg.util`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Heltman commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
Heltman commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r991103327


##########
core/src/test/java/org/apache/iceberg/util/TestParallelIterable.java:
##########
@@ -0,0 +1,72 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   Sorry I forget run `./gradlew :iceberg-core:spotlessApply`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #5887:
URL: https://github.com/apache/iceberg/pull/5887


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #5887: Core: Clear queue and future task when close ParallelIterable

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #5887:
URL: https://github.com/apache/iceberg/pull/5887#discussion_r993611914


##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -81,13 +81,18 @@ private ParallelIterator(
 
     @Override
     public void close() {
+      // close first, avoid new task submit
+      this.closed = true;
+
       // cancel background tasks
       for (int i = 0; i < taskFutures.length; i += 1) {
-        if (taskFutures[i] != null && !taskFutures[i].isDone()) {
-          taskFutures[i].cancel(true);
+        Future<?> taskFuture = taskFutures[i];
+        if (taskFuture != null && !taskFuture.isDone()) {
+          taskFuture.cancel(true);
         }
       }
-      this.closed = true;
+      // clean queue
+      this.queue.clear();

Review Comment:
   I think the reason is that we didn't expect the iterator to last very long before getting garbage collected. It seems reasonable to clean up as early as possible though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org