Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/02/17 18:10:00 UTC

[jira] [Commented] (DRILL-8139) Parquet CodecFactory thread safety bug

    [ https://issues.apache.org/jira/browse/DRILL-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494132#comment-17494132 ] 

ASF GitHub Bot commented on DRILL-8139:
---------------------------------------

vdiravka commented on a change in pull request #2463:
URL: https://github.com/apache/drill/pull/2463#discussion_r809331597



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/compression/AirliftBytesInputCompressor.java
##########
@@ -159,13 +160,16 @@ public void decompress(ByteBuffer input, int compressedSize, ByteBuffer output,
 
   @Override
   public void release() {
-    logger.debug(
-        "will release {} allocated buffers.",
-        this.allocatedBuffers.size()
-    );
+    int bufCount  = allocatedBuffers.size();
 
-    while (!this.allocatedBuffers.isEmpty()) {
-      this.allocator.release(allocatedBuffers.pop());
+    // LIFO release order to try to reduce memory fragmentation.
+    int i = 0;
+    while (!allocatedBuffers.isEmpty()) {
+      allocator.release(allocatedBuffers.pop());
+      i++;
     }
+    assert bufCount == i;

Review comment:
       It is better to always include an assertion message. It makes the failure easier to understand from the stack trace.
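
       A minimal sketch of what that could look like, assuming a simplified stand-in for the compressor: the class name `ReleaseSketch`, the `allocate` helper, and the message wording are hypothetical and not taken from the PR, and the call to the real allocator is omitted. Only the `assert condition : message` form is the point here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified stand-in for the compressor in the diff above.
public class ReleaseSketch {
  // Stack of buffers handed out so far (assumed structure, for illustration only).
  private final Deque<ByteBuffer> allocatedBuffers = new ArrayDeque<>();

  // Hypothetical helper so the sketch has something to release.
  public void allocate(int size) {
    allocatedBuffers.push(ByteBuffer.allocate(size));
  }

  // Pops every buffer in LIFO order and returns how many were released.
  public int release() {
    int bufCount = allocatedBuffers.size();
    int released = 0;
    while (!allocatedBuffers.isEmpty()) {
      allocatedBuffers.pop(); // the real code would return this buffer to its allocator
      released++;
    }
    // The expression after ':' becomes the AssertionError detail message,
    // so both counts show up in the stack trace when assertions are enabled (-ea).
    assert bufCount == released
        : String.format("Expected to release %d buffers but released %d", bufCount, released);
    return released;
  }
}
```

       The message is only evaluated when the assertion fails, so the `String.format` call adds no cost on the normal path.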




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



> Parquet CodecFactory thread safety bug
> --------------------------------------
>
>                 Key: DRILL-8139
>                 URL: https://issues.apache.org/jira/browse/DRILL-8139
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.19.0
>            Reporter: James Turton
>            Assignee: James Turton
>            Priority: Blocker
>         Attachments: recording.mp4
>
>
> Update: PARQUET-2126 has been opened to describe the underlying thread safety problem in parquet-mr.  The rdblue/brotli-codec is also affected.
> In previously released versions of Drill, back to at least 1.17, this bug only appears under the combination of the async column reader and the _sync_ page reader, as per the reproduction script below.  In master, the bug appears under the async column reader and both the sync and async page readers.
> {code:java}
> set `store.parquet.compression` = 'gzip';
> drop table if exists dfs.tmp.m;
> create table dfs.tmp.m as select * from cp.`tpch/supplier.parquet`;
> set `store.parquet.reader.pagereader.async` = false;
> set `store.parquet.reader.columnreader.async` = true;
> select * from dfs.tmp.m order by s_suppkey; -- repeat this last query and watch the returned data. Eventually you will also see failed queries or JVM crashes.
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)