Posted to commits@parquet.apache.org by ju...@apache.org on 2016/12/05 23:27:20 UTC

parquet-mr git commit: PARQUET-783: Close the underlying stream when an H2SeekableInputStream is closed

Repository: parquet-mr
Updated Branches:
  refs/heads/master 4453aa3bf -> 09d28fe79


PARQUET-783: Close the underlying stream when an H2SeekableInputStream is closed

This PR addresses https://issues.apache.org/jira/browse/PARQUET-783.

`ParquetFileReader` opens a `SeekableInputStream` to read a footer. In the process, it opens a new `FSDataInputStream` and wraps it. However, `H2SeekableInputStream` does not override the `close` method. Therefore, when `ParquetFileReader` closes the wrapper, the underlying `FSDataInputStream` is not closed. As a result, these stale connections can exhaust the connection resources of a cluster's data nodes and lead to hard-to-diagnose read failures in HDFS clients, e.g.

```
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
```
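
The leak described above can be reproduced without Hadoop. The following is a minimal sketch, assuming the wrapper inherits the default no-op `close()` from `java.io.InputStream`. `TrackingStream`, `LeakyWrapper`, `ClosingWrapper`, and `closesUnderlying` are hypothetical names used only for illustration; `TrackingStream` stands in for `FSDataInputStream` and is not part of parquet-mr or Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class CloseDelegationSketch {

  // Hypothetical stand-in for FSDataInputStream: records whether close() ran.
  static class TrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingStream(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Wrapper that does not override close(): InputStream.close() does nothing,
  // so the wrapped stream is never closed.
  static class LeakyWrapper extends InputStream {
    private final InputStream stream;

    LeakyWrapper(InputStream stream) {
      this.stream = stream;
    }

    @Override
    public int read() throws IOException {
      return stream.read();
    }
  }

  // Wrapper that delegates close() to the wrapped stream, as this commit does.
  static class ClosingWrapper extends InputStream {
    private final InputStream stream;

    ClosingWrapper(InputStream stream) {
      this.stream = stream;
    }

    @Override
    public int read() throws IOException {
      return stream.read();
    }

    @Override
    public void close() throws IOException {
      stream.close();
    }
  }

  // Closes a wrapper and reports whether the underlying stream was closed too.
  static boolean closesUnderlying(boolean overrideClose) {
    TrackingStream underlying = new TrackingStream(new byte[] {1, 2, 3});
    InputStream wrapper =
        overrideClose ? new ClosingWrapper(underlying) : new LeakyWrapper(underlying);
    try {
      wrapper.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return underlying.closed;
  }

  public static void main(String[] args) {
    System.out.println("without override: " + closesUnderlying(false));
    System.out.println("with override: " + closesUnderlying(true));
  }
}
```

Running `main` prints `without override: false` followed by `with override: true`. With a real `FSDataInputStream`, the first case corresponds to a data node connection that stays open after the reader believes it has released the stream.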

Author: Michael Allman <mi...@videoamp.com>

Closes #388 from mallman/parquet-783-close_underlying_inputstream and squashes the following commits:

f4b27c1 [Michael Allman] PARQUET-783 Close the underlying stream when an H2SeekableInputStream is closed


Project: http://git-wip-us.apache.org/repos/asf/parquet-mr/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-mr/commit/09d28fe7
Tree: http://git-wip-us.apache.org/repos/asf/parquet-mr/tree/09d28fe7
Diff: http://git-wip-us.apache.org/repos/asf/parquet-mr/diff/09d28fe7

Branch: refs/heads/master
Commit: 09d28fe7995db1a4da2c651d362007d2082c663c
Parents: 4453aa3
Author: Michael Allman <mi...@videoamp.com>
Authored: Mon Dec 5 15:27:14 2016 -0800
Committer: Julien Le Dem <ju...@dremio.com>
Committed: Mon Dec 5 15:27:14 2016 -0800

----------------------------------------------------------------------
 .../org/apache/parquet/hadoop/util/H2SeekableInputStream.java   | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/parquet-mr/blob/09d28fe7/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
----------------------------------------------------------------------
diff --git a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
index a706546..ec4567e 100644
--- a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
+++ b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
@@ -45,6 +45,11 @@ class H2SeekableInputStream extends SeekableInputStream {
   }
 
   @Override
+  public void close() throws IOException {
+    stream.close();
+  }
+
+  @Override
   public long getPos() throws IOException {
     return stream.getPos();
   }