Posted to issues@orc.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2019/06/28 18:45:00 UTC
[jira] [Assigned] (ORC-525) Users must close Readers in ORC 1.5.6
[ https://issues.apache.org/jira/browse/ORC-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley reassigned ORC-525:
---------------------------------
Assignee: Owen O'Malley
Summary: Users must close Readers in ORC 1.5.6 (was: OrcFile.createReader may leave open stream)
Spark users noticed that, with this pattern, Spark generated twice as many file opens against the NameNode as Hive did:
{code}
Reader reader = OrcFile.createReader(...);
RecordReader rows = reader.rows(...);
{code}
This was fixed in ORC-498, but applications must now close any Reader objects that are not used to create RecordReaders.
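A minimal sketch of the pattern this implies, assuming Reader implements java.io.Closeable after ORC-498 (the file path and bare Configuration here are hypothetical):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;

// Hypothetical path and configuration for illustration only.
Configuration conf = new Configuration();
Path path = new Path("/tmp/example.orc");

// When the Reader is used only to inspect metadata and no RecordReader
// is ever created from it, try-with-resources ensures the underlying
// file stream is closed.
try (Reader reader = OrcFile.createReader(path,
                                          OrcFile.readerOptions(conf))) {
  System.out.println("rows: " + reader.getNumberOfRows());
}
{code}

When a RecordReader is created via reader.rows(...), closing that RecordReader releases its file handle as before; the new obligation applies to Readers that never produce one.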
> Users must close Readers in ORC 1.5.6
> -------------------------------------
>
> Key: ORC-525
> URL: https://issues.apache.org/jira/browse/ORC-525
> Project: ORC
> Issue Type: Bug
> Components: Java
> Affects Versions: 1.5.6
> Reporter: Dongjoon Hyun
> Assignee: Owen O'Malley
> Priority: Major
>
> Unlike 1.5.5, ORC 1.5.6-rc1 seems to have the following issue, which causes Apache Spark unit test failures.
> {code}
> $ build/sbt "sql/testOnly *.SQLQuerySuite"
> ...
> [info] Cause: java.lang.Throwable:
> [info] at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
> [info] at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
> [info] at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
> [info] at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:535)
> [info] at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:368)
> [info] at org.apache.orc.OrcFile.createReader(OrcFile.java:343)
> {code}
> ORC-498 (ReaderImpl and RecordReaderImpl open separate file handles) seems to be the reason for this change.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)