Posted to issues@orc.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2019/06/28 18:45:00 UTC

[jira] [Assigned] (ORC-525) Users must close Readers in ORC 1.5.6

     [ https://issues.apache.org/jira/browse/ORC-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned ORC-525:
---------------------------------

    Assignee: Owen O'Malley
     Summary: Users must close Readers in ORC 1.5.6  (was: OrcFile.createReader may leave open stream)

Spark users noticed that Spark generated twice as many file opens on the NameNode as Hive did in this case:

{code}
Reader reader = OrcFile.createReader(...);
RecordReader rows = reader.rows(...);
{code} 

This was fixed in ORC-498, but applications must now close any Reader object that is not used to create a RecordReader.
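
To illustrate the consequence for callers, here is a minimal, self-contained sketch of the ownership pattern described above. It does not use the ORC classes: Handle, Reader and RecordReader below are hypothetical stand-ins, and the hand-off of the open handle from the Reader to the RecordReader is an assumption about how the fix behaves, not a copy of the ORC implementation. With the real library the same shape would presumably apply to OrcFile.createReader(...) and reader.rows(...), assuming Reader can be closed in 1.5.6 as the summary implies.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderCloseSketch {
    // Number of handles currently open; stands in for what a debug filesystem would track.
    static final AtomicInteger OPEN_HANDLES = new AtomicInteger();

    /** Hypothetical stand-in for an open file stream. */
    static final class Handle implements Closeable {
        private boolean closed;
        Handle() { OPEN_HANDLES.incrementAndGet(); }
        @Override public void close() {
            if (!closed) { closed = true; OPEN_HANDLES.decrementAndGet(); }
        }
    }

    /** Hypothetical stand-in for a Reader: opens the file once when constructed. */
    static final class Reader implements Closeable {
        private Handle handle = new Handle();

        /** Assumed behavior: pass the already-open handle on instead of reopening the file. */
        RecordReader rows() {
            Handle h = (handle != null) ? handle : new Handle();
            handle = null;
            return new RecordReader(h);
        }

        /** Closes the handle only if no RecordReader has taken it over. */
        @Override public void close() {
            if (handle != null) { handle.close(); handle = null; }
        }
    }

    /** Hypothetical stand-in for a RecordReader: owns the handle it was given. */
    static final class RecordReader implements Closeable {
        private final Handle handle;
        RecordReader(Handle h) { handle = h; }
        @Override public void close() { handle.close(); }
    }

    /** Metadata-only use without close(): returns the number of handles left open. */
    public static int metadataOnlyWithoutClose() {
        int before = OPEN_HANDLES.get();
        new Reader(); // never closed and never used for rows()
        return OPEN_HANDLES.get() - before;
    }

    /** Metadata-only use with try-with-resources: returns the number of handles left open. */
    public static int metadataOnlyWithClose() {
        int before = OPEN_HANDLES.get();
        try (Reader reader = new Reader()) {
            // read schema, statistics, etc.
        }
        return OPEN_HANDLES.get() - before;
    }

    /** Reading rows with both objects in try-with-resources. */
    public static int readRows() {
        int before = OPEN_HANDLES.get();
        try (Reader reader = new Reader(); RecordReader rows = reader.rows()) {
            // iterate over rows
        }
        return OPEN_HANDLES.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(metadataOnlyWithoutClose()); // prints 1
        System.out.println(metadataOnlyWithClose());    // prints 0
        System.out.println(readRows());                 // prints 0
    }
}
```

Putting both objects in try-with-resources is safe in either case, because closing a stand-in Reader that has already handed off its handle is a no-op in this sketch.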

> Users must close Readers in ORC 1.5.6
> -------------------------------------
>
>                 Key: ORC-525
>                 URL: https://issues.apache.org/jira/browse/ORC-525
>             Project: ORC
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 1.5.6
>            Reporter: Dongjoon Hyun
>            Assignee: Owen O'Malley
>            Priority: Major
>
> Unlike 1.5.5, ORC 1.5.6-rc1 seems to have the following issue, which causes Apache Spark unit test failures.
> {code}
> $ build/sbt "sql/testOnly *.SQLQuerySuite"
> ...
> [info]   Cause: java.lang.Throwable:
> [info]   at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
> [info]   at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
> [info]   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
> [info]   at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:535)
> [info]   at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:368)
> [info]   at org.apache.orc.OrcFile.createReader(OrcFile.java:343)
> {code}
> ORC-498 (ReaderImpl and RecordReaderImpl open separate file handles) seems to be the cause of this change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)