Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/09/17 11:33:21 UTC
[jira] [Commented] (SPARK-17573) Why don't we close the input/output Streams
[ https://issues.apache.org/jira/browse/SPARK-17573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498842#comment-15498842 ]
Sean Owen commented on SPARK-17573:
-----------------------------------
These are byte array streams, which hold no resources that need closing. There is no memory issue either, because close() does not deallocate any memory here. Unless you have found a case where a real I/O stream is left unclosed, let's close this issue, please. While you are still getting accustomed to Spark, it may also be better to ask on dev@ before filing JIRAs.
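
To illustrate the point about byte array streams, here is a minimal sketch using only the standard java.io classes. The object and method names (ByteArrayStreamCloseDemo, writeThenClose, closeThenRead) are hypothetical and are not part of Spark; the sketch relies on the documented behavior that closing a ByteArrayOutputStream or ByteArrayInputStream has no effect.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical demo object, not Spark code.
object ByteArrayStreamCloseDemo {

  // Writes one Int, closes both streams, and only then copies the buffer out.
  def writeThenClose(value: Int): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val dos = new DataOutputStream(bos)
    dos.writeInt(value)
    dos.close()      // flushes, then calls bos.close(), which has no effect
    bos.toByteArray  // still usable: the buffer is ordinary heap memory
  }

  // Closes the input stream first, then reads from it anyway.
  def closeThenRead(bytes: Array[Byte]): Int = {
    val bis = new ByteArrayInputStream(bytes)
    bis.close()      // has no effect for ByteArrayInputStream
    new DataInputStream(bis).readInt()
  }

  def main(args: Array[String]): Unit = {
    val bytes = writeThenClose(42)
    println(bytes.length)          // an Int is written as 4 bytes
    println(closeThenRead(bytes))  // reads back the value written above
  }
}
```

Because both streams remain fully usable after close(), calling close() on them neither releases a resource nor frees the underlying array; the array becomes eligible for garbage collection once it is no longer referenced, whether or not close() was called.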
> Why don't we close the input/output Streams
> -------------------------------------------
>
> Key: SPARK-17573
> URL: https://issues.apache.org/jira/browse/SPARK-17573
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Jianfei Wang
> Labels: performance
>
> I find that there are many places in Spark where we don't close input/output streams explicitly, which could lead to "OOM" errors and other failures.
> For example:
> {code}
> private[sql] def bytesToRow(bytes: Array[Byte], schema: StructType): Row = {
>   val bis = new ByteArrayInputStream(bytes)
>   val dis = new DataInputStream(bis)
>   val num = SerDe.readInt(dis)
>   Row.fromSeq((0 until num).map { i =>
>     doConversion(SerDe.readObject(dis), schema.fields(i).dataType)
>   })
> }
>
> private[sql] def rowToRBytes(row: Row): Array[Byte] = {
>   val bos = new ByteArrayOutputStream()
>   val dos = new DataOutputStream(bos)
>   val cols = (0 until row.length).map(row(_).asInstanceOf[Object]).toArray
>   SerDe.writeObject(dos, cols)
>   bos.toByteArray()
> }
>
> override def deserialize(storageFormat: Array[Byte]): MaxValue = {
>   val in = new ByteArrayInputStream(storageFormat)
>   val stream = new DataInputStream(in)
>   val isValueSet = stream.readBoolean()
>   val value = stream.readInt()
>   new MaxValue(value, isValueSet)
> }
> {code}