Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/08/31 10:48:20 UTC

[jira] [Created] (SPARK-17331) Avoid allocating 0-length arrays

Sean Owen created SPARK-17331:
---------------------------------

Summary: Avoid allocating 0-length arrays
Key: SPARK-17331
URL: https://issues.apache.org/jira/browse/SPARK-17331
Project: Spark
Issue Type: Improvement
Components: MLlib, Spark Core
Affects Versions: 2.0.0
Reporter: Sean Owen
Assignee: Sean Owen
Priority: Trivial

I've noticed a number of places in the code that allocate 0-length arrays. Since all 0-length arrays of a type are equivalent, it's often possible to avoid these allocations.

The place where it most likely matters is {{UTF8String}}, which does this in several places. There the whole result can even be replaced by {{UTF8String.EMPTY_UTF8}}, saving even more allocations.

It _could_ be worth refactoring other occurrences, mostly of "new byte[0]", to simply use a reference to one fixed static instance of it. But I avoided that in the Java code on the grounds that it's a little clunky and can be added if it proves to be a hotspot.
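For illustration, the "one fixed static instance" idea might look roughly like the sketch below. The class {{EmptyArrays}}, the constant {{EMPTY_BYTES}} and the helper {{copyRange}} are hypothetical names made up for this example, not existing Spark code. Sharing is safe only because a 0-length array has no elements to mutate; callers that depend on array identity would be affected.

```java
import java.util.Arrays;

public final class EmptyArrays {
    // Hypothetical shared constant: a zero-length array has no elements,
    // so one instance can be handed out to every caller.
    public static final byte[] EMPTY_BYTES = new byte[0];

    private EmptyArrays() {}

    // Returns the shared instance for an empty range instead of allocating
    // a new byte[0]; non-empty ranges are copied as usual.
    public static byte[] copyRange(byte[] src, int from, int to) {
        if (from == to) {
            return EMPTY_BYTES;
        }
        return Arrays.copyOfRange(src, from, to);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        byte[] a = copyRange(data, 1, 1);
        byte[] b = copyRange(data, 2, 2);
        System.out.println(a == b);                       // same shared instance
        System.out.println(copyRange(data, 0, 2).length); // ordinary copy
    }
}
```

The extra constant and branch are the "clunky" part mentioned above, so this is probably only worth it where profiling shows the allocation is hot.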

The same applies in Scala, where {{Array[T]()}} can be replaced by {{Array.empty}}. The latter still allocates a 0-length array, but the former actually allocates *two* empty arrays because of varargs. {{Array.empty}} is simpler and widely used in the code, so this seems worth touching up to save some garbage.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)