Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/06/18 02:08:01 UTC

[jira] [Created] (SPARK-8419) Statistics.colStats could avoid an extra count()

Joseph K. Bradley created SPARK-8419:
----------------------------------------

             Summary: Statistics.colStats could avoid an extra count()
                 Key: SPARK-8419
                 URL: https://issues.apache.org/jira/browse/SPARK-8419
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
            Reporter: Joseph K. Bradley
            Priority: Trivial


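Statistics.colStats goes through RowMatrix to compute the column statistics.  However, RowMatrix.computeColumnSummaryStatistics also calls count(), which costs an extra pass over the data on top of the aggregation that computes the statistics.  Computing the statistics directly in Statistics.colStats, without going through RowMatrix, would avoid that extra pass.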
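
To illustrate why the separate count() is redundant, here is a minimal, Spark-free sketch in Python. It assumes the statistics are computed by a mergeable one-pass summarizer; the class ColumnSummarizer and the function col_stats are hypothetical names invented for this example and are not Spark's API or implementation. The point is only that the row count falls out of the same aggregation that produces the mean and variance.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import List, Sequence


@dataclass
class ColumnSummarizer:
    """Hypothetical one-pass summarizer (illustration only, not Spark code)."""
    count: int = 0
    mean: List[float] = field(default_factory=list)
    m2: List[float] = field(default_factory=list)  # per-column sum of squared deviations

    def add(self, row: Sequence[float]) -> "ColumnSummarizer":
        # Size the per-column state lazily from the first row seen.
        if not self.mean:
            self.mean = [0.0] * len(row)
            self.m2 = [0.0] * len(row)
        self.count += 1  # the row count is tracked in the same pass
        for j, x in enumerate(row):
            delta = x - self.mean[j]
            self.mean[j] += delta / self.count
            self.m2[j] += delta * (x - self.mean[j])
        return self

    def merge(self, other: "ColumnSummarizer") -> "ColumnSummarizer":
        # Combine two partial summaries (pairwise mean/variance update).
        if other.count == 0:
            return self
        if self.count == 0:
            return other
        total = self.count + other.count
        for j in range(len(self.mean)):
            delta = other.mean[j] - self.mean[j]
            self.m2[j] += other.m2[j] + delta * delta * self.count * other.count / total
            self.mean[j] += delta * other.count / total
        self.count = total
        return self

    def variance(self) -> List[float]:
        # Unbiased sample variance; defined as 0.0 when fewer than two rows.
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m / (self.count - 1) for m in self.m2]


def col_stats(partitions: Sequence[Sequence[Sequence[float]]]) -> ColumnSummarizer:
    # Each inner list stands in for one partition of the data set.
    partials = [
        reduce(lambda s, row: s.add(row), part, ColumnSummarizer())
        for part in partitions
    ]
    return reduce(lambda a, b: a.merge(b), partials, ColumnSummarizer())
```

Calling col_stats on a list of partitions returns a summary whose count field is already populated, so a caller that needs the number of rows can read it from the summary instead of scanning the data again.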



