Posted to issues@spark.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2016/07/07 17:36:11 UTC

[jira] [Created] (SPARK-16421) Improve output from ML examples

Bryan Cutler created SPARK-16421:
------------------------------------

             Summary: Improve output from ML examples
                 Key: SPARK-16421
                 URL: https://issues.apache.org/jira/browse/SPARK-16421
             Project: Spark
          Issue Type: Improvement
          Components: Examples, ML
            Reporter: Bryan Cutler
            Priority: Trivial


In many ML examples, the output is not useful.  Sometimes {{show()}} is called but the column contents are truncated, so any pertinent results are hidden.  For example, here is the output of max_abs_scaler_example:

{noformat}
$ bin/spark-submit examples/src/main/python/ml/max_abs_scaler_example.py 
+-----+--------------------+--------------------+
|label|            features|      scaledFeatures|
+-----+--------------------+--------------------+
|  0.0|(692,[127,128,129...|(692,[127,128,129...|
|  1.0|(692,[158,159,160...|(692,[158,159,160...|
|  1.0|(692,[124,125,126...|(692,[124,125,126...|
{noformat}

Other times only a few raw values are printed when {{show()}} would be more appropriate.  Here is the output from binarizer_example:

{noformat}
$ bin/spark-submit examples/src/main/python/ml/binarizer_example.py 
0.0                                                                             
1.0
0.0
{noformat}

It would be much more useful to just {{show()}} the transformed DataFrame:

{noformat}
+-----+-------+-----------------+
|label|feature|binarized_feature|
+-----+-------+-----------------+
|    0|    0.1|              0.0|
|    1|    0.8|              1.0|
|    2|    0.2|              0.0|
+-----+-------+-----------------+
{noformat}


