Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2017/06/01 05:38:04 UTC

[jira] [Resolved] (SPARK-20244) Incorrect input size in UI with pyspark

     [ https://issues.apache.org/jira/browse/SPARK-20244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-20244.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

> Incorrect input size in UI with pyspark
> ---------------------------------------
>
>                 Key: SPARK-20244
>                 URL: https://issues.apache.org/jira/browse/SPARK-20244
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Artur Sukhenko
>            Assignee: Saisai Shao
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: pyspark_incorrect_inputsize.png, sparkshell_correct_inputsize.png
>
>
> In the Spark UI (Details for Stage), the Input Size is 64.0 KB when running in PySparkShell, while spark-shell reports the correct 252.0 MB for the same file.
> It is also incorrect in the Tasks table:
> 64.0 KB / 132120575 in pyspark
> 252.0 MB / 132120575 in spark-shell
> I will attach screenshots.
> Reproduce steps:
> Generate a big file (press Ctrl+C after 5-6 seconds), copy it to HDFS, and start pyspark:
> {code}
> $ yes > /tmp/yes.txt
> $ hadoop fs -copyFromLocal /tmp/yes.txt /tmp/
> $ ./bin/pyspark
> {code}
> {code}
> Python 2.7.5 (default, Nov  6 2016, 00:28:07) 
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 2.1.0
>       /_/
> Using Python version 2.7.5 (default, Nov  6 2016 00:28:07)
> SparkSession available as 'spark'.{code}
> {code}
> >>> a = sc.textFile("/tmp/yes.txt")
> >>> a.count()
> {code}
> Open the Spark UI and check Stage 0.
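As a sanity check on which of the two figures is plausible (not part of the original report): the second number in the Tasks table, 132120575, is the record count, and each record produced by `yes` is the two-byte line "y\n". Multiplying out reproduces spark-shell's 252.0 MB, so the pyspark figure of 64.0 KB is the anomalous one:

```python
# Illustrative check: derive the expected Input Size from the record count
# shown in the Tasks table. Assumes every record is the 2-byte line "y\n"
# emitted by `yes`, as in the reproduce steps above.
records = 132120575
expected_bytes = records * 2                # 264,241,150 bytes
expected_mib = expected_bytes / 1024 / 1024
print("%.1f MB" % expected_mib)             # prints "252.0 MB", matching spark-shell
```

This confirms the spark-shell metric is consistent with the data actually read, while the pyspark metric is under-reported by roughly four orders of magnitude.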



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org