Posted to issues@spark.apache.org by "Tao Li (JIRA)" <ji...@apache.org> on 2015/12/07 13:44:11 UTC

[jira] [Created] (SPARK-12179) Spark SQL gets different results with the same code

Tao Li created SPARK-12179:
------------------------------

             Summary: Spark SQL gets different results with the same code
                 Key: SPARK-12179
                 URL: https://issues.apache.org/jira/browse/SPARK-12179
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, SQL
    Affects Versions: 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1, 1.4.2, 1.5.0, 1.5.1, 1.5.2, 1.5.3
         Environment: hadoop version: 2.5.0-cdh5.3.2
spark version: 1.5.3
run mode: yarn-client
            Reporter: Tao Li


I run the SQL in yarn-client mode, but get a different result each time.

As you can see in the example below, I get different shuffle write counts for the same shuffle read across two runs of the same job.

Some of my Spark applications run well, but some always hit this problem. I have seen it on Spark 1.3, 1.4 and 1.5.

Can you give me some suggestions about possible causes or how to track down the problem?

1. First Run
Details for Stage 9 (Attempt 0)
Total Time Across All Tasks: 5.8 min
Shuffle Read: 24.4 MB / 205399 records
Shuffle Write: 6.8 MB / 54934 records

2. Second Run
Details for Stage 9 (Attempt 0)
Total Time Across All Tasks: 5.6 min
Shuffle Read: 24.4 MB / 205399 records
Shuffle Write: 6.8 MB / 54905 records
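
For reference, a minimal sketch of the kind of driver program involved. The actual SQL is not included in this ticket, so the object name, application name, table and query below are placeholders; the application is assumed to be submitted with --master yarn-client on Spark 1.5.x.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object Spark12179Sketch {
      def main(args: Array[String]): Unit = {
        // The yarn-client master is normally supplied by spark-submit (--master yarn-client).
        val sc = new SparkContext(new SparkConf().setAppName("SPARK-12179-sketch"))
        val sqlContext = new HiveContext(sc)

        // Placeholder aggregation query; the real statement from this report is not available.
        val df = sqlContext.sql("SELECT key, count(*) AS cnt FROM some_table GROUP BY key")

        // Re-running the same action is expected to produce the same result and the same
        // shuffle write record count for the same shuffle read, which is what this ticket
        // reports is not happening across runs.
        println(df.count())
        println(df.count())

        sc.stop()
      }
    }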



