Posted to issues@spark.apache.org by "Fan Yunbo (JIRA)" <ji...@apache.org> on 2019/05/09 03:12:00 UTC

[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

    [ https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836044#comment-16836044 ] 

Fan Yunbo commented on SPARK-27663:
-----------------------------------

The incomplete task is task 17.0 in stage 98517.0:

!incomplte-task-1.png!

Its input size is 23.5 MB, and it finished in 1 s:

!incomplte-task-2.png!

However, the executor log shows that the input split size is 326992763 bytes:
{code:java}
Input split: hdfs://cqocdc/user/hive/warehouse/dw_user_useage_privilege_dt_yyyymmdd/month_id=201904/day_id=20190422/000017_0.snappy:0+326992763{code}
{code:java}
19/04/23 12:09:18 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 6835988
19/04/23 12:09:18 INFO executor.Executor: Running task 17.0 in stage 98517.0 (TID 6835988)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 173456
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173456_piece0 stored as bytes in memory (estimated size 13.4 KB, free 15.2 GB)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Reading broadcast variable 173456 took 4 ms
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173456 stored as values in memory (estimated size 30.3 KB, free 15.2 GB)
19/04/23 12:09:18 INFO rdd.HadoopRDD: Input split: hdfs://cqocdc/user/hive/warehouse/dw_user_useage_privilege_dt_yyyymmdd/month_id=201904/day_id=20190422/000017_0.snappy:0+326992763
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 173452
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173452_piece0 stored as bytes in memory (estimated size 30.8 KB, free 15.2 GB)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Reading broadcast variable 173452 took 3 ms
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173452 stored as values in memory (estimated size 365.1 KB, free 15.3 GB)
19/04/23 12:09:18 INFO codegen.CodeGenerator: Code generated in 6.949728 ms
19/04/23 12:09:18 INFO codegen.CodeGenerator: Code generated in 20.909883 ms
19/04/23 12:09:18 INFO output.FileOutputCommitter: Saved output of task 'attempt_20190423120856_98508_m_000047_0' to hdfs://cqocdc/tmp/.staging/hive_hive_2019-04-23_12-08-56_154_3110404551071203558-1370/-ext-10000/_temporary/0/task_20190423120856_98508_m_000047
19/04/23 12:09:18 INFO mapred.SparkHadoopMapRedUtil: attempt_20190423120856_98508_m_000047_0: Committed
19/04/23 12:09:18 INFO executor.Executor: Finished task 47.0 in stage 98508.0 (TID 6835975). 3217 bytes result sent to driver
19/04/23 12:09:19 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
19/04/23 12:09:19 INFO storage.DiskBlockManager: Shutdown hook called
19/04/23 12:09:19 INFO util.ShutdownHookManager: Shutdown hook called
19/04/23 12:09:19 INFO executor.Executor: Finished task 17.0 in stage 98517.0 (TID 6835988). 3188 bytes result sent to driver
{code}
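To put a number on the mismatch, here is a minimal sketch that parses the "Input split" line from the log above and compares the split length with the 23.5 MB input size shown in the UI. The names parse_split and fraction_read are hypothetical helpers for illustration, not part of Spark, and treating the UI's "MB" as 1024 * 1024 bytes is an assumption.

```python
import re

# Log line copied from the executor log above.
LOG_LINE = (
    "19/04/23 12:09:18 INFO rdd.HadoopRDD: Input split: "
    "hdfs://cqocdc/user/hive/warehouse/dw_user_useage_privilege_dt_yyyymmdd/"
    "month_id=201904/day_id=20190422/000017_0.snappy:0+326992763"
)

# A file split is logged as "<path>:<start>+<length>", with both numbers in bytes.
SPLIT_RE = re.compile(r"Input split: (?P<path>\S+):(?P<start>\d+)\+(?P<length>\d+)$")


def parse_split(line):
    """Return (path, start, length) from an 'Input split' log line, or None."""
    m = SPLIT_RE.search(line)
    if m is None:
        return None
    return m.group("path"), int(m.group("start")), int(m.group("length"))


def fraction_read(reported_mb, split_length_bytes):
    """Share of the split covered by the reported input size.

    Assumption: the UI's "MB" means 1024 * 1024 bytes.
    """
    return reported_mb * 1024 * 1024 / split_length_bytes


path, start, length = parse_split(LOG_LINE)
print(f"split length: {length / 1024 / 1024:.1f} MiB")
print(f"share of split reported as read: {fraction_read(23.5, length):.1%}")
```

Under these assumptions the task reported reading well under a tenth of its split, which is consistent with the 1 s duration.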
The file size and last modified time:

!image-2019-05-09-11-10-04-602.png!

The total input of this stage of the query is 14.9 GB:

!incomplte-task-0.png!

 

> Task accomplished incompletely but marked as success
> ----------------------------------------------------
>
>                 Key: SPARK-27663
>                 URL: https://issues.apache.org/jira/browse/SPARK-27663
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.1.0
>            Reporter: Fan Yunbo
>            Priority: Major
>         Attachments: image-2019-05-09-11-10-04-602.png, incomplte-task-0.png, incomplte-task-1.png, incomplte-task-2.png
>
>
> It happens when running SQL queries using Spark SQL.
> The task completed only partially but was marked as successful, since there were no exceptions and no failed or killed tasks.
> When I checked the query result, about 4000 records were missing.
> The history web UI shows that the task input size is 23.5 MB, but the executor log shows that the split size is 326992763 bytes, about 300 MB.
> This task finished in 1 second, while the others took about 15 seconds.
>  


