Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:37:31 UTC

[jira] [Resolved] (SPARK-15673) Indefinite hanging issue with combination of cache, sort and unionAll

     [ https://issues.apache.org/jira/browse/SPARK-15673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-15673.
----------------------------------
    Resolution: Incomplete

> Indefinite hanging issue with combination of cache, sort and unionAll
> ---------------------------------------------------------------------
>
>                 Key: SPARK-15673
>                 URL: https://issues.apache.org/jira/browse/SPARK-15673
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 1.6.1
>         Environment: I am running the test code on both a Hortonworks sandbox and AWS EMR / EC2.
>            Reporter: Jamie Hutton
>            Priority: Major
>              Labels: bulk-closed
>         Attachments: HangingTest.scala
>
>
> I have raised a couple of bugs to do with Spark hanging. One of the previous ones (https://issues.apache.org/jira/browse/SPARK-15000) was resolved in 1.6.1, but the following example is still an issue in 1.6.1.
> The code below is a self-contained test case that generates some data and leads to the hanging behaviour when run with spark-submit in 1.6.0 or 1.6.1. Strangely, the code also hangs in spark-shell in 1.6.0 but doesn't seem to in 1.6.1 (hence the main method test below). I run this using:
> spark-submit --class HangingTest --master local <path-to-compiled-jar>
> The hanging doesn't occur if you remove either of the first two cache steps OR the sort steps (I have added comments to this effect below). We have hit quite a few indefinite hanging issues with Spark (another is this: https://issues.apache.org/jira/browse/SPARK-15002). There seems to be a rather fundamental issue with chaining steps together and using the cache call.
> The bug seems to be confined to reading data out of Hadoop: if we put the data onto a local drive (using file://), the hanging stops happening.
> This may seem a rather convoluted test case, but that is mainly because I have stripped the code back to the simplest possible code that causes the issue.
> *CODE ATTACHED*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
