Posted to issues@spark.apache.org by "François Garillot (JIRA)" <ji...@apache.org> on 2015/07/22 14:59:04 UTC

[jira] [Comment Edited] (SPARK-9236) Left Outer Join with empty JavaPairRDD returns empty RDD

    [ https://issues.apache.org/jira/browse/SPARK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636872#comment-14636872 ] 

François Garillot edited comment on SPARK-9236 at 7/22/15 12:58 PM:
--------------------------------------------------------------------

Note that of the two tests I added, one is a Java test, and it does not fail.

I'd appreciate a Spark branch with that test added to one of the Java suites (hence without its dependencies on AssertJ and the Lombok package), so that we can talk specifics.
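
For reference, here is a minimal sketch of what such a dependency-free version could look like: plain JUnit assertions in place of AssertJ, and explicit types in place of Lombok's {{val}}. The class and method names are illustrative, not those of an actual suite.

{code}
import java.util.Collections;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.junit.Assert;
import org.junit.Test;

import scala.Tuple2;

public class EmptyRDDJoinTest {

  @Test
  public void leftOuterJoinWithEmptyRDDTest() {
    SparkConf sparkConf = new SparkConf().setAppName("test").setMaster("local");
    try (JavaSparkContext sparkContext = new JavaSparkContext(sparkConf)) {
      JavaRDD<String> oneRdd = sparkContext.parallelize(Collections.singletonList("one"));
      JavaRDD<String> twoRdd = sparkContext.parallelize(Collections.singletonList("two"));
      JavaRDD<String> threeRdd = sparkContext.emptyRDD();

      JavaPairRDD<Integer, String> onePair =
          oneRdd.mapToPair(t -> new Tuple2<Integer, String>(1, t));
      JavaPairRDD<Integer, Iterable<String>> twoPair = twoRdd.groupBy(t -> 1);
      JavaPairRDD<Integer, Iterable<String>> threePair = threeRdd.groupBy(t -> 1);

      // Joining against a non-empty right side behaves as expected.
      Assert.assertFalse(onePair.leftOuterJoin(twoPair).collect().isEmpty());
      // A left outer join must keep every left-side record, so this
      // should hold even when the right side is empty.
      Assert.assertFalse(onePair.leftOuterJoin(threePair).collect().isEmpty());
    }
  }
}
{code}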



was (Author: huitseeker):
Note that of the two tests I added, one is a Java test, and it does not fail.

I'd appreciate a branch with that test added (without its dependencies on AssertJ and the Lombok package), so that we can talk specifics.


> Left Outer Join with empty JavaPairRDD returns empty RDD
> --------------------------------------------------------
>
>                 Key: SPARK-9236
>                 URL: https://issues.apache.org/jira/browse/SPARK-9236
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.3.1, 1.4.1
>            Reporter: Vitalii Slobodianyk
>
> When a *left outer join* is performed on a non-empty {{JavaPairRDD}} with a {{JavaPairRDD}} that was created with the {{emptyRDD()}} method, the resulting RDD is empty. In the following unit test, the last assert fails.
> {code}
> import static org.assertj.core.api.Assertions.assertThat;
> import java.util.Collections;
> import lombok.val;
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.junit.Test;
> import scala.Tuple2;
> public class SparkTest {
>   @Test
>   public void joinEmptyRDDTest() {
>     val sparkConf = new SparkConf().setAppName("test").setMaster("local");
>     try (val sparkContext = new JavaSparkContext(sparkConf)) {
>       val oneRdd = sparkContext.parallelize(Collections.singletonList("one"));
>       val twoRdd = sparkContext.parallelize(Collections.singletonList("two"));
>       val threeRdd = sparkContext.emptyRDD();
>       val onePair = oneRdd.mapToPair(t -> new Tuple2<Integer, String>(1, t));
>       val twoPair = twoRdd.groupBy(t -> 1);
>       val threePair = threeRdd.groupBy(t -> 1);
>       assertThat(onePair.leftOuterJoin(twoPair).collect()).isNotEmpty();
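>       // This assertion fails: the left outer join with the empty RDD
>       // unexpectedly returns an empty result.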
>       assertThat(onePair.leftOuterJoin(threePair).collect()).isNotEmpty();
>     }
>   }
> }
> {code}


