Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2022/01/12 03:51:02 UTC

[jira] [Updated] (BEAM-6719) Allow multiple Joins in the same pipeline

     [ https://issues.apache.org/jira/browse/BEAM-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-6719:
----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the pull request resolve the issue? If so, could you please mark it resolved? This will help the project have a clear view of its open issues.

> Allow multiple Joins in the same pipeline
> -----------------------------------------
>
>                 Key: BEAM-6719
>                 URL: https://issues.apache.org/jira/browse/BEAM-6719
>             Project: Beam
>          Issue Type: Improvement
>          Components: extensions-java-join-library
>            Reporter: Daniel Mescheder
>            Priority: P3
>              Labels: Clarified
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, a pipeline cannot contain multiple joins unless each one is wrapped in its own PTransform, because the joins would otherwise generate clashing names.
> Consider the following test case:
> {code:java}
> @Test
> public void testMultipleJoinsInSamePipeline() {
>   leftListOfKv.add(KV.of("Key2", 4L));
>   PCollection<KV<String, Long>> leftCollection = p.apply("CreateLeft", Create.of(leftListOfKv));
>   rightListOfKv.add(KV.of("Key2", "bar"));
>   PCollection<KV<String, String>> rightCollection = p.apply("CreateRight", Create.of(rightListOfKv));
>   expectedResult.add(KV.of("Key2", KV.of(4L, "bar")));
>   PCollection<KV<String, KV<Long, String>>> output1 = Join.innerJoin(leftCollection, rightCollection);
>   PCollection<KV<String, KV<Long, String>>> output2 = Join.innerJoin(leftCollection, rightCollection);
>   PAssert.that(output1).containsInAnyOrder(expectedResult);
>   PAssert.that(output2).containsInAnyOrder(expectedResult);
>   p.run();
> }
> {code}
> This fails because the two joins produce clashing names in the pipeline, and the join library currently offers no way to give the joins different names.
> I therefore find myself routinely wrapping joins in new PTransforms, which leads me to believe that this should be part of the library itself.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)