Posted to dev@crunch.apache.org by "Josh Wills (JIRA)" <ji...@apache.org> on 2014/10/28 12:45:34 UTC

[jira] [Resolved] (CRUNCH-479) Writing to target with WriteMode.APPEND merges values into PCollection

     [ https://issues.apache.org/jira/browse/CRUNCH-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills resolved CRUNCH-479.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.12.0

Pushed to master, with the changes to your test code that you found objectionable reverted, despite their obvious validity.

> Writing to target with WriteMode.APPEND merges values into PCollection
> ----------------------------------------------------------------------
>
>                 Key: CRUNCH-479
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-479
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>             Fix For: 0.12.0
>
>         Attachments: CRUNCH-479.patch, CRUNCH-479b.patch
>
>
> This was mentioned as part of CDK-617[1].  When a PCollection that contains a set of values is written to a target with WriteMode.APPEND and is then materialized, iterating over it yields not only the new values that were appended but also the values that already existed in the target.  This is surprising, as most users would expect the collection to contain only its original values.  One use case this affects is a pipeline that wants to process only the new values instead of dealing with all of the existing data.
> [1] - https://issues.cloudera.org/browse/CDK-671
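
The reported behavior can be illustrated with a toy in-memory model. This is a sketch only and does not use Crunch: the class AppendTargetModel and its methods are hypothetical names invented for illustration, and the idea that materialization reads back the full contents of the target is an assumption about the cause, not something stated in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, not Crunch code: a "target" is just a list of stored records.
class AppendTargetModel {
    // Records already present in the target before the pipeline runs.
    private final List<String> stored = new ArrayList<>();

    AppendTargetModel(List<String> existing) {
        stored.addAll(existing);
    }

    // Models WriteMode.APPEND: new records are added alongside the existing ones.
    void append(List<String> records) {
        stored.addAll(records);
    }

    // Models the reported behavior (assumed cause): materializing by reading
    // back everything in the target, not just what this collection wrote.
    List<String> readBackEverything() {
        return new ArrayList<>(stored);
    }

    public static void main(String[] args) {
        AppendTargetModel target = new AppendTargetModel(Arrays.asList("old-1", "old-2"));
        List<String> collection = Arrays.asList("new-1", "new-2");

        target.append(collection);

        // What the reporter observed: existing and new values together.
        System.out.println("reported: " + target.readBackEverything());
        // What the reporter expected: only the collection's own values.
        System.out.println("expected: " + collection);
    }
}
```

In this model the "reported" line contains the old and the new records, while the "expected" line contains only the records that belonged to the collection before it was written.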



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)