Posted to dev@crunch.apache.org by "Josh Wills (JIRA)" <ji...@apache.org> on 2013/09/29 21:03:24 UTC
[jira] [Resolved] (CRUNCH-264) Writing to TextFileTarget map side does not show up in plan
[ https://issues.apache.org/jira/browse/CRUNCH-264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills resolved CRUNCH-264.
-------------------------------
Resolution: Fixed
Fix Version/s: 0.8.0
> Writing to TextFileTarget map side does not show up in plan
> -----------------------------------------------------------
>
> Key: CRUNCH-264
> URL: https://issues.apache.org/jira/browse/CRUNCH-264
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Reporter: Micah Whitacre
> Assignee: Josh Wills
> Priority: Minor
> Fix For: 0.8.0
>
> Attachments: CRUNCH-264b.patch, CRUNCH-264.patch, CRUNCH-264.png, CRUNCH-264.txt
>
>
> Creating a pipeline that writes data to a text file (map side) and then to Avro (reduce side) causes the text-side write, and any processing that happens on that branch, to be left out of the plan.
> Specifically, the name of the pipeline is:
> Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]+GBK+ungroup+PTables.values+Avro(/some/test/path)
> However, the generated DOT is:
> digraph G {
>   "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
>   "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
>   subgraph "cluster-job1" {
>     subgraph "cluster-job1-map" {
>       label = Map; color = blue;
>       "S3@2118275672@1822883541" [label="S3" shape=box];
>       "S0@875319338@1822883541" [label="S0" shape=box];
>     }
>     subgraph "cluster-job1-reduce" {
>       label = Reduce; color = red;
>       "GBK@221482301@1822883541" [label="GBK" shape=box];
>       "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
>       "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
>     }
>   }
>   "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
>   "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
>   "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
>   "Text(/simple.txt)" -> "S0@875319338@1822883541";
>   "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
>   "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
> }
> The DOT output is missing "S1" and the write to '/some/test/first'.
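
The mismatch described in the report can be checked mechanically by comparing the stage names in the pipeline name with the node labels in the DOT output. The following is a minimal sketch in Python, not part of Crunch: the helper names (stages_in_name, labels_in_dot, missing_stages) are hypothetical, and the tokenizing rule for the pipeline name is an assumption inferred only from the single example in this report.

```python
import re

# Pipeline name and DOT node declarations copied from the report.
PIPELINE_NAME = (
    "Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]"
    "+GBK+ungroup+PTables.values+Avro(/some/test/path)"
)

DOT = """digraph G {
"Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
"Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
subgraph "cluster-job1" {
subgraph "cluster-job1-map" {
label = Map; color = blue;
"S3@2118275672@1822883541" [label="S3" shape=box];
"S0@875319338@1822883541" [label="S0" shape=box];
}
subgraph "cluster-job1-reduce" {
label = Reduce; color = red;
"GBK@221482301@1822883541" [label="GBK" shape=box];
"PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
"ungroup@1830236047@1822883541" [label="ungroup" shape=box];
}
}
}
"""


def stages_in_name(name):
    """Split a pipeline name into stage tokens.

    Assumption: a token is a run of word characters or dots, optionally
    followed by one parenthesized argument such as "(/simple.txt)".
    The separators '+', '/', '[' and ']' are skipped.
    """
    return re.findall(r"[\w.]+(?:\([^)]*\))?", name)


def labels_in_dot(dot):
    """Collect the quoted label="..." values of the DOT nodes."""
    return set(re.findall(r'label="([^"]*)"', dot))


def missing_stages(name, dot):
    """Return the stages named in the pipeline that have no DOT node."""
    labels = labels_in_dot(dot)
    return [stage for stage in stages_in_name(name) if stage not in labels]


if __name__ == "__main__":
    print(missing_stages(PIPELINE_NAME, DOT))
```

Run against the name and DOT from this report, the script prints ['S1', 'Text(/some/test/first)'], which are the two items the reporter identifies as missing. A check like this only compares labels; it does not verify the edges between nodes.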
--
This message was sent by Atlassian JIRA
(v6.1#6144)