Posted to user@crunch.apache.org by Sandy Ryza <sa...@cloudera.com> on 2013/06/07 10:54:11 UTC
output records counter
Hey All,
Does Crunch not use the normal MR channels for outputting stuff? I'm
noticing that when I look at a job's Counters, the output records are
always 0, even when I know data has been written.
thanks
-Sandy
Re: output records counter
Posted by Sandy Ryza <sa...@cloudera.com>.
Ok, makes sense, thanks Gabriel
On Fri, Jun 7, 2013 at 2:20 AM, Gabriel Reid <ga...@gmail.com> wrote:
> Hi Sandy,
>
> Crunch uses something similar to Hadoop's MultipleOutputFormat to allow
> writing multiple outputs in multiple formats from the same job. This leads
> to different counters being used for output, as there can be multiple
> outputs (and therefore multiple counters) from a single job.
>
> The main implementation class for this is org.apache.crunch.io.CrunchOutputs.
> The counters that contain the actual output counts are published in the
> counter group with the name of that class, under counter names of the form
> out<d>, where <d> is the index of the output for the job (i.e. starting from 0).
>
> - Gabriel
Re: output records counter
Posted by Gabriel Reid <ga...@gmail.com>.
Hi Sandy,
Crunch uses something similar to Hadoop's MultipleOutputFormat to allow
writing multiple outputs in multiple formats from the same job. This leads
to different counters being used for output, as there can be multiple
outputs (and therefore multiple counters) from a single job.
The main implementation class for this is org.apache.crunch.io.CrunchOutputs.
The counters that contain the actual output counts are published in the
counter group with the name of that class, under counter names of the form
out<d>, where <d> is the index of the output for the job (i.e. starting from 0).
- Gabriel
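For anyone finding this thread later, here is a minimal sketch of picking the per-output record counts out of a job's counters, following the naming described above. It is not taken from Crunch itself. The nested map is a stand-in for whatever you read from the job (with the Hadoop API you would typically fill it by iterating job.getCounters().getGroup(...)). The fully qualified group name is an assumption, the class and method names are hypothetical, and the values in main are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class CrunchOutputCounters {
    // Assumption: the counter group is the fully qualified name of the
    // CrunchOutputs class mentioned in the reply above.
    static final String GROUP = "org.apache.crunch.io.CrunchOutputs";

    // Takes counters as group name -> (counter name -> value) and returns
    // output index -> record count for every counter named out<d>.
    static Map<Integer, Long> recordsPerOutput(Map<String, Map<String, Long>> counters) {
        Map<Integer, Long> result = new TreeMap<>();
        Map<String, Long> group = counters.get(GROUP);
        if (group == null) {
            return result; // no Crunch output counters were published
        }
        for (Map.Entry<String, Long> entry : group.entrySet()) {
            String name = entry.getKey();
            if (name.matches("out\\d+")) {
                // Strip the "out" prefix to recover the output index.
                result.put(Integer.parseInt(name.substring(3)), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up counter values for a job with two outputs.
        Map<String, Long> crunchGroup = new LinkedHashMap<>();
        crunchGroup.put("out0", 1200L);
        crunchGroup.put("out1", 35L);

        Map<String, Map<String, Long>> counters = new LinkedHashMap<>();
        counters.put(GROUP, crunchGroup);

        System.out.println(recordsPerOutput(counters));
    }
}
```

Summing the values of the returned map gives the total number of records written across all outputs, which is roughly what the standard output records counter would otherwise have reported.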