Posted to user@crunch.apache.org by Ron Hashimshony <ro...@myheritage.com> on 2015/05/18 15:32:23 UTC

More plan details

Hi,
It seems that the plan DOT file is missing the planned number of reducers,
which is computed from the size of the input files.
Is there any way to add both the estimated file sizes that Crunch
computes for each job and the number of reducers it is going to run?
Thanks,
Ron.
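
As a rough illustration of the size-based estimate described above, here is a
minimal Python sketch. The one-gigabyte-per-reducer target, the cap of 500
reducers, and the function name are assumptions made for this example, not
values taken from Crunch's planner.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=500):
    """Estimate a reducer count from total input size (hypothetical heuristic)."""
    if input_bytes < 0:
        raise ValueError("input_bytes must be non-negative")
    # Integer division counts full chunks; adding one guarantees at least one reducer.
    estimate = 1 + input_bytes // bytes_per_reducer
    # Never exceed the assumed cap.
    return min(estimate, max_reducers)
```

Under these assumptions, estimate_reducers(2_500_000_000) returns 3. These are
the two numbers per job (estimated size and reducer count) that the question
asks to see in the plan.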

Re: More plan details

Posted by Josh Wills <jw...@cloudera.com>.
+1, that would be good to have. The Spotify folks also requested an
enhancement to print out a recommended number of reducers for the *next*
run of a job based on the performance of the last run, which I think should
be do-able w/o too much trouble.

J

On Mon, May 18, 2015 at 6:42 AM, Micah Whitacre <mk...@gmail.com>
wrote:

> It doesn't look like the current DOT file writer includes functionality to
> write out that information.
>
> Feel free to log an enhancement to get that functionality.
> https://issues.apache.org/jira/browse/CRUNCH
>
> On Mon, May 18, 2015 at 8:32 AM, Ron Hashimshony <
> ron.hashimshony@myheritage.com> wrote:
>
>> Hi,
>> It seems that the plan DOT file is missing the planned number of reducers,
>> which is computed from the size of the input files.
>> Is there any way to add both the estimated file sizes that Crunch
>> computes for each job and the number of reducers it is going to run?
>> Thanks,
>> Ron.
>>
>
>


-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: More plan details

Posted by Ron Hashimshony <ro...@myheritage.com>.
I opened a JIRA issue for that and attached my proposed patch, which we are
already using and which helps us a lot.
https://issues.apache.org/jira/browse/CRUNCH-519
Ron

-- 

*Ron Hashimshony*
Back-End developer

Re: More plan details

Posted by Micah Whitacre <mk...@gmail.com>.
It doesn't look like the current DOT file writer includes functionality to
write out that information.

Feel free to log an enhancement to get that functionality.
https://issues.apache.org/jira/browse/CRUNCH
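
For illustration only, an annotated plan node of the kind requested in this
thread might be emitted along the following lines. The label layout and the
function and parameter names are hypothetical; this is not Crunch's DOT writer
or the patch attached to the JIRA issue.

```python
def dot_node(node_id, name, est_bytes, reducers):
    """Render one DOT node whose label carries size and reducer hints (hypothetical layout)."""
    # "\\n" emits a literal backslash followed by "n", which Graphviz treats as a
    # line break inside a label.
    label = f"{name}\\nest. size: {est_bytes} bytes\\nreducers: {reducers}"
    # Names containing double quotes would need escaping; omitted in this sketch.
    return f'  "{node_id}" [label="{label}"];'
```

Each such line would sit inside a digraph { ... } block alongside the edges
that the plan already contains.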
