Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2015/02/19 01:52:11 UTC

[jira] [Commented] (PIG-4424) Different configurations for different stages of script

    [ https://issues.apache.org/jira/browse/PIG-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326829#comment-14326829 ] 

Rohini Palaniswamy commented on PIG-4424:
-----------------------------------------

This use case comes up often. I had a discussion with [~daijy] last year. The plan was to allow "set" commands anywhere in a Pig script and have each one apply to all the lines following it.

For example:
set pig.maxCombinedSplitSize 1073741824
A = LOAD 'input';
B = GROUP A by $0;
....
set pig.maxCombinedSplitSize 134217728
F = ORDER E by $1;

This should be easier for the user, but the code changes will be slightly more involved. Different parts of the plan will need different settings, and those settings need to be carried correctly from the logical plan through the physical and mapreduce/tez plans, and through the different optimizers, to the execution engine. Any other suggestions on making it simple for the user, and on the implementation?
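
To make the "apply to all the lines following them" semantics concrete, here is a rough sketch in Python rather than Pig's actual Java code. It is not how Pig parses scripts or builds plans. The names Statement and attach_settings are hypothetical, and the line-based parsing is a toy that assumes one statement per line. It only shows the idea of snapshotting the settings in effect at each statement, so that later plan stages could carry them along.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Statement:
    """Hypothetical holder: one script statement plus the settings in effect."""
    text: str
    settings: Dict[str, str]


def attach_settings(script: str) -> List[Statement]:
    """Walk the script top to bottom and snapshot settings per statement."""
    current: Dict[str, str] = {}
    statements: List[Statement] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Toy parsing: a line of the form "set <key> <value>" updates the
        # running settings; every other line is treated as a statement.
        parts = line.rstrip(";").split(None, 2)
        if len(parts) == 3 and parts[0].lower() == "set":
            current[parts[1]] = parts[2]
        else:
            # Copy the dict so a later "set" does not change earlier statements.
            statements.append(Statement(line, dict(current)))
    return statements


script = """
set pig.maxCombinedSplitSize 1073741824
A = LOAD 'input';
B = GROUP A by $0;
set pig.maxCombinedSplitSize 134217728
F = ORDER E by $1;
"""

stmts = attach_settings(script)
for s in stmts:
    print(s.settings["pig.maxCombinedSplitSize"], "<-", s.text)
```

With the example script above, the LOAD and GROUP statements would carry the first value and the ORDER statement the second. The harder part described above, keeping these per-statement settings intact when operators are merged or rewritten by the optimizers, is not covered by this sketch.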

> Different configurations for different stages of script
> -------------------------------------------------------
>
>                 Key: PIG-4424
>                 URL: https://issues.apache.org/jira/browse/PIG-4424
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Rohini Palaniswamy
>
> From a user:
> I have a pig script which runs multiple map reduce jobs. (Ex: 'group by' and 'order by' which will be executed as 2 different map reduce jobs)
> Is there a way to specify different map reduce configuration options for different stages instead of specifying them for the whole script (Ex: different values for mapred.min.split.size for different stages)?


