Posted to dev@pig.apache.org by "Mohit Sabharwal (JIRA)" <ji...@apache.org> on 2015/05/13 04:00:00 UTC

[jira] [Updated] (PIG-4549) Set CROSS operation parallelism for Spark engine

     [ https://issues.apache.org/jira/browse/PIG-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohit Sabharwal updated PIG-4549:
---------------------------------
    Attachment: PIG-4549.patch

> Set CROSS operation parallelism for Spark engine
> ------------------------------------------------
>
>                 Key: PIG-4549
>                 URL: https://issues.apache.org/jira/browse/PIG-4549
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4549.patch
>
>
> The Spark engine should set the parallelism that the GFCross UDF uses for the CROSS operation.
> If it is not set, GFCross throws an exception:
> {code}
>                 String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey);
>                 if (s == null) {
>                     throw new IOException("Unable to get parallelism hint from job conf");
>                 }
> {code}
> Estimating parallelism for the Spark engine is still a TBD item. Until that is done, we should use the default parallelism value defined in GFCross so that CROSS works.
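
A minimal sketch of the interim approach described above, not the contents of the attached patch. It assumes a plain Map as a stand-in for the job configuration; the key string, the default value of 96, and the method names are placeholders chosen for illustration and may not match PigImplConstants or GFCross.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class CrossParallelismSketch {
    // Assumed stand-in for PigImplConstants.PIG_CROSS_PARALLELISM; the real value may differ.
    static final String PIG_CROSS_PARALLELISM = "pig.cross.parallelism";
    // Placeholder for the default parallelism defined in GFCross; the real value may differ.
    static final int DEFAULT_PARALLELISM = 96;

    // Engine side (hypothetical helper): store the default hint for one CROSS operation.
    static void setDefaultParallelism(Map<String, String> cfg, String crossKey) {
        cfg.put(PIG_CROSS_PARALLELISM + "." + crossKey, Integer.toString(DEFAULT_PARALLELISM));
    }

    // UDF side: mirrors the lookup quoted in the issue, which fails when no hint is set.
    static int readParallelism(Map<String, String> cfg, String crossKey) throws IOException {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        if (s == null) {
            throw new IOException("Unable to get parallelism hint from job conf");
        }
        return Integer.parseInt(s);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> cfg = new HashMap<>();
        setDefaultParallelism(cfg, "1");
        // The lookup now succeeds instead of throwing.
        System.out.println(readParallelism(cfg, "1"));
    }
}
```

Writing the hint per cross key on the engine side leaves the lookup in the UDF unchanged, so a real estimate can later replace the default without touching GFCross.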


