Posted to dev@pig.apache.org by "Ángel Álvarez (JIRA)" <ji...@apache.org> on 2015/06/22 12:51:04 UTC

[jira] [Commented] (PIG-4443) Write inputsplits in Tez to disk if the size is huge and option to compress pig input splits

    [ https://issues.apache.org/jira/browse/PIG-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595756#comment-14595756 ] 

Ángel Álvarez commented on PIG-4443:
------------------------------------

I have a Pig script that loads data from Hive using org.apache.hive.hcatalog.pig.HCatLoader. The script works fine in Pig 0.14, but in Pig 0.15 I'm getting this error:

Requested data length 160452289 is longer than maximum configured RPC length 67108864

In Pig 0.14 I ran into this issue too, but I could always work around it by reducing the number of splits in the Hive tables created by Sqoop (using no more than 60 splits). Is there any special configuration needed?
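(For anyone else hitting this: the 67108864 ceiling in the error message matches Hadoop's default for ipc.maximum.data.length, the maximum size of an IPC message the server will accept. A possible stopgap, sketched below, is raising that limit in core-site.xml on the cluster; the 128 MB value is only an illustrative assumption, and the proper fix is the split serialization change tracked in this JIRA.)

```xml
<!-- core-site.xml fragment: raise the Hadoop RPC message size cap.
     Value below (128 MB) is an example assumption, not a recommendation;
     it must exceed the serialized split payload reported in the error. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```

Note this only hides the symptom: very large split payloads still have to be built and shipped, so reducing partition/split counts (or picking up the fix from this issue) is preferable.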

> Write inputsplits in Tez to disk if the size is huge and option to compress pig input splits
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-4443
>                 URL: https://issues.apache.org/jira/browse/PIG-4443
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>         Attachments: PIG-4443-1.patch, PIG-4443-Fix-TEZ-2192-2.patch, PIG-4443-Fix-TEZ-2192.patch
>
>
> Pig sets the input split information in user payload and when running against a table with 10s of 1000s of partitions, DAG submission fails with
> java.io.IOException: Requested data length 305844060 is longer than maximum
> configured RPC length 67108864



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)