Posted to dev@pig.apache.org by "Laukik Chitnis (JIRA)" <ji...@apache.org> on 2009/02/06 00:35:59 UTC
[jira] Created: (PIG-657) splitsize is ignored in PigInputFormat
splitsize is ignored in PigInputFormat
--------------------------------------
Key: PIG-657
URL: https://issues.apache.org/jira/browse/PIG-657
Project: Pig
Issue Type: Bug
Reporter: Laukik Chitnis
The usual way to control the number of mappers in Hadoop is to specify the mapred.min.split.size parameter in the job conf, e.g. mapred.min.split.size=1073741824,mapred.map.tasks=10
However, even if this parameter is specified, Pig determines the number of mappers solely from the number of blocks in the file. This is because PigInputFormat does not use the parameter.
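To make the effect concrete, here is a rough sketch of the arithmetic. The 10 GiB file size and the 64 MiB block size are assumptions picked for illustration, and the class and method names are hypothetical. The count is a plain ceiling division, so it ignores any slack Hadoop may allow on the last split.

```java
final class SplitCountSketch {
    // Number of splits when a file of fileSize bytes is cut into pieces of
    // splitSize bytes (ceiling division; the last piece may be smaller).
    static long countSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileSize = 10L * 1024 * 1024 * 1024;  // assumed 10 GiB input file
        long blockSize = 64L * 1024 * 1024;        // assumed 64 MiB block size
        long minSplitSize = 1073741824L;           // value from the report (1 GiB)

        // One mapper per block, which is the behaviour described above.
        System.out.println(countSplits(fileSize, blockSize));     // prints 160
        // One mapper per split if the minimum split size were honoured.
        System.out.println(countSplits(fileSize, minSplitSize));  // prints 10
    }
}
```

Under these assumptions the job would run 160 mappers instead of the 10 that the setting asks for.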
The parameter can be read from the job conf object, so one way to fix this would be to pass a handle to the job conf object to PigInputFormat or to the custom slicer.
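A minimal sketch of what that could look like, not the actual Pig change. To keep it runnable without Hadoop on the classpath, java.util.Properties stands in for the job conf object, and the class and method names are hypothetical. The sizing rule max(minSize, min(goalSize, blockSize)) is how I understand Hadoop's own FileInputFormat to pick a split size; treat that as an assumption too.

```java
import java.util.Properties;

final class SplitSizeFromConfSketch {
    static final String MIN_SPLIT_SIZE_KEY = "mapred.min.split.size";
    static final String MAP_TASKS_KEY = "mapred.map.tasks";

    // Read a long-valued setting, falling back to a default when it is absent.
    static long getLong(Properties conf, String key, long defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    // Pick a split size from the conf: never below the configured minimum,
    // otherwise the smaller of the per-task goal size and the block size.
    static long computeSplitSize(Properties conf, long totalSize, long blockSize) {
        long minSize = Math.max(1L, getLong(conf, MIN_SPLIT_SIZE_KEY, 1L));
        long numTasks = Math.max(1L, getLong(conf, MAP_TASKS_KEY, 1L));
        long goalSize = totalSize / numTasks;
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();  // stand-in for the job conf
        conf.setProperty(MIN_SPLIT_SIZE_KEY, "1073741824");
        conf.setProperty(MAP_TASKS_KEY, "10");

        long totalSize = 10L << 30;  // assumed 10 GiB of input
        long blockSize = 64L << 20;  // assumed 64 MiB block size

        System.out.println(computeSplitSize(conf, totalSize, blockSize));  // prints 1073741824
    }
}
```

With an empty conf the same call falls back to the block size, which matches the current one-mapper-per-block behaviour.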
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-657) splitsize is ignored in PigInputFormat
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-657.
--------------------------------
Resolution: Fixed
This is resolved in Pig 0.7.