Posted to user@hive.apache.org by Raihan Jamal <ja...@gmail.com> on 2012/10/03 20:05:08 UTC

org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

I am running a Hive query and I am getting the exception below:

Job Submission failed with exception
'org.apache.hadoop.ipc.RemoteException(java.io.IOException:
java.io.IOException: The number of tasks for this job 2072020 exceeds the
configured limit 200000

I am not sure what this error means. Can anyone help me out here?



*Raihan Jamal*

Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Raihan Jamal <ja...@gmail.com>.
What about if I do it like below? Will this work?

set mapred.jobtracker.maxtasks.per.job=-1




*Raihan Jamal*




Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Bejoy KS <be...@yahoo.com>.
Hi Raihan

The property 'mapred.jobtracker.maxtasks.per.job' is a JobTracker-level one, not a job/task-level one, hence you cannot override it per job. You need to make the modification in mapred-site.xml, and you may also need to restart (bounce) the JT for the new value to come into effect.
 
Regards,
Bejoy KS



Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Raihan Jamal <ja...@gmail.com>.
Just to add here: SojTimestampToDate will only return data in this format: 2012/02/29 17:01:43




*Raihan Jamal*
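
(Aside: since SojTimestampToDate emits strings in exactly the 'yyyy/MM/dd HH:mm:ss' pattern, the conversion used in the WHERE clauses can be sanity-checked on a literal with Hive's built-in unix_timestamp(string, pattern). A minimal sketch; on older Hive releases that reject a FROM-less SELECT, add FROM <some_small_table> LIMIT 1:)

-- Returns a non-NULL epoch value when the pattern matches the string;
-- a NULL here would mean the format string and the data do not line up.
SELECT unix_timestamp('2012/02/29 17:01:43', 'yyyy/MM/dd HH:mm:ss');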




Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Raihan Jamal <ja...@gmail.com>.
This is still not working: in the XML file the 'final' property has been set to true, so that means I cannot override it.
And this simple query below is also throwing the same exception:

SELECT event.app_payload ['n'] AS changed_cguid
FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
WHERE a.dt = '20120918'
AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')
AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')


The exception I am getting:

Job Submission failed with exception
'org.apache.hadoop.ipc.RemoteException(java.io.IOException:
java.io.IOException: The number of tasks for this job 2070929 exceeds the
configured limit 200000


Any other suggestions on what I should do to overcome this problem? Maybe a
change in the query could overcome it?


*Raihan Jamal*
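
(Aside, not something suggested in this thread: if the two-million-plus task count comes from a very large number of small input files/splits under soj_session_container, one workaround that stays on the Hive side is to combine splits so the job requests far fewer map tasks, rather than trying to raise the JobTracker-side limit. A sketch with illustrative values; CombineHiveInputFormat is the standard Hive input format class, but the split sizes below are assumptions to be tuned per cluster:)

-- Combine many small splits into fewer, larger ones so fewer map tasks
-- are requested (values are illustrative, roughly 256 MB per split):
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.max.split.size=268435456;
set mapred.min.split.size.per.node=268435456;
set mapred.min.split.size.per.rack=268435456;
-- then re-run the partition-restricted query as before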




RE: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Chalcy Raja <Ch...@careerbuilder.com>.
Hi Raihan,

You can set it at the Hive prompt like below:
set mapred.jobtracker.maxtasks.per.job=7777777;

To see if it is set, just type set; at the Hive prompt and you'll see this parameter in the output.

Hope this helps,
Chalcy
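
(Small addition: besides listing everything with set;, the Hive CLI will also echo a single parameter if you name it, which is handier for checking just this one value:)

-- Print only this parameter instead of scanning the full set; output:
set mapred.jobtracker.maxtasks.per.job;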





Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Raihan Jamal <ja...@gmail.com>.
Ok. Found the issue I guess.



These are the settings we have in mapred-site.xml for the site-specific
configuration in Hadoop, and that is the reason the exception is getting
thrown.



<property>
  <!-- 10,000 is 100 tasks per node on a 100-node cluster -->
  <name>mapred.jobtracker.maxtasks.per.job</name>
  <value>200000</value>
  <final>true</final>
</property>

How can I override these changes manually from the Hive prompt? Any
suggestions?



*Raihan Jamal*
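
(For reference, a sketch of what an attempted override from the Hive prompt would look like, with an illustrative value; as Bejoy explains elsewhere in this thread, the property is enforced by the JobTracker and is marked <final>true</final> in mapred-site.xml, so these statements only change the client-side session and the job is still rejected with the same error:)

-- Client-side attempt only (illustrative value); the JobTracker keeps
-- enforcing its own configured value of 200000.
set mapred.jobtracker.maxtasks.per.job=2100000;
set mapred.jobtracker.maxtasks.per.job;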




Re: org.apache.hadoop.ipc.RemoteException(java.io.IOException: java.io.IOException

Posted by Raihan Jamal <ja...@gmail.com>.
Can anyone help me out here? What does this error mean? This is the query I
am using:

SELECT cguid,
       event_item,
       event_timestamp,
       event_site_id
FROM (
    SELECT event.app_payload ['n'] AS cguid,
           event.app_payload ['itm'] AS event_item,
           max(event.event_timestamp) AS event_timestamp,
           event.site_id AS event_site_id
    FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
    WHERE a.dt = '20120917'
      AND event.app_payload ['n'] IS NOT NULL
      AND instr(event.app_payload ['itm'], '%') = 0
      AND event.app_payload ['itm'] IS NOT NULL
      AND (
            event.page_type_id = '4340'
            OR event.page_type_id = '2047675'
          )
    GROUP BY event.app_payload ['n'],
             event.site_id,
             event.app_payload ['itm']
    ORDER BY cguid,
             event_timestamp DESC
) m
LEFT OUTER JOIN (
    SELECT event.app_payload ['n'] AS changed_cguid
    FROM soj_session_container a LATERAL VIEW explode(a.events) t AS event
    WHERE a.dt = '20120918'
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') >= unix_timestamp('2012/09/18 00:00:00', 'yyyy/MM/dd HH:mm:ss')
      AND unix_timestamp(SojTimestampToDate(event.event_timestamp), 'yyyy/MM/dd HH:mm:ss') <= unix_timestamp('2012/09/18 02:00:00', 'yyyy/MM/dd HH:mm:ss')
) n ON m.cguid = n.changed_cguid
WHERE n.changed_cguid IS NULL



*Raihan Jamal*


