Posted to user@pig.apache.org by Mark Grover <gr...@gmail.com> on 2012/11/29 19:11:46 UTC

Re: YARN hangs while computing pig?

Redirecting to Apache pig user list

On Thu, Nov 29, 2012 at 1:01 AM, Johnny Kowalski
<jo...@gmail.com>wrote:

> Another hint? Is it some permission issue?
>
> MY EXAMPLE:
> in = LOAD '/user/myUser/aTest' USING PigStorage();
> DUMP in;
>
> When I try to run the above Pig script in the grunt shell with "pig -x
> local", it successfully prints the output.
>
> But when I run it in the mapreduce grunt shell, it hangs on this line:
>
> 2012-11-28 15:24:27,709 [main] INFO  org.apache.pig.backend.hadoop.
> executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>
>
> On Wednesday, 28 November 2012 at 15:44:22 UTC+1, Johnny Kowalski
> wrote:
>
>> Hi, after configuring the whole stack I've got another issue: my Pig
>> jobs hang.
>> I've created a completely new user and added a directory for him:
>> sudo -u hdfs hadoop fs -mkdir /user/newUser
>> sudo -u hdfs hadoop fs -chown newUser:newUser /user/newUser
>>
>> Then I wanted to run a Pig script that runs fine on another
>> YARN-configured cluster, and got something like this:
>>
>>
>> PIG CONSOLE OUTPUT
>>
>> 2012-11-28 15:24:26,759 [Thread-4] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat
>> - Total input paths to process : 1
>> 2012-11-28 15:24:26,760 [Thread-4] INFO  org.apache.pig.backend.hadoop.
>> executionengine.util.MapRedUtil - Total input paths to process : 1
>> 2012-11-28 15:24:26,783 [Thread-4] INFO  org.apache.pig.backend.hadoop.
>> executionengine.util.MapRedUtil - Total input paths (combined) to
>> process : 1
>> 2012-11-28 15:24:26,872 [Thread-4] INFO  org.apache.hadoop.mapreduce.JobSubmitter
>> - number of splits:1
>> 2012-11-28 15:24:26,889 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - fs.default.name is deprecated. Instead, use fs.defaultFS
>> 2012-11-28 15:24:26,890 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - mapreduce.job.counters.limit is deprecated. Instead, use
>> mapreduce.job.counters.max
>> 2012-11-28 15:24:26,891 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - mapred.job.tracker is deprecated. Instead, use
>> mapreduce.jobtracker.address
>> 2012-11-28 15:24:26,892 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
>> 2012-11-28 15:24:26,893 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
>> 2012-11-28 15:24:26,894 [Thread-4] WARN  org.apache.hadoop.conf.Configuration
>> - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
>> 2012-11-28 15:24:27,563 [Thread-4] INFO  org.apache.hadoop.mapred.ResourceMgrDelegate
>> - Submitted application application_1354100898821_0011 to ResourceManager
>> at userver/192.168.56.101:8032
>> 2012-11-28 15:24:27,629 [Thread-4] INFO  org.apache.hadoop.mapreduce.Job
>> - The url to track the job: http://userver:8088/proxy/application_1354100898821_0011/
>> 2012-11-28 15:24:27,709 [main] INFO  org.apache.pig.backend.hadoop.
>> executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>
>> Here I noticed the logs keep growing while my job stays in status RUNNING:
>>
>> /var/log/hadoop-yarn# less hadoop-cmf-yarn1-NODEMANAGER-userver.log.out
>>
>> 2012-11-28 15:39:16,227 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> 2012-11-28 15:39:17,230 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> 2012-11-28 15:39:18,233 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> 2012-11-28 15:39:18,356 INFO org.apache.hadoop.yarn.server.
>> nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory
>> usage of ProcessTree 25203 for container-id container_1354100898821_0011_
>> 01_000001: 124.5mb of 1.5gb physical memory used; 1.9gb of 3.1gb
>> virtual memory used
>> 2012-11-28 15:39:19,238 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> 2012-11-28 15:39:20,241 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> 2012-11-28 15:39:21,245 INFO org.apache.hadoop.yarn.server.nodemanager.
>> NodeStatusUpdaterImpl: Sending out status for container: container_id
>> {, app_attempt_id {, application_id {, id: 11, cluster_timestamp:
>> 1354100898821, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics:
>> "", exit_status: -1000,
>> (END)
>>
>>
>> Please help.
>>
>

Re: YARN hangs while computing pig?

Posted by Johnny Kowalski <jo...@gmail.com>.
Please do not forward it. I suppose it might be a Cloudera Manager
problem with running mapreduce?

I've separately installed CDH4.1 with YARN, and I've also made a virtual
machine with Cloudera Manager. The mapreduce test from "Running an
example application with YARN" at
https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-ComponentsThatRequireAdditionalConfiguration
simply doesn't work on the machine configured by Cloudera Manager, but
it works on the pure CDH4 installation...

By "not working" I mean it stops at mapping.
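(A note on the symptom: the NodeManager log above shows only a single
container, id: 1, in state C_RUNNING — the ApplicationMaster — while the
job never moves past 0%. On a small or single-node cluster, one common
cause of this pattern is that YARN cannot allocate a second container
for the map task. The yarn-site.xml properties below are standard YARN
configuration keys worth checking; the values shown are illustrative
assumptions only, not a confirmed fix for this particular cluster:)

```xml
<!-- yarn-site.xml sketch: values are illustrative; tune to the node's RAM.
     If yarn.nodemanager.resource.memory-mb is too small to hold both the
     ApplicationMaster container and one map container at once, the job
     can sit at 0% forever while the AM waits for an allocation. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
```

Comparing these values between the working pure-CDH4 node and the
Cloudera-Manager-managed node may show what differs between the two
setups.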
