Posted to user@hive.apache.org by Chunky Gupta <ch...@vizury.com> on 2012/10/29 14:07:23 UTC

Enabling fair scheduler using Bootstrap is failing

Hi,

I am trying to enable the fair scheduler on my EMR cluster at bootstrap. The
steps I am following are:

1. Creating a job flow from the AWS console via "Create New Job Flow", with
Job Type set to Hive program.
2. Selecting "Start an Interactive Hive Session".
3. Selecting the master and core instance groups and the Amazon EC2 Key Pair.
4. Selecting "Configure your Bootstrap Actions" with action type
"Configure Hadoop".
5. Uploading a mapred-site.xml to S3 with these parameters set to enable the
fair scheduler:
  <property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>conf/pools.xml</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.fairscheduler.eventlog.enabled</name>
    <value>false</value>
  </property>

6. In the optional arguments I tried "--site-mapred-site,s3://XXX(where I
uploaded)/mapred-site.xml" to load this XML file onto my cluster.

The cluster creation then fails with the error "On the master
instance (xxx), bootstrap action 1 returned a non-zero return code".

I think I am passing something wrong in the optional arguments. Please help
me with this.
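
One thing that can be checked locally before uploading is whether the file is well-formed. The sketch below is only an illustration: it writes the four properties from step 5 into a complete site file and parses it back. The `<configuration>` root element is the standard wrapper for Hadoop site files, but the post shows only the `<property>` fragments, so whether the uploaded file had that wrapper is an assumption here. The function names `build_site_xml` and `read_site_xml` are hypothetical helpers, not part of Hadoop or EMR.

```python
import xml.etree.ElementTree as ET

# Property names and values copied from step 5 of the post.
FAIR_SCHEDULER_PROPS = {
    "mapred.fairscheduler.allocation.file": "conf/pools.xml",
    "mapred.jobtracker.taskScheduler": "org.apache.hadoop.mapred.FairScheduler",
    "mapred.fairscheduler.assignmultiple": "true",
    "mapred.fairscheduler.eventlog.enabled": "false",
}


def build_site_xml(props):
    """Return a site file: a <configuration> root wrapping <property> entries."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


def read_site_xml(text):
    """Parse a site file and return its properties as a dict.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed
    XML, and ValueError if the root element is not <configuration>.
    """
    root = ET.fromstring(text)
    if root.tag != "configuration":
        raise ValueError("expected <configuration> root, got <%s>" % root.tag)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


if __name__ == "__main__":
    xml_text = build_site_xml(FAIR_SCHEDULER_PROPS)
    # Round-trip check: what was written is what gets read back.
    assert read_site_xml(xml_text) == FAIR_SCHEDULER_PROPS
    print(xml_text)
```

Running `read_site_xml` on the contents of the real file would show quickly whether a missing wrapper or a stray tag is involved. It does not check the bootstrap arguments themselves.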

Thanks,
Chunky.

Re: Enabling fair scheduler using Bootstrap is failing

Posted by Chunky Gupta <ch...@vizury.com>.
Hi,

Today I enabled logging while creating a new job. The errors I see in the log
files are:

ERROR org.apache.hadoop.security.UserGroupInformation (IPC Server handler
12 on 9000): PriviledgedActionException as:hadoop
cause:java.io.IOException: File
/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1

and

2012-10-30 06:14:26,527 WARN org.apache.hadoop.hdfs.DFSClient (Thread-18):
Error Recovery for block null bad datanode[0] nodes == null
2012-10-30 06:14:26,527 WARN org.apache.hadoop.hdfs.DFSClient (Thread-18):
Could not get block locations. Source file
"/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info" - Aborting...
2012-10-30 06:14:26,527 WARN org.apache.hadoop.mapred.JobTracker (main):
Writing to file
hdfs://10.92.235.20:9000/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info
failed!
2012-10-30 06:14:26,528 WARN org.apache.hadoop.mapred.JobTracker (main):
FileSystem is not ready yet!
2012-10-30 06:14:26,534 WARN org.apache.hadoop.mapred.JobTracker (main):
Failed to initialize recovery manager.

Also, from the default mapred-site.xml that I am uploading at bootstrap, I am
removing this configuration:
<property>
<name>mapred.job.tracker</name>
<value>ip-10-116-159-127.ec2.internal:9001</value>
</property>
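
For illustration, removing a single property from a site file can be scripted rather than done by hand. This is a minimal sketch: `drop_property` is a hypothetical helper, and `DEFAULT_SITE` is a made-up two-property stand-in for the cluster's default mapred-site.xml, whose full contents are not shown in this thread.

```python
import xml.etree.ElementTree as ET


def drop_property(site_xml, name):
    """Return site_xml with every <property> whose <name> equals `name` removed."""
    root = ET.fromstring(site_xml)
    # findall returns a list, so removing children while looping is safe.
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for the default file; only the first property
# (the one being removed) is taken from the post.
DEFAULT_SITE = """<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>ip-10-116-159-127.ec2.internal:9001</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(drop_property(DEFAULT_SITE, "mapred.job.tracker"))
```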

Please suggest a solution for this.

Thanks,
Chunky.



Re: Enabling fair scheduler using Bootstrap is failing

Posted by Chunky Gupta <ch...@vizury.com>.
Hi,

I also tried this in the optional arguments: "--site-config-file
s3://viz-emr-hive/config/mapred-site.xml -m
mapred.jobtracker.taskScheduler=org.apache.hadoop.mapred.FairScheduler"

This time the job flow reached the "Bootstrapping" state and then failed.

Please let me know what changes I can make to get it working.

Thanks,
Chunky.
