Posted to common-dev@hadoop.apache.org by LiuShuguang <ho...@hotmail.com> on 2007/10/12 10:48:59 UTC
Enhancement to Hadoop
Hello,
I made some enhancements to Hadoop, based on 0.14, but I do not know how to distribute them to others. Can you help me with that?
The following are the commands and their manual pages.
regards,
Shuguang Liu
hbot(1) HADOOP hbot(1)
NAME
hbot - Move a job to the last position in the job queue, relative to similar jobs.
SYNOPSIS
hbot [jobid]
DESCRIPTION
Move the specified job relative to your last job in the queue.
If invoked by a regular user, hbot moves the selected job after the
last job with the same priority submitted by the user.
If invoked by the Hadoop administrator, hbot moves the selected job after
the last job with the same priority submitted to the queue.
EXAMPLE
% hbot job_200709181000_0001
Job has been moved to position 1 from bottom.
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
htop(1), hjobs(1)
hbot (hadoop utils) 13.0 October 2007
htop(1) HADOOP htop(1)
NAME
htop - Move a job to the first position in the job queue.
SYNOPSIS
htop [jobid]
DESCRIPTION
Move the specified job relative to your first job in the queue.
If invoked by a regular user, htop moves the selected job before the
first job with the same priority submitted by the user.
If invoked by the Hadoop administrator, htop moves the selected job before
the first job with the same priority submitted to the queue.
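The reordering rule described above can be sketched in a few lines. This is only an illustration of the rule as worded in this manual page, not the actual htop implementation; the Job class, the move_to_top function and the is_admin flag are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical minimal job record; the real JobTracker keeps far more state.
    job_id: str
    user: str
    priority: int


def move_to_top(queue, job_id, is_admin=False):
    """Return a new queue with the job placed before its first matching peer.

    A peer is a job with the same priority that was submitted by the same
    user (regular user) or by anyone (administrator).
    """
    job = next(j for j in queue if j.job_id == job_id)
    rest = [j for j in queue if j.job_id != job_id]
    for index, other in enumerate(rest):
        same_priority = other.priority == job.priority
        in_scope = is_admin or other.user == job.user
        if same_priority and in_scope:
            return rest[:index] + [job] + rest[index:]
    # No matching peer was found, so the order is left unchanged.
    return list(queue)
```

With a queue of j1 (alice) followed by j2, j3 and j4 (all bob, same priority), a regular-user call for j4 places it just before j2, while an administrator call places it before j1.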
EXAMPLE
% htop job_200709181000_0001
Job has been moved to position 1 from top.
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hbot(1), hjobs(1)
htop (hadoop utils) 13.0 October 2007
hopen(1) HADOOP hopen(1)
NAME
hopen - open a TaskTracker to accept new tasks.
SYNOPSIS
hopen [-h] [-jt jt1;jt2..] HOSTNAME
DESCRIPTION
Open the TaskTracker specified by HOSTNAME. Once reopened, it
accepts new MAP/REDUCE tasks again.
-h
Display usage information
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs, separated by semicolons, for example:
-jt hostA:3479;hostB:3479
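As an illustration of the -jt value format shown above, the following sketch splits a semicolon-separated list of host:port entries. The function name parse_jobtrackers is hypothetical, and the validation rules are assumptions; the manual page does not describe how the real commands parse this option.

```python
def parse_jobtrackers(value):
    """Split a value such as 'hostA:3479;hostB:3479' into (host, port) pairs."""
    trackers = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            # Tolerate a trailing or doubled semicolon.
            continue
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        trackers.append((host, int(port)))
    return trackers
```

Note that most shells treat an unquoted semicolon as a command separator, so on the command line the value would normally need quoting, for example -jt 'hostA:3479;hostB:3479'.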
EXAMPLE
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  closed  8    0    0       0        553M
% hopen c0102
Host will be open, please confirm by hhosts/hload
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  ok      8    0    0       0        553M
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hclose(1), hslot(1), hhosts(1), hload(1)
hopen (hadoop utils) 13.0 September 2007
hclose(1) HADOOP hclose(1)
NAME
hclose - close a TaskTracker so that it accepts no new tasks.
SYNOPSIS
hclose [-h] [-jt jt1;jt2..] HOSTNAME
DESCRIPTION
Close the TaskTracker specified by HOSTNAME. Once closed, it will
not accept new MAP/REDUCE tasks, but tasks that are already running
continue until they finish. This is useful when the host needs maintenance.
-h
Display usage information
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs, separated by semicolons, for example:
-jt hostA:3479;hostB:3479
EXAMPLE
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  ok      8    0    0       0        553M
% hclose c0102
Host will be closed, please confirm by hhosts/hload
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  closed  8    0    0       0        553M
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hopen(1), hslot(1), hhosts(1), hload(1)
hclose (hadoop utils) 13.0 September 2007
hslot(1) HADOOP hslot(1)
NAME
hslot - Set, on the fly, the maximum number of tasks that can run at the
same time on a specified host.
SYNOPSIS
hslot [-h] [-jt jt1;jt2..] HOSTNAME SLOTNUMBER
DESCRIPTION
hslot sets the maximum number of available slots on the TaskTracker
specified by HOSTNAME. With this command, users can change the number of
concurrently running tasks dynamically, so that jobs can get better performance.
For example, if the maximum is increased, the TaskTracker will be able to
accept new tasks. Increasing SLOTNUMBER to a very large number is unwise, however.
For a TaskTracker that is closed, it behaves as before.
For a TaskTracker that is open, there are two situations:
A: Increase the slots:
The TaskTracker will be able to accept new tasks; the number is
MAX - CURRENT_RUNNING.
B: Decrease the slots:
Tasks currently running will not be affected; over time,
at most MAX tasks will be running.
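The two situations above can be modelled with a small sketch. It assumes only what this page states (free capacity is MAX - CURRENT_RUNNING, running tasks are never preempted, a closed TaskTracker takes no new tasks); the TaskTracker class and its method names here are hypothetical and unrelated to Hadoop's own classes.

```python
from dataclasses import dataclass


@dataclass
class TaskTracker:
    # Hypothetical model of one host's slot accounting.
    max_slots: int
    running: int
    closed: bool = False

    def free_slots(self):
        """Number of new tasks the host may accept right now."""
        if self.closed:
            return 0
        # After a decrease, running may exceed max_slots; nothing is killed,
        # the host simply accepts no new tasks until enough tasks finish.
        return max(0, self.max_slots - self.running)

    def set_max_slots(self, slot_number):
        """Change the maximum on the fly without touching running tasks."""
        if slot_number < 0:
            raise ValueError("slot number must not be negative")
        self.max_slots = slot_number
```

For a host with MAX 8 and 6 running tasks, lowering MAX to 4 leaves all 6 tasks running and yields 0 free slots; once 3 of them finish, 1 slot becomes free.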
-h
Display usage information
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs, separated by semicolons, for example:
-jt hostA:3479;hostB:3479
EXAMPLE
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  ok      8    0    0       0        553M
% hslot c0102
Host Task Slots will be set to , please confirm by hhosts/hload
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      8    0    0       0        203M
c0102  ok      4    0    0       0        553M
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hopen(1), hclose(1), hhosts(1), hload(1)
hslot (hadoop utils) 13.0 September 2007
hjobs(1) HADOOP hjobs(1)
NAME
hjobs - list Hadoop jobs that have finished or are running
SYNOPSIS
hjobs [OPTION]... [JOBID]...
DESCRIPTION
hjobs lists Hadoop jobs. If no parameters are specified, this command
lists only the queued and running jobs.
-u user_name
Display only the jobs of user_name.
-l jobid
Display the information for jobid in long format.
-jt jobtracker URL(s)
Display the job information of the cluster specified by the JobTracker
URLs, for example: -jt hostA:3479;hostB:3479
EXAMPLE
List jobs
% hjobs
JOBID  USER   STAT  FROM_HOST  JOB_NAME   SUBMIT_TIME
0023   user1  RUN   hostA      WordCount  Sep 10 08:33
0024   user1  PEND  hostA      WordCount  Sep 10 08:47
0025   user2  PEND  hostB      WordCount  Sep 10 09:12
Run hjobs -u user_name to display jobs for a specific user.
% hjobs -u user1
JOBID  USER   STAT  FROM_HOST  JOB_NAME   SUBMIT_TIME
0023   user1  RUN   hostA      WordCount  Sep 10 08:33
0024   user1  PEND  hostA      WordCount  Sep 10 08:47
% hjobs -l job_200708312310_0005
Job , User , Status
Thu Jan 01 08:00:00: Submitted, JobName
Input Files
Output Path
MAP : Total Progress
Total Number of MAP Tasks
Total Number of Finished MAPs
Total Number of Running MAPs
Total Number of Failed MAPs
Sun Sep 16 19:29:59: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:36: Map Task Finished.
Sun Sep 16 19:30:01: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:36: Map Task Finished.
Sun Sep 16 19:30:07: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:36: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:44: Map Task Finished.
Sun Sep 16 19:30:40: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:44: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:59: Map Task Finished.
Sun Sep 16 19:30:46: Map Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:31:02: Map Task Finished.
Sun Sep 16 19:31:02: Map Task , State Launched on hosts:
Failure Times Kill Times
REDUCE : Total Progress
Total Number of Reduce Tasks
Total Number of Finished Reduces
Total Number of Running Reduces
Total Number of Failed Reduces
Sun Sep 16 19:30:10: Reduce Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:11: Reduce Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:53: Reduce Task , State Launched on hosts:
Failure Times Kill Times
Sun Sep 16 19:30:54: Reduce Task , State Launched on hosts:
Failure Times Kill Times
Reduce Task , State
Reduce Task , State
Reduce Task , State
Reduce Task , State
URL :
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hjobkill(1), htop(1), hbot(1)
hjobs (hadoop utils) 13.0 October 2007
hjobkill(1) HADOOP hjobkill(1)
NAME
hjobkill - Kill the specified jobs in a cluster. A killed job may be rerun,
depending on the options.
SYNOPSIS
hjobkill [OPTION] jobid1 jobid2 ...
DESCRIPTION
hjobkill kills the jobs specified by its arguments. If 0 is given as the
argument, all jobs are terminated.
-u user_name
With this option, hjobkill acts only on the jobs of user_name.
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs,
for example: -jt hostA:3479;hostB:3479
-r
If this option is present, the job is rescheduled and run again.
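The selection rule described above (explicit job IDs, or 0 for every job) can be sketched as follows. The function select_jobs_to_kill and its arguments are hypothetical, and silently skipping IDs that are not active is an assumption of this sketch, not documented hjobkill behavior.

```python
def select_jobs_to_kill(requested, active_jobs):
    """Return the job IDs to kill, in queue order for '0', else in request order.

    requested:   job ID strings from the command line; '0' selects everything.
    active_jobs: IDs of the jobs currently known to the cluster.
    """
    if "0" in requested:
        return list(active_jobs)
    active = set(active_jobs)
    # Assumption: IDs that are not active are skipped rather than reported.
    return [job_id for job_id in requested if job_id in active]
```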
EXAMPLE
% hjobkill job_200708312310_0001
Job has been killed successfully.
% hjobkill -r job_200708312310_0001
Job has been killed successfully.
Job has been rerun.
% hjobkill 0
Job has been killed successfully.
Job has been killed successfully.
Job has been killed successfully.
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hjobs(1)
hjobkill (hadoop utils) 13.0 October 2007
htasks(1) HADOOP htasks(1)
NAME
htasks -
SYNOPSIS
htasks [OPTION]...
DESCRIPTION
Add something here.
-a, --all
the option
-b, --ba
the option
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
htasks (hadoop utils) 13.0 October 2007
hload(1) HADOOP hload(1)
NAME
hload - Display load information about cluster hosts at 5-second
intervals.
SYNOPSIS
hload [OPTION]...
DESCRIPTION
By default, hload displays load information about all hosts in the
specified cluster. This command will display the host status, cpu
usage, tmp space, idle time and so on.
-h
Display usage information.
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs, for
example: -jt hostA:3479;hostB:3479
OUTPUT
By default, hload displays the following fields:
HOST
The name of the host. If a host currently belongs to the cluster
specified by the command, its host name is displayed here.
status
The current status of the host (in fact, the status of the TaskTracker
daemon). The possible values for host status are as follows:
ok
The host is available to accept new tasks.
closed
The host is not allowed to accept new tasks any more, but tasks
currently running on the host will continue until finished.
r1m
The 1-minute exponentially averaged CPU run queue length
r5m
The 5-minute exponentially averaged CPU run queue length
r15m
The 15-minute exponentially averaged CPU run queue length
ut%
The CPU utilization exponentially averaged over the last 5 seconds,
between 0.00 and 100.00.
pg
The memory paging rate exponentially averaged over the last minute, in
pages per second.
it
On UNIX, the idle time of the host (keyboard not touched on all logged
in sessions), in minutes.
tmp
The amount of free space in /tmp, in megabytes.
swp
The amount of available SWAP, in megabytes.
mem
The amount of available RAM, in megabytes.
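The r1m, r5m and r15m fields are described as exponentially averaged run queue lengths. This page does not say how hload computes them; one common way, which is how UNIX load averages are conventionally maintained, is the exponential moving average sketched below. The function name and its parameters are hypothetical, and the 5-second sampling step used in the usage note is an assumption taken from the reporting interval mentioned in NAME.

```python
import math


def update_load_average(previous, run_queue_length, interval_s, window_s):
    """Fold one run-queue sample into an exponentially decaying average.

    previous:         the average before this sample
    run_queue_length: the run queue length observed now
    interval_s:       seconds since the previous sample
    window_s:         averaging window in seconds (60, 300 or 900)
    """
    decay = math.exp(-interval_s / window_s)
    return previous * decay + run_queue_length * (1.0 - decay)
```

Calling update_load_average(load, sample, 5.0, 60.0) every 5 seconds would maintain a 1-minute figure; with a constant run queue length the average converges to that length.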
EXAMPLE
% hload
HOST   status  r1m   r5m   r15m  ut%    pg     it  tmp   swp   mem
c0101  ok      6.33  3.09  1.16  89.17  207.7  0   202M  509M  216M
c0102  ok      4.74  1.54  0.54  7.65   6.86   0   553M  509M  74M
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hhosts(1), hopen(1), hclose(1), hslot(1)
hload (hadoop utils) 13.0 September 2007
hhosts(1) HADOOP hhosts(1)
NAME
hhosts - display current status of the host(s).
SYNOPSIS
hhosts [-jt trackers:trackers...] [-h]
DESCRIPTION
By default, hhosts returns the information about all hosts in the
specified cluster. This command will display the host status, task
slots and so on.
-h
Display usage information.
-jt jobtracker URL(s)
Specify the cluster by its JobTracker URLs, for
example: -jt hostA:3479;hostB:3479
OUTPUT
By default, hhosts displays the following fields:
HOST
The name of the host. If a host currently belongs to the cluster
specified by the command, its host name is displayed here.
STATUS
The current status of the host (in fact, the status of the TaskTracker
daemon). The possible values for host status are as follows:
ok
The host is available to accept new tasks.
closed
The host is not allowed to accept new tasks any more, but tasks
currently running on the host will continue until finished.
MAX
The maximum number of tasks, including MAPs and REDUCEs, that can
run at the same time on a host (2 * the number of CPUs is recommended).
MAP
The number of MAP tasks currently running on the host.
REDUCE
The number of REDUCE tasks currently running on the host.
FAILURE
The number of failed tasks on the host.
TMP_SPACE
The amount of free space in /tmp, in megabytes. For MAP/REDUCE jobs,
this parameter is important.
EXAMPLE
% hhosts
HOST   STATUS  MAX  MAP  REDUCE  FAILURE  TMP_SPACE
c0101  ok      6    6    0       0        218M
c0102  ok      6    2    4       0        568M
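For scripting against this report, the columns listed under OUTPUT can be turned into records. The sketch below assumes the output is a header line followed by whitespace-separated rows, as in the example above; parse_hhosts is a hypothetical helper, not part of the tool set described here.

```python
def parse_hhosts(text):
    """Parse hhosts output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"unexpected row: {line!r}")
        row = dict(zip(header, fields))
        # The slot and task counters are integers; TMP_SPACE keeps its unit suffix.
        for key in ("MAX", "MAP", "REDUCE", "FAILURE"):
            row[key] = int(row[key])
        rows.append(row)
    return rows
```

A caller could then compute the free slots of each host as MAX - MAP - REDUCE, which is 0 for both hosts in the example above.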
AUTHOR
Written by Shuguang Liu
REPORT BUGS
Report bugs to
COPYRIGHT
Copyright (C) 2005-2010 Shuguang Liu's Inc.
This file is not a free software, all the source codes are protected
and will not be released to any body or organization without
authority.
SEE ALSO
hload(1), hopen(1), hclose(1), hslot(1)
hhosts (hadoop utils) 13.0 October 2007
Re: Enhancement to Hadoop
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Oct 12, 2007, at 1:48 AM, LiuShuguang wrote:
>
> Hello,
>
> I made some enhancements to Hadoop, based on 0.14, but I do not know
> how to distribute them to others. Can you help me with that?
By the way, I think rather than moving jobs to the bottom of the
queue, you should just change their priority. It would be much easier
and job priorities are already implemented in 0.15.
-- Owen
Re: Enhancement to Hadoop
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Oct 12, 2007, at 1:48 AM, LiuShuguang wrote:
> I made some enhancements to Hadoop, based on 0.14, but I do not know
> how to distribute them to others. Can you help me with that?
Please read:
http://wiki.apache.org/lucene-hadoop/HowToContribute
You'll need to make sure that it matches the coding standards
including having the Apache headers at the top of each file and
removing author lines.
-- Owen