Posted to common-user@hadoop.apache.org by Brensch <as...@sub.uni-goettingen.de> on 2008/03/26 09:57:22 UTC
Howto?: Monitor File/Job allocation
Hello everybody,
I've been playing with Hadoop for a few days, and I'm only starting to
explore its beauty.
In an attempt to learn from the Grep example, I ended up wondering whether
you can actually find out, from within a map, which file you are currently
running on.
e.g. Suppose I want to grep through a set of files, and instead of having
only a global response, I need an output per file as well.
> ./bin/hadoop jar hadoop-0.16.1-examples.jar grep input output "au[a-c]"
> input/file1.txt 3 aua
> input/file1.txt 2 aub
> input/file1.txt 1 auc
> input/file2.txt 1 aua
> input/file2.txt 2 aub
> input/file2.txt 3 auc
> 4 aua
> 4 aub
> 4 auc
Now, this could be really easy to do (just read the right variable in the
JobConf?) or it could be absolutely impossible, since it's Hadoop's innate
goal to abstract away file-related details - I'd really appreciate a hint or
a link to read about this.
regards,
Brensch
--
View this message in context: http://www.nabble.com/Howto-%3A-Monitor-File-Job-allocation-tp16297900p16297900.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Howto?: Monitor File/Job allocation
Posted by Miles Osborne <mi...@inf.ed.ac.uk>.
From here:
http://wiki.apache.org/hadoop/TaskExecutionEnvironment
The following properties are localized for each task's JobConf:

Name                   Type     Description
----                   ----     -----------
mapred.job.id          String   The job id
mapred.task.id         String   The task id
mapred.task.is.map     boolean  Is this a map task
mapred.task.partition  int      The id of the task within the job
map.input.file         String   The filename that the map is reading from
map.input.start        long     The offset of the start of the map input split
map.input.length       long     The number of bytes in the map input split
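The map.input.file property is the one that answers the original question: in a 0.16-era job you would read it via job.get("map.input.file") inside Mapper.configure(JobConf) and prefix your output keys with it. As a rough stand-alone illustration of that per-file bookkeeping (plain Java, no Hadoop classes; the class and method names below are made up for the sketch):

```java
import java.util.*;
import java.util.regex.*;

// Stand-alone sketch of the per-file counting the poster asked about.
// In a real job the filename would come from job.get("map.input.file")
// inside Mapper.configure(JobConf); here the caller simply passes it in,
// which is enough to show the per-file key scheme.
public class PerFileGrep {

    // Key a match by the file it came from, e.g. "input/file1.txt\taua".
    static String perFileKey(String inputFile, String match) {
        return inputFile + "\t" + match;
    }

    // Count regex matches in one file's lines, keyed per file.
    static Map<String, Integer> countMatches(String inputFile,
                                             List<String> lines,
                                             String regex) {
        Pattern p = Pattern.compile(regex);
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                counts.merge(perFileKey(inputFile, m.group()), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            countMatches("input/file1.txt",
                         Arrays.asList("aua aub aua", "aua aub auc"),
                         "au[a-c]");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
        // prints:
        // input/file1.txt	aua	3
        // input/file1.txt	aub	2
        // input/file1.txt	auc	1
    }
}
```

With keys built this way, a plain sum-the-values reducer produces exactly the per-file lines shown in the original post; dropping the filename prefix from the key gives the global totals.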
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.