Posted to common-user@hadoop.apache.org by rab ra <ra...@gmail.com> on 2014/07/10 11:55:25 UTC
multiple map tasks writing in same hdfs file -issue
Hello
I have one use case that spans multiple map tasks in a Hadoop environment. I
use Hadoop 1.2.1 with 6 task nodes. Each map task writes its output into a
file stored in HDFS. This file is shared across all the map tasks. Though
they all compute their output, some of the values are missing from the
output file.
The output file is an Excel file with 8 parameters (headings). Each map task
is supposed to compute all 8 values and save each one as soon as it is
computed. This means the map task's logic opens the file, writes the value,
and closes it, 8 times.
Can someone give me a hint on what's going wrong here?
Is it possible for more than one map task to write to a shared file in
HDFS?
Regards
Rab
Re: multiple map tasks writing in same hdfs file -issue
Posted by Arpit Agarwal <aa...@hortonworks.com>.
HDFS is single-writer, multiple-reader (see sec 8.3.1 of
http://aosabook.org/en/hdfs.html). You cannot have multiple writers for a
single file at a time.
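The usual workaround is for each task to write its own output file and merge them afterwards (this is what Hadoop's OutputFormat already does with its part-NNNNN files, and what `hadoop fs -getmerge` is for). A minimal sketch of that pattern, using the local filesystem and plain Python as a stand-in for HDFS; the file names and helper functions are illustrative, not Hadoop APIs:

```python
import os
import tempfile

def run_task(task_id, values, out_dir):
    # Each task writes to its OWN file (like a part-NNNNN file),
    # so no two writers ever touch the same file.
    path = os.path.join(out_dir, "part-%05d" % task_id)
    with open(path, "w") as f:
        for v in values:
            f.write("%s\t%s\n" % (task_id, v))
    return path

def merge_parts(out_dir, merged_path):
    # Single-writer merge step, analogous to 'hadoop fs -getmerge'.
    with open(merged_path, "w") as out:
        for name in sorted(os.listdir(out_dir)):
            if name.startswith("part-"):
                with open(os.path.join(out_dir, name)) as part:
                    out.write(part.read())

out_dir = tempfile.mkdtemp()
for tid in range(3):
    run_task(tid, ["v%d" % i for i in range(8)], out_dir)

merged = os.path.join(out_dir, "merged.tsv")
merge_parts(out_dir, merged)
with open(merged) as f:
    lines = f.read().splitlines()
# 3 tasks x 8 values = 24 lines; nothing is lost, because
# every file has exactly one writer at any time.
```

With concurrent open/write/close on one shared file, as in the original question, writers clobber each other and values silently disappear; with one file per writer plus a merge, every value survives.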
On Thu, Jul 10, 2014 at 2:55 AM, rab ra <ra...@gmail.com> wrote:
> Hello
>
>
> I have one use case that spans multiple map tasks in a Hadoop environment.
> I use Hadoop 1.2.1 with 6 task nodes. Each map task writes its output into
> a file stored in HDFS. This file is shared across all the map tasks. Though
> they all compute their output, some of the values are missing from the
> output file.
>
>
>
> The output file is an Excel file with 8 parameters (headings). Each map
> task is supposed to compute all 8 values and save each one as soon as it
> is computed. This means the map task's logic opens the file, writes the
> value, and closes it, 8 times.
>
>
>
> Can someone give me a hint on what's going wrong here?
>
>
>
> Is it possible for more than one map task to write to a shared file in
> HDFS?
>
> Regards
> Rab
>
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.