Posted to mapreduce-user@hadoop.apache.org by reena upadhyay <re...@outlook.com> on 2014/03/28 10:45:22 UTC

Does Hadoop depend on ECC memory to generate checksums for data stored in HDFS

To ensure data I/O integrity, Hadoop uses a CRC-32 mechanism to generate checksums for the data stored on HDFS. But suppose I have a DataNode machine that does not have ECC (error-correcting code) memory. Will HDFS still be able to generate checksums for data blocks when reads and writes happen?

Or, in simple words: does Hadoop depend on ECC memory to generate checksums for data stored in HDFS?


Re: Does Hadoop depend on ECC memory to generate checksums for data stored in HDFS

Posted by Harsh J <ha...@cloudera.com>.
While the HDFS functionality of computing, storing and validating
checksums for block files does not specifically _require_ ECC, you do
_want_ ECC to avoid frequent checksum failures.

This is noted in Tom's book as well, in the chapter that discusses
setting up your own cluster:
"ECC memory is strongly recommended, as several Hadoop users have
reported seeing many checksum errors when using non-ECC memory on
Hadoop clusters."

On Fri, Mar 28, 2014 at 3:15 PM, reena upadhyay <re...@outlook.com> wrote:
> To ensure data I/O integrity, Hadoop uses a CRC-32 mechanism to generate
> checksums for the data stored on HDFS. But suppose I have a DataNode
> machine that does not have ECC (error-correcting code) memory. Will HDFS
> still be able to generate checksums for data blocks when reads and writes
> happen?
>
> Or, in simple words: does Hadoop depend on ECC memory to generate
> checksums for data stored in HDFS?
>
>



-- 
Harsh J
