Posted to user@hadoop.apache.org by VJ Shalish <vj...@gmail.com> on 2014/01/04 11:29:33 UTC

Cutting a line in between while creating blocks

Hi,

When Hadoop creates the blocks for a file containing n lines, how does it
avoid cutting a line in the middle at a block boundary?
Is this taken care of by Hadoop?

Thanks
Shalish.

Re: Cutting a line in between while creating blocks

Posted by Harsh J <ha...@cloudera.com>.
HDFS is agnostic about the contents of the data you store. Think about
it: a line-ending character is not a universal way for files to
separate their records.

This question has been asked several times before (search on
http://search-hadoop.com for example). Read
http://wiki.apache.org/hadoop/HadoopMapReduce to understand how,
despite HDFS splitting blocks at exactly the 64th megabyte, MR (or any
other HDFS file-reading operation) makes sure to read records whole.
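
To make the idea concrete, here is a minimal sketch in Python of how a
line-oriented reader can recover whole records from byte ranges that were
cut without regard to line endings. It is only an illustration, not
Hadoop's actual code: the names read_split and make_splits are made up
for this example, the "file" is an in-memory byte string, and "\n" is
assumed to be the record delimiter. The rule it follows is that every
reader except the first throws away whatever it sees up to the first
newline, and every reader keeps going past the end of its range until it
has finished the last line it started.

```python
def make_splits(length: int, size: int) -> list[tuple[int, int]]:
    """Cut [0, length) into fixed-size byte ranges, ignoring line endings."""
    return [(s, min(s + size, length)) for s in range(0, length, size)]


def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the whole lines owned by the byte range [start, end).

    Hypothetical helper: a range owns every line that begins after
    `start` and at or before `end` (plus the line at offset 0 for the
    first range).
    """
    pos = start
    if start != 0:
        # The bytes up to the first newline belong to a line that the
        # previous range's reader will finish (or has fully read), so skip.
        nl = data.find(b"\n", pos)
        if nl == -1:
            return []
        pos = nl + 1

    records = []
    # Keep reading while the next line begins at or before `end`; the
    # final line may extend past the boundary into the next range.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        if nl == -1:
            # Last line of the file has no trailing newline.
            records.append(data[pos:])
            break
        records.append(data[pos:nl])
        pos = nl + 1
    return records


def read_all(data: bytes, size: int) -> list[bytes]:
    """Read every range independently and concatenate the results."""
    lines = []
    for start, end in make_splits(len(data), size):
        lines.extend(read_split(data, start, end))
    return lines
```

Calling read_all(data, size) with any positive size should return the
same lines as reading the file in one piece, each exactly once, even
when a boundary falls in the middle of a line. In this sketch the storage
layer never needs to know where records end; that responsibility sits
entirely with the reader.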

On Sat, Jan 4, 2014 at 3:59 PM, VJ Shalish <vj...@gmail.com> wrote:
> Hi,
>
> While creating the blocks for a file containing n number of lines, how does
> Hadoop take care of the problem of not Cutting a line in  between while
> creating blocks?
> Is it taken care of by Hadoop?
>
> Thanks
> Shalish.



-- 
Harsh J
