Posted to user@hadoop.apache.org by Ramasubramanian Narayanan <ra...@gmail.com> on 2012/11/07 15:52:55 UTC

Regarding loading Image file into HDFS

Hi,

 I have a basic doubt: how does Hadoop split an image file into blocks and
put them in HDFS? Usually an image file cannot be split, right? So how does
this happen in Hadoop?

regards,
Rams

Re: Regarding loading Image file into HDFS

Posted by Harsh J <ha...@cloudera.com>.
Hi,

Blocks are split at arbitrary block-size boundaries. Readers can read
the whole file by reading all of its blocks together; this is handled
transparently by the underlying DFS reader classes, so a developer does
not have to care about it.
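
For illustration, here is a rough sketch using the Java FileSystem API
(the paths and file names are made up for the example): it writes an
image into HDFS and reads it back as one contiguous stream, with the DFS
client fetching each block behind the scenes.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.ByteArrayOutputStream;

public class HdfsImageRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path local = new Path("/tmp/photo.jpg");        // hypothetical local file
        Path remote = new Path("/user/rams/photo.jpg"); // hypothetical HDFS path

        // HDFS chops the file into block-size chunks purely by byte offset;
        // the JPEG's internal structure is irrelevant to the split.
        fs.copyFromLocalFile(local, remote);

        // open() returns a single stream over the whole file; the DFS
        // client fetches each block from its datanode transparently.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (FSDataInputStream in = fs.open(remote)) {
            IOUtils.copyBytes(in, out, conf, false);
        }
        System.out.println("Read back " + out.size() + " bytes intact.");
    }
}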

HDFS does not care what _type_ of file you store; it is agnostic and
simply splits the file based on the configured block size. It is up to
the applications not to split a reader across block boundaries if the
format cannot be processed in parallel.
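
At the MapReduce level, an application keeps a non-splittable file in a
single task by overriding isSplitable(). A minimal sketch (it extends
TextInputFormat only to show the hook; a real image job would pair this
with a whole-file, binary record reader):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class NonSplittableInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // One split per file: the record reader then streams across HDFS
        // block boundaries transparently, like any other DFS reader.
        return false;
    }
}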

On Wed, Nov 7, 2012 at 8:22 PM, Ramasubramanian Narayanan
<ra...@gmail.com> wrote:
> Hi,
>
>  I have a basic doubt: how does Hadoop split an image file into blocks and
> put them in HDFS? Usually an image file cannot be split, right? So how does
> this happen in Hadoop?
>
> regards,
> Rams



-- 
Harsh J
