Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2011/02/01 12:43:10 UTC

raw length vs part length

Hi,

Hadoop tracks both a compressed length and a raw length for each map output.

1 - In my example, the RT (reduce task) is fetching a map output whose
raw length is 14 bytes and whose partLength is 10 bytes. The map
output doesn't use any compression.
When the data is uncompressed, shouldn't the raw length be 14 and the
partLength 0? I ask because the data being transferred to the RT is
uncompressed.

2 - Is the raw length of the map output the size of the data block (10
bytes) plus the header?

3 - Does "part length" mean partition length?
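
For what it's worth, here is a minimal, self-contained sketch (no Hadoop
classes) of the relationship I suspect: if the partition on disk were the
serialized segment data followed by a 4-byte CRC-32 trailer, a 4-byte gap
between the two lengths would fall out naturally. The 10-byte segment and
the trailing-checksum layout are assumptions on my part, not something I
have confirmed in the code:

```java
import java.util.zip.CRC32;

public class LengthSketch {
    // Guess: partLength = data length (compressed, or raw when no codec
    // is configured) plus a 4-byte CRC-32 checksum appended at the end.
    static long partLength(long dataLength) {
        return dataLength + 4; // 4 = size of a CRC-32 checksum
    }

    public static void main(String[] args) {
        byte[] segment = new byte[10]; // hypothetical 10-byte segment

        // Checksum computed over the segment bytes, as a trailer would be.
        CRC32 crc = new CRC32();
        crc.update(segment, 0, segment.length);

        System.out.println("rawLength  = " + segment.length);
        System.out.println("partLength = " + partLength(segment.length));
    }
}
```

Under that guess, a 10-byte segment gives a 14-byte partition, which is
exactly the pair of numbers I am seeing.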

Thanks,
-- 
Pedro

Fwd: raw length vs part length

Posted by Pedro Costa <ps...@gmail.com>.
I just need to make a correction and add a new question.

The correction:
partLength: 14 bytes
Raw length: 10 bytes

The new question:
Why is the IFileInputStream created with a size of 14 bytes when the
data (the segments) is only 10 bytes? The whole map-0.out file, on the
other hand, is indeed 14 bytes.



-- 
Pedro