Posted to user@hadoop.apache.org by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2013/10/01 06:39:48 UTC

Question on BytesWritable

Hi,

I am using Hadoop 1.0.2 and have written a MapReduce job. I need to process each file whole, without splitting, so I wrote a custom input format that overrides the isSplitable() method, along with a new RecordReader implementation that reads the entire file. I followed the sample in Chapter 7 of "Hadoop: The Definitive Guide". My mapper emits a BytesWritable as the value. I want to get the bytes and read some specific information from them, using a ByteArrayInputStream for further processing. But strangely, the following code prints two different lengths, and this mismatch is causing errors.

// value -> BytesWritable
System.out.println("Bytes length " + value.getLength());        // Bytes length 1931650
byte[] bytes = value.getBytes();
System.out.println("Bytes array length " + bytes.length);       // Bytes array length 2897340

My file is 1931650 bytes, so I don't understand why the byte array is bigger than the original file.

Any idea what is going wrong? Please help. Thanks in advance.

Regards,
Anand.C

Re: Question on BytesWritable

Posted by John Meagher <jo...@gmail.com>.
https://issues.apache.org/jira/browse/HADOOP-6298
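The JIRA above explains the mismatch: BytesWritable.getBytes() returns the internal backing array, whose capacity can grow beyond the valid data size reported by getLength(), so the trailing bytes are garbage. A minimal, stdlib-only sketch of the pitfall and two fixes follows (the class and variable names are illustrative, not from Hadoop; the second fix mirrors what BytesWritable.copyBytes() does in later Hadoop releases):

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

public class ValidBytesDemo {
    public static void main(String[] args) {
        // Simulate BytesWritable's backing buffer: capacity grown past the
        // valid length, as Hadoop over-allocates when the buffer is resized.
        int validLength = 10;           // what getLength() would report
        byte[] backing = new byte[15];  // what getBytes() would return
        for (int i = 0; i < validLength; i++) backing[i] = (byte) i;

        // Pitfall: trusting the backing array's length reads trailing garbage.
        System.out.println("backing length = " + backing.length);      // 15

        // Fix 1: bound the stream to the valid region instead of wrapping
        // the whole array.
        ByteArrayInputStream in =
            new ByteArrayInputStream(backing, 0, validLength);
        System.out.println("stream available = " + in.available());    // 10

        // Fix 2: take a trimmed copy of just the valid bytes.
        byte[] trimmed = Arrays.copyOf(backing, validLength);
        System.out.println("trimmed length = " + trimmed.length);      // 10
    }
}
```

In the original mapper this amounts to `new ByteArrayInputStream(value.getBytes(), 0, value.getLength())`, which avoids the extra copy entirely.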

On Tue, Oct 1, 2013 at 12:39 AM, Chandra Mohan, Ananda Vel Murugan
<An...@honeywell.com> wrote:
> Hi,
>
>
>
> I am using Hadoop 1.0.2. I have written a map reduce job. I have a
> requirement to process the whole file without splitting. So I have written a
> new input format to process the file as a whole by overriding the
> isSplittable() method. I have also created a new Record reader
> implementation to read the whole file. I followed the sample in Chapter 7 of
> “Hadoop- The Definitive Guide” book. In my map reduce job, my mapper emits
> BytesWritable as value. I want to get the bytes and read some specific
> information from the bytes. I use ByteArrayInputStream and do further
> processing. But strangely the following code shows different numbers.
> Because of this I am getting errors.
>
>
>
> //value -> BytesWritable
>
> System.out.println(“Bytes length ” + value.getLength()); // Bytes length
> 1931650
>
> byte[] bytes = value.getBytes();
>
> System.out.println("Bytes array length"+bytes.length); //Bytes array length
> 2897340
>
>
>
> My file size is 1931650 bytes. I don’t know why byte array is bigger than
> the original file.
>
>
>
> Any idea what is going wrong. Please help. Thanks in advance.
>
>
>
> Regards,
>
> Anand.C
