Posted to mapreduce-user@hadoop.apache.org by hequn cheng <ch...@gmail.com> on 2014/03/07 06:32:34 UTC
why can FSDataInputStream.read() only read 2^17 bytes in hadoop2.0?
Hi~
First, I use FileSystem to open a file in HDFS:
FSDataInputStream m_dis = fs.open(...);
Second, I read the data from m_dis into a byte array:
byte[] inputdata = new byte[m_dis.available()];
// m_dis.available() == 47185920
m_dis.read(inputdata, 0, 20 * 1024 * 768 * 3);
The value returned by m_dis.read() is 131072 (2^17), so the data after offset
131072 is missing. It looks as if FSDataInputStream used a short to manage its
data, which confuses me a lot. The same code runs fine on hadoop 1.2.1.
Thank you~
Re: why can FSDataInputStream.read() only read 2^17 bytes in hadoop2.0?
Posted by hequn cheng <ch...@gmail.com>.
Yep, that did the job :)
I used readFully() instead and it works well. Thank you~
2014-03-07 13:48 GMT+08:00 Binglin Chang <de...@gmail.com>:
> The semantics of read() do not guarantee that it reads as much as possible.
> You need to call read() in a loop or use readFully().
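The readFully() fix described above can be sketched as follows. Since running against a real HDFS cluster is not possible here, a plain java.io.DataInputStream over an in-memory buffer stands in for the HDFS stream (FSDataInputStream extends DataInputStream, so readFully() has the same contract); the buffer size is an arbitrary stand-in for the 47185920-byte file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ReadFullyDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the HDFS file; FSDataInputStream extends DataInputStream,
        // so readFully() behaves identically on the real stream.
        byte[] source = new byte[46080];
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(source));

        byte[] buf = new byte[source.length];
        // readFully blocks until exactly buf.length bytes have been read,
        // or throws EOFException if the stream ends first.
        in.readFully(buf, 0, buf.length);
        System.out.println("read " + buf.length + " bytes");
    }
}
```

Unlike read(), readFully() never returns a partial count, which is why swapping it in fixed the truncation at 131072 bytes.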
Re: why can FSDataInputStream.read() only read 2^17 bytes in hadoop2.0?
Posted by Binglin Chang <de...@gmail.com>.
The semantics of read() do not guarantee that it reads as much as possible.
You need to call read() in a loop or use readFully().
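The read-in-a-loop alternative mentioned above can be sketched like this. The helper name readAll is made up for illustration, and an in-memory stream stands in for the HDFS one; the loop logic is exactly what InputStream.read(byte[], int, int) requires, since a single call may return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoopDemo {
    // Keep calling read() until len bytes have arrived or the stream ends;
    // returns the number of bytes actually read.
    static int readAll(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                break; // end of stream before len bytes
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[300000]; // stand-in for the HDFS file contents
        InputStream in = new ByteArrayInputStream(source);
        byte[] buf = new byte[source.length];
        System.out.println("read " + readAll(in, buf, 0, buf.length) + " bytes");
    }
}
```

The 131072-byte result in the original post is simply one DFS client buffer's worth of data from a single read() call; looping accumulates the rest.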