Posted to common-dev@hadoop.apache.org by Thibaut_ <tb...@blue.lu> on 2008/08/17 18:20:23 UTC

"Data corruption" when reading file from DFS (version 0.18.0, 0.17.2)

Hi,

I was finally able to reproduce an error that has been driving me crazy for the
last few weeks.

I get reproducible data corruption when reading a file from the DFS.
Reading the file locally works fine, and making sure to read at most 2048
bytes at a time seems to avoid the bug as well. This happens on the latest
release candidate 0.18.0 and also on older versions (I was using the cvs
version of 0.17.2 before).

Here is the code I'm using:

  // initialize ... DistributedFileSystem

  FSDataInputStream in = remoteFileSystem.open(new Path(feoutput));
  byte[] buffer = new byte[4];
  while (in.available() >= 1) {

      in.read(buffer, 0, 4); // id (ignore)
      int id = BinConverter.byteArrayToInt(buffer, 0);

      in.read(buffer, 0, 4); // size
      int size = BinConverter.byteArrayToInt(buffer, 0);

      log.info("ID %d  -> %d", id, size);

      byte[] aaa = new byte[size];

      // The three blocks below are alternative ways of reading the record body;
      // only one of them is active in any given run.

      // THIS WORKS!!!
      while (size > 0) {
          int read = in.read(aaa, aaa.length - size, (int) Math.min(2048, size));
          size -= read;
      }

      // OR THIS (but I want to use the data)
      in.skip((long) size);

      // AND THIS DOESN'T!
      in.read(aaa, 0, size);
  }
  in.close();


Interestingly, code piece 1 works, and so does the skip, but when reading the
data in all at once I get the following data corruption after a while:

..........
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46126  -> 1322
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46127  -> 1547
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46128  -> 1470
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46129  -> 675
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46130  -> 1666
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46131  -> 765
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46132  -> 574
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46133  -> 1761
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46134  -> 937
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46135  -> 942

vs: (here it fails)
......
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 46133  -> 1761
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 46134  -> 937
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 1240951364  ->
-2045632292
java.lang.NegativeArraySizeException
 .................


Any ideas on how this can be fixed? I suspect a buffer is wrongly filled in
some cases when the size of the requested read call exceeds a certain limit.

Thanks for your help,
Thibaut




Re: "Data corruption" when reading file from DFS (version 0.18.0, 0.17.2)

Posted by Thibaut_ <tb...@blue.lu>.
Thank you very much!

This is indeed true. I had been misinterpreting the read call for years, so I
would never have found this on my own. The error never turned up when running
the program locally (nor in other programs).
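
For reference, a minimal sketch of what the corrected read could look like
(assuming the same 4-byte id/size record layout as in the original snippet;
readFully is inherited from DataInputStream and either fills the whole buffer
or throws an EOFException):

  FSDataInputStream in = remoteFileSystem.open(new Path(feoutput));
  byte[] buffer = new byte[4];

  // readFully blocks until the requested number of bytes has been read,
  // unlike read, which may return fewer bytes than asked for.
  in.readFully(buffer, 0, 4);                      // id
  int id = BinConverter.byteArrayToInt(buffer, 0);

  in.readFully(buffer, 0, 4);                      // size
  int size = BinConverter.byteArrayToInt(buffer, 0);

  byte[] payload = new byte[size];
  in.readFully(payload, 0, size);                  // whole record body in one call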

Thanks again!
Thibaut


Ted Dunning wrote:
> 
> You aren't checking to see how much data got read.  There is no guarantee
> that this call will read everything you ask for in just one call.
> 
> My suspicion is that the return value will show that fewer than size bytes
> were read.
> 
> On Sun, Aug 17, 2008 at 9:20 AM, Thibaut_ <tb...@blue.lu> wrote:
> 
>> ...
>>
>>                // AND THIS DOESN'T!
>>                in.read(aaa, 0, size);
> 
> ...
>> Any ideas on how this can be fixed? I suspect a buffer is wrongly filled
>> in
>> some cases when the size of the requested read call exceeds a certain
>> limit.
>>
> 
> 
> -- 
> ted
> 
> 



Re: "Data corruption" when reading file from DFS (version 0.18.0, 0.17.2)

Posted by Ted Dunning <te...@gmail.com>.
You aren't checking to see how much data got read.  There is no guarantee
that this call will read everything you ask for in just one call.

My suspicion is that the return value will show that fewer than size bytes were read.
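
A rough sketch of the kind of loop that handles this (assuming the aaa buffer
and size from your snippet; the return value is the number of bytes actually
read, and -1 signals end of stream):

  int off = 0;
  while (off < size) {
      int n = in.read(aaa, off, size - off);  // may return fewer bytes than requested
      if (n < 0) {
          // java.io.EOFException: the stream ended before the record was complete
          throw new EOFException("stream ended after " + off + " of " + size + " bytes");
      }
      off += n;
  }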

On Sun, Aug 17, 2008 at 9:20 AM, Thibaut_ <tb...@blue.lu> wrote:

> ...
>
>                // AND THIS DOESN'T!
>                in.read(aaa, 0, size);

...
> Any ideas on how this can be fixed? I suspect a buffer is wrongly filled in
> some cases when the size of the requested read call exceeds a certain
> limit.
>


-- 
ted