Posted to user@pig.apache.org by Rob Stewart <ro...@googlemail.com> on 2010/01/29 14:15:57 UTC

Issue with DataGenerator (Checksum error)

Hi, I'm running into a problem when creating a file with DataGenerator and then
uploading it to HDFS.

Here's what I'm running:
-------------------
> hadoop org.apache.pig.test.utils.datagen.DataGenerator -libjars $zipfjar \
    -conf $conf_file -rows 100000 -f allFiles.dat s:10:100000:u:0
> hadoop org.apache.pig.test.utils.datagen.DataGenerator -libjars $zipfjar \
    -conf $conf_file -s , -i allFiles.dat -f theDir1.dat s:10:100000:u:0

# This runs fine and uploads the file "allFiles.dat" to HDFS:
> hadoop dfs -put allFiles.dat Inputs/DirDiff/

# HOWEVER, this command fails:
> hadoop dfs -put theDir1.dat Inputs/DirDiff/

ERROR:
-----------
10/01/29 13:10:09 INFO fs.FSInputChecker: Found checksum error: b[0, 0]=
org.apache.hadoop.fs.ChecksumException: Checksum error: theDir1.dat at 0
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:220)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
        at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
...............
...............
put: Checksum error: theDir1.dat at 0
------------------------
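
The trace goes through ChecksumFileSystem on the local side of the put. That is where Hadoop verifies a local file against a hidden checksum sidecar named ".<filename>.crc" when one exists next to it. Below is a minimal sketch for checking whether such a sidecar sits beside each generated file, assuming both files are in the current directory. The helper names are hypothetical, and whether a stale sidecar is actually the cause here is an assumption, not something confirmed.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the hidden checksum sidecar
# that Hadoop's local ChecksumFileSystem looks for next to a file,
# i.e. <dir>/.<name>.crc
crc_sidecar() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    printf '%s/.%s.crc\n' "$dir" "$base"
}

# Hypothetical helper: report whether that sidecar exists on local disk.
check_crc_sidecar() {
    sidecar=$(crc_sidecar "$1")
    if [ -e "$sidecar" ]; then
        echo "sidecar present: $sidecar"
    else
        echo "no sidecar: $sidecar"
    fi
}

# Compare the file that uploads fine with the one that fails.
check_crc_sidecar allFiles.dat
check_crc_sidecar theDir1.dat
```

If a sidecar shows up only for theDir1.dat, that would at least be consistent with the checksum being verified for that file and not for allFiles.dat.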

Is there something wrong with the way I am producing the "theDir1.dat" file?

Thanks,

Rob Stewart