Posted to hdfs-user@hadoop.apache.org by ch huang <ju...@gmail.com> on 2013/12/11 10:14:31 UTC

issue about file in DN datadir

Hi, mailing list:
I have a question about the files that represent a block on a DataNode (DN).
Here is how I looked for a block: I have a file, part-m-00000, and fsck shows
one replica of its block blk_-5451264646515882190_106793 on box
192.168.10.224. When I search the data directory on 224, I find the meta
file but no data file. Why?

# sudo -u hdfs hdfs fsck /alex/terasort/10G-input/part-m-00000 -files -blocks -locations
Connecting to namenode via http://CHBM220:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.10.224 for path /alex/terasort/10G-input/part-m-00000 at Wed Dec 11 14:45:15 CST 2013
/alex/terasort/10G-input/part-m-00000 500000000 bytes, 8 block(s):  OK
0. BP-50684181-192.168.10.220-1383638483950:blk_1612709339511818235_106786 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.221:50010, 192.168.10.223:50010]
1. BP-50684181-192.168.10.220-1383638483950:blk_-3802055733518151718_106789 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.221:50010, 192.168.10.223:50010]
2. BP-50684181-192.168.10.220-1383638483950:blk_-1672420361561559829_106791 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.224:50010, 192.168.10.223:50010]
3. BP-50684181-192.168.10.220-1383638483950:blk_-5451264646515882190_106793 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.221:50010, 192.168.10.224:50010]
4. BP-50684181-192.168.10.220-1383638483950:blk_6624597853174216221_106795 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.224:50010, 192.168.10.221:50010]
5. BP-50684181-192.168.10.220-1383638483950:blk_-4947775334639504308_106797 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.224:50010, 192.168.10.223:50010]
6. BP-50684181-192.168.10.220-1383638483950:blk_214751650269427943_106799 len=67108864 repl=3 [192.168.10.222:50010, 192.168.10.221:50010, 192.168.10.224:50010]

# find /data -name 'blk_-5451264646515882190_106793*'
/data/dataspace/3/current/BP-50684181-192.168.10.220-1383638483950/current/finalized/subdir39/blk_-5451264646515882190_106793.meta
# ls /data/dataspace/3/current/BP-50684181-192.168.10.220-1383638483950/current/finalized/subdir39/
blk_3810334848964580951
blk_3810334848964580951_106801.meta
blk_4621466474283145207
blk_4621466474283145207_106809.meta
blk_516060569193828059
blk_516060569193828059_106796.meta
blk_-5451264646515882190
blk_-5451264646515882190_106793.meta
blk_580162309124277323
blk_580162309124277323_106788.meta

Re: issue about file in DN datadir

Posted by Harsh J <ha...@cloudera.com>.
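As posted on the other thread, your find expression includes the
genstamp (generation stamp), which appears only in the meta file's name.
The data file is named with the block ID alone, so search without the
genstamp, for example: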

find /data -name 'blk_-5451264646515882190*'
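
The same idea can be wrapped in a small script. This is only a sketch:
block_base and find_block_files are hypothetical helper names, not Hadoop
commands, and it assumes the on-disk naming seen in this thread (data file
blk_<id>, meta file blk_<id>_<genstamp>.meta).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Hadoop. They assume the layout seen in
# this thread: data file "blk_<id>", meta file "blk_<id>_<genstamp>.meta".

# Strip a trailing _<genstamp> from a block name as printed by fsck.
# A name without a genstamp is returned unchanged.
block_base() {
  case "$1" in
    blk_*_*) printf '%s\n' "${1%_*}" ;;
    *)       printf '%s\n' "$1" ;;
  esac
}

# $1: DataNode data directory root
# $2: block name, with or without the genstamp
# Prints the data file and any matching meta file.
find_block_files() {
  base=$(block_base "$2")
  find "$1" \( -name "$base" -o -name "${base}_*.meta" \)
}

# Example, using the path and block name from this thread:
#   find_block_files /data blk_-5451264646515882190_106793
```

Matching the data file by its exact name, rather than with a trailing
wildcard, avoids picking up a different block whose ID merely starts with
the same digits.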

-- 
Harsh J
