Posted to common-user@hadoop.apache.org by ch huang <ju...@gmail.com> on 2013/07/30 08:54:51 UTC
hadoop missing file?
One of my workmates told me that some of his files are missing. I ran hdfs fsck
and got the following output. How can I prevent files from going missing? Can anyone help?
Status: HEALTHY
Total size: 272020850157 B (Total open files size: 652056 B)
Total dirs: 1143
Total files: 1886 (Files currently being written: 2)
Total blocks (validated): 5651 (avg. block size 48136763 B) (Total
open file blocks (not validated): 1)
Minimally replicated blocks: 5651 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 129 (2.2827818 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 903 (5.0571237 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Tue Jul 30 14:38:01 CST 2013 in 462 milliseconds
Re: hadoop missing file?
Posted by Bertrand Dechoux <de...@gmail.com>.
(10 - 3) * 129 = 903
That's the short answer. The long answer starts with two questions:
1) Which file is missing?
2) How do you know it is missing?
You have a cluster with 3 datanodes, and the default replication factor is 3,
but not for the job jar, whose factor is 10 (mapred.submit.replication).
Say you ran 129 jobs that failed in some odd way (for example at submission):
you would see 129 under-replicated blocks (one block per jar, because the
jar is small) and 903 missing replicas, because with 3 datanodes you can't
have more than 3 replicas anyway.
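The arithmetic above can be sketched in a few lines of shell. The values 129 and 10 come from this fsck run and the stock mapred.submit.replication default; adjust them for your own cluster:

```shell
# Each job jar asks for 10 replicas (mapred.submit.replication default),
# but only 3 datanodes exist, so 10 - 3 = 7 replicas per jar block can
# never be placed and are counted as "missing".
target=10      # requested replication for the job jar
datanodes=3    # maximum achievable replicas on this cluster
jars=129       # under-replicated blocks reported by fsck
missing=$(( (target - datanodes) * jars ))
echo "$missing"   # prints 903
```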
So back to the first question: which file is missing?
It might simply be that the file was never uploaded in the first place. It
happens.
Note that every one of your blocks has at least one replica: Minimally
replicated blocks: 5651 (100.0 %)
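To track down which files are actually affected, fsck can print per-file block status; under-replicated files are flagged in that output. If an old job jar turns out to be the culprit, its replication can be lowered to match the cluster size. This is only a sketch against a live cluster, and /user/someone/job.jar is a hypothetical path:

```shell
# List every file with its blocks and replication status; look for
# entries marked "Under replicated" in the per-file output.
hadoop fsck / -files -blocks

# Optionally reduce the jar's replication to the cluster size
# (-w waits until the new factor is reached).
hadoop fs -setrep -w 3 /user/someone/job.jar
```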
Bertrand
On Tue, Jul 30, 2013 at 8:54 AM, ch huang <ju...@gmail.com> wrote:
> One of my workmates told me that some of his files are missing. I ran hdfs fsck
> and got the following output. How can I prevent files from going missing? Can anyone help?
>
> Status: HEALTHY
> Total size: 272020850157 B (Total open files size: 652056 B)
> Total dirs: 1143
> Total files: 1886 (Files currently being written: 2)
> Total blocks (validated): 5651 (avg. block size 48136763 B) (Total
> open file blocks (not validated): 1)
> Minimally replicated blocks: 5651 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 129 (2.2827818 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 3
> Average block replication: 3.0
> Corrupt blocks: 0
> Missing replicas: 903 (5.0571237 %)
> Number of data-nodes: 3
> Number of racks: 1
> FSCK ended at Tue Jul 30 14:38:01 CST 2013 in 462 milliseconds
>
--
Bertrand Dechoux