Posted to hdfs-dev@hadoop.apache.org by Thanh Do <th...@cs.wisc.edu> on 2010/10/14 02:01:27 UTC

DataBlockScanner scan period

Hi again,

Could anybody explain the scanning period
policy of DataBlockScanner? That is, how often does it wake up
and scan a block file?
When looking at the code, I found

 static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks


but definitely it does not wake up and pick a random block
to verify every three weeks, right?

Thanks a lot,
Thanh

Re: DataBlockScanner scan period

Posted by Raghu Angadi <ra...@apache.org>.
On Wed, Oct 13, 2010 at 5:01 PM, Thanh Do <th...@cs.wisc.edu> wrote:

> Hi again,
>
> Could anybody explain the scanning period
> policy of DataBlockScanner? That is, how often does it wake up
> and scan a block file?
> When looking at the code, I found
>
>  static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks
>
> but definitely it does not wake up and pick a random block
> to verify every three weeks, right?
>

Of course not.

The scanner is always alive. It paces itself so that it scans all the blocks
in 3 weeks. So if a datanode has just 210 blocks, it would be scanning about
10 a day.

Raghu.
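
The pacing arithmetic in the reply above can be checked with a small sketch. This is only an illustration of the "210 blocks, about 10 a day" figure, not the actual DataBlockScanner code; the class name ScanPacingSketch and both helper methods are hypothetical, and only DEFAULT_SCAN_PERIOD_HOURS comes from the thread.

```java
import java.util.concurrent.TimeUnit;

public class ScanPacingSketch {
    // Same value as the constant quoted in the thread: 21 days expressed in hours.
    static final long DEFAULT_SCAN_PERIOD_HOURS = 21 * 24L;

    /** Average number of blocks to verify per day so that every block is covered once per period. */
    static double blocksPerDay(long totalBlocks, long periodHours) {
        return totalBlocks * 24.0 / periodHours;
    }

    /** Average gap between two consecutive block verifications, in milliseconds. */
    static long averageGapMillis(long totalBlocks, long periodHours) {
        return TimeUnit.HOURS.toMillis(periodHours) / totalBlocks;
    }

    public static void main(String[] args) {
        // The datanode from the example above: 210 blocks, three-week period.
        System.out.println("blocks per day: " + blocksPerDay(210, DEFAULT_SCAN_PERIOD_HOURS));
        System.out.println("average gap (ms): " + averageGapMillis(210, DEFAULT_SCAN_PERIOD_HOURS));
    }
}
```

With 210 blocks the average gap works out to 2.4 hours between verifications, which is the same statement as "about 10 a day".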


> Thanks a lot,
> Thanh
>
>

Re: DataBlockScanner scan period

Posted by Thanh Do <th...@cs.wisc.edu>.
Oh, now I see the problem.
The implication here is that some blocks might not be
scanned for a very long time, because the scanner
may not finish scanning all the blocks within 3 weeks,
and after that it starts over again...

Interesting. Thanks for the prompt reply, Brian.

Thanh



On Wed, Oct 13, 2010 at 7:37 PM, Brian Bockelman <bb...@cse.unl.edu> wrote:

>
> On Oct 13, 2010, at 7:29 PM, Thanh Do wrote:
>
> > Hi Brian,
> >
> > If this is the case, is there any chance that
> > somehow the DataBlockScanner cannot finish
> > the verification of all the blocks in three weeks
> > (e.g., a node has a very large number of blocks)?
> >
>
> Yes.  At some point, I'd really like to figure out what percentage of our
> blocks actually get scanned at our site; I suspect some go very long without
> a scan.
>
> Brian
>

Re: DataBlockScanner scan period

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On Nov 23, 2010, at 7:41 PM, Thanh Do wrote:

> Sorry for digging up this old thread.
> 
> Brian, is this the reason you want to add a "data-level" scan
> to HDFS, as in HDFS-221?
> 
> It seems to me that a very rarely read block could
> be silently corrupted, because the DataBlockScanner
> never finishes its scanning job in 3 weeks...
> 
> 

Why?  What if you restarted your datanode once every 2 weeks?  Last I checked, HDFS randomly assigned blocks to be verified throughout a time interval.  If you have too many blocks and too short a time interval, then because HDFS also rate-limits the scanner, you can easily come up with a case where blocks won't get verified.

The reason one wants a data-level scan is if the admin wants to manually verify that all copies of a file are good (well, "good" compared to the checksum... maybe the user corrupted it before uploading it :) ).  It'd be a great debugging tool to put site admins' minds at ease.

Brian
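
The "too many blocks, too short an interval, plus rate limiting" case can be made concrete with a back-of-the-envelope check. This is a sketch under stated assumptions, not HDFS code: the class ScanFeasibilitySketch, its methods, and the 8 MiB/s cap and data sizes used in main are all made-up illustration values, not figures taken from the thread.

```java
public class ScanFeasibilitySketch {
    /** Minimum hours needed to read totalBytes once at the given rate cap. */
    static double minHoursForFullScan(long totalBytes, long maxBytesPerSecond) {
        return totalBytes / (double) maxBytesPerSecond / 3600.0;
    }

    /** True if one full pass over the data fits inside the scan period at the capped rate. */
    static boolean canFinishWithinPeriod(long totalBytes, long maxBytesPerSecond, long periodHours) {
        return minHoursForFullScan(totalBytes, maxBytesPerSecond) <= periodHours;
    }

    public static void main(String[] args) {
        long periodHours = 21 * 24L;   // three weeks, as in the thread
        long rateCap = 8L << 20;       // hypothetical cap: 8 MiB/s
        long oneTiB = 1L << 40;        // hypothetical small datanode
        long twentyTiB = 20L << 40;    // hypothetical large datanode

        System.out.println("1 TiB fits:  " + canFinishWithinPeriod(oneTiB, rateCap, periodHours));
        System.out.println("20 TiB fits: " + canFinishWithinPeriod(twentyTiB, rateCap, periodHours));
    }
}
```

Under these assumed numbers the larger node cannot complete a pass within the period, so some blocks would go unverified no matter how the scans are ordered.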

Re: DataBlockScanner scan period

Posted by Thanh Do <th...@cs.wisc.edu>.
Sorry for digging up this old thread.

Brian, is this the reason you want to add a "data-level" scan
to HDFS, as in HDFS-221?

It seems to me that a very rarely read block could
be silently corrupted, because the DataBlockScanner
never finishes its scanning job in 3 weeks...



Re: DataBlockScanner scan period

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On Oct 13, 2010, at 7:29 PM, Thanh Do wrote:

> Hi Brian,
> 
> If this is the case, is there any chance that
> somehow the DataBlockScanner cannot finish
> the verification of all the blocks in three weeks
> (e.g., a node has a very large number of blocks)?
> 

Yes.  At some point, I'd really like to figure out what percentage of our blocks actually get scanned at our site; I suspect some go very long without a scan.

Brian

> Thanh
> 


Re: DataBlockScanner scan period

Posted by Thanh Do <th...@cs.wisc.edu>.
Hi Brian,

If this is the case, is there any chance that
somehow the DataBlockScanner cannot finish
the verification of all the blocks in three weeks
(e.g., a node has a very large number of blocks)?

Thanh

On Wed, Oct 13, 2010 at 7:18 PM, Brian Bockelman <bb...@cse.unl.edu> wrote:

> Hi Thanh,
>
> That is correct.  Last time I read the code, Hadoop scheduled the block
> verifications randomly throughout the period in order to avoid periodic
> effects (i.e., high load every N minutes).
>
> Brian
>

Re: DataBlockScanner scan period

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hi Thanh,

That is correct.  Last time I read the code, Hadoop scheduled the block verifications randomly throughout the period in order to avoid periodic effects (i.e., high load every N minutes).

Brian
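
The idea of spreading verifications randomly across the period can be sketched as follows. This is a minimal illustration of the scheduling idea as described above, not the DataBlockScanner implementation; the class RandomScanScheduleSketch and its schedule method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomScanScheduleSketch {
    /**
     * Gives each block a pseudo-random scan offset (milliseconds from the start
     * of the period) and returns the offsets sorted in scan order.
     */
    static List<Long> schedule(int blockCount, long periodMillis, Random rng) {
        List<Long> offsets = new ArrayList<>(blockCount);
        for (int i = 0; i < blockCount; i++) {
            // floorMod keeps the value in [0, periodMillis); the slight modulo
            // bias does not matter for an illustration.
            offsets.add(Math.floorMod(rng.nextLong(), periodMillis));
        }
        Collections.sort(offsets);
        return offsets;
    }

    public static void main(String[] args) {
        long periodMillis = 21L * 24 * 60 * 60 * 1000; // three weeks
        List<Long> offsets = schedule(100, periodMillis, new Random());
        System.out.println("first scan after ms: " + offsets.get(0));
        System.out.println("last scan after ms:  " + offsets.get(offsets.size() - 1));
    }
}
```

Because the offsets are drawn independently, the load is spread over the whole period instead of arriving in bursts at fixed intervals.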

On Oct 13, 2010, at 7:14 PM, Thanh Do wrote:

> Brian,
> 
> When you say it *attempts* to complete an *entire* node scan,
> do you mean that, for example, if a node has 100 block files, it will
> try to verify all 100 blocks every 3 weeks?
> That is, on average, one block is scanned every (3 weeks / 100)?
> 
> Thanks
> Thanh
> 
> 


Re: DataBlockScanner scan period

Posted by Thanh Do <th...@cs.wisc.edu>.
Brian,

When you say it *attempts* to complete an *entire* node scan,
do you mean that, for example, if a node has 100 block files, it will
try to verify all 100 blocks every 3 weeks?
That is, on average, one block is scanned every (3 weeks / 100)?

Thanks
Thanh


On Wed, Oct 13, 2010 at 7:07 PM, Brian Bockelman <bb...@cse.unl.edu> wrote:

> Hi Thanh,
>
> The scan period is the period within which Hadoop *attempts* to complete an
> entire node scan.  That is, if it's set to 3 weeks, HDFS will try to scan each
> block once every 3 weeks.
>
> Obviously, depending on the bandwidth you have made available to the
> scanning thread, you can specify impossibly small periods.
>
> Brian
>

Re: DataBlockScanner scan period

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hi Thanh,

The scan period is the period within which Hadoop *attempts* to complete an entire node scan.  That is, if it's set to 3 weeks, HDFS will try to scan each block once every 3 weeks.

Obviously, depending on the bandwidth you have made available to the scanning thread, you can specify impossibly small periods.

Brian

On Oct 13, 2010, at 7:01 PM, Thanh Do wrote:

> Hi again,
> 
> Could anybody explain the scanning period
> policy of DataBlockScanner? That is, how often does it wake up
> and scan a block file?
> When looking at the code, I found
> 
> static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks
> 
> 
> but definitely it does not wake up and pick a random block
> to verify every three weeks, right?
> 
> Thanks a lot,
> Thanh