Posted to common-user@hadoop.apache.org by Torsten Curdt <tc...@apache.org> on 2007/12/12 11:41:59 UTC
finalize upgrade
Hey guys,
Triggered by a post on the mailing list, I also checked our 0.14
cluster, and although we really thought we did the finalize after the
upgrade, we also have a big "previous" dir there. A couple of things I
am wondering here...
1) I thought that the data is actually not duplicated ...so why is it
so big?
2) Is there a way of finding out whether finalize still needs to be run?
cheers
--
Torsten
Re: finalize upgrade
Posted by Torsten Curdt <tc...@apache.org>.
On 14.12.2007, at 23:35, Konstantin Shvachko wrote:
>> Well, from the output it looks like that has been run. At least I
>> cannot see any sign telling me I still need to run it ...yet the
>> previous directory was still there on the name node.
>>> The way it works in pre 0.16 is that you start the cluster, and
>>> issue:
>>> hadoop dfsadmin -finalizeUpgrade
>> I've just run that again. Now the 'previous' dir on the namenode
>> is gone. But on the data nodes the 'previous' is still there.
>
> That means finalizeUpgrade has not been run before or failed for
> some reason.
>
>>> Now, if you still want to do it manually, then yes just remove
>>> "previous"
>>> dir on the name-node and then start the cluster.
>>> Data-nodes will finalize automatically.
>> Hmmm ...I cannot see that happening. Now what?
>
> Finalizing on the data-nodes is very lazy.
> During registrations and block reports the name-node will inform
> data-nodes
> whether it has the previous state or not. If not the data-nodes
> will remove
> their previous states.
> On the running cluster a complete previous state removal can take
> up to one hour.
>
> If you need to accelerate it - restart the cluster.
> But it should be done by now. It's been 3 hours since you wrote
> this.
Yepp ...you were right :) 'previous' is gone now.
cheers
--
Torsten
Re: finalize upgrade
Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
> Well, from the output it looks like that has been run. At least I
> cannot see any sign telling me I still need to run it ...yet the
> previous directory was still there on the name node.
>
>> The way it works in pre 0.16 is that you start the cluster, and issue:
>> hadoop dfsadmin -finalizeUpgrade
>
> I've just run that again. Now the 'previous' dir on the namenode is
> gone. But on the data nodes the 'previous' is still there.
That means finalizeUpgrade has not been run before or failed for some reason.
>> Now, if you still want to do it manually, then yes just remove
>> "previous"
>> dir on the name-node and then start the cluster.
>> Data-nodes will finalize automatically.
>
> Hmmm ...I cannot see that happening. Now what?
Finalizing on the data-nodes is very lazy.
During registrations and block reports, the name-node informs data-nodes
whether it still has the previous state. If it does not, the data-nodes
remove their own previous states.
On a running cluster, complete removal of the previous state can take up to one hour.
If you need to accelerate it, restart the cluster.
But it should be done by now. It's been 3 hours since you wrote this.
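To make the lazy behaviour described above concrete, here is a toy Python model of it. This is only a sketch and not Hadoop's code: the class and method names (NameNode, DataNode, handle_block_report, send_block_report) are made up for illustration, and the real exchange carries far more than a single flag.

```python
class NameNode:
    """Toy name-node: only tracks whether it still keeps a previous state."""

    def __init__(self):
        self.has_previous = True

    def finalize_upgrade(self):
        # Stands in for the effect of 'hadoop dfsadmin -finalizeUpgrade'
        # on the name-node: its own previous state is dropped.
        self.has_previous = False

    def handle_block_report(self):
        # The reply tells the data-node whether a previous state still exists.
        return self.has_previous


class DataNode:
    """Toy data-node: keeps its previous state until told it is obsolete."""

    def __init__(self, name):
        self.name = name
        self.has_previous = True

    def send_block_report(self, namenode):
        # Nothing is cleaned up until the next report is answered,
        # which is why finalization looks delayed on a running cluster.
        if not namenode.handle_block_report():
            self.has_previous = False
```

In this model, calling finalize_upgrade() changes nothing on a data-node until its next send_block_report(), which matches the observation that 'previous' lingers on the data nodes for a while after the name-node's copy is gone.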
Re: finalize upgrade
Posted by Torsten Curdt <tc...@apache.org>.
On 14.12.2007, at 19:41, Konstantin Shvachko wrote:
> Sorry, it looks like the UI and report feature will appear only in
> 0.16.
> It is related to HADOOP-1604.
> In general you are not supposed to remove any directories manually.
That's why I am so careful :)
> You should just use finalizeUpgrade.
Well, from the output it looks like that has been run. At least I
cannot see any sign telling me I still need to run it ...yet the
previous directory was still there on the name node.
> The way it works in pre 0.16 is that you start the cluster, and issue:
> hadoop dfsadmin -finalizeUpgrade
I've just run that again. Now the 'previous' dir on the namenode is
gone. But on the data nodes the 'previous' is still there.
> Now, if you still want to do it manually, then yes just remove
> "previous"
> dir on the name-node and then start the cluster.
> Data-nodes will finalize automatically.
Hmmm ...I cannot see that happening. Now what?
cheers
--
Torsten
Re: finalize upgrade
Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Sorry, it looks like the UI and report feature will appear only in 0.16.
It is related to HADOOP-1604.
In general you are not supposed to remove any directories manually.
You should just use finalizeUpgrade.
The way it works in pre-0.16 releases is that you start the cluster, and issue:
hadoop dfsadmin -finalizeUpgrade
Now, if you still want to do it manually, then yes just remove "previous"
dir on the name-node and then start the cluster.
Data-nodes will finalize automatically.
Is that what you were asking?
--Konstantin
Torsten Curdt wrote:
> Can anyone confirm?
>
> On 13.12.2007, at 09:46, Torsten Curdt wrote:
>
>> No sign of 'upgrade still needs to be finalized' or something ...so I
>> assume removing the 'previous' dir is safe then?
>>
>> On 12.12.2007, at 21:18, Konstantin Shvachko wrote:
>>
>>>> 2) Is there a way of finding out whether finalize still needs to be
>>>> run?
>>>
>>>
>>> Yes, you can see it on the name-node web UI, and by running
>>> hadoop dfsadmin -report
>>
>>
>
>
Re: finalize upgrade
Posted by Torsten Curdt <tc...@apache.org>.
Can anyone confirm?
On 13.12.2007, at 09:46, Torsten Curdt wrote:
> No sign of 'upgrade still needs to be finalized' or something ...so
> I assume removing the 'previous' dir is safe then?
>
> On 12.12.2007, at 21:18, Konstantin Shvachko wrote:
>
>>> 2) Is there a way of finding out whether finalize still needs to
>>> be run?
>>
>> Yes, you can see it on the name-node web UI, and by running
>> hadoop dfsadmin -report
>
Re: finalize upgrade
Posted by Torsten Curdt <tc...@apache.org>.
No sign of 'upgrade still needs to be finalized' or something ...so I
assume removing the 'previous' dir is safe then?
On 12.12.2007, at 21:18, Konstantin Shvachko wrote:
>> 2) Is there a way of finding out whether finalize still needs to
>> be run?
>
> Yes, you can see it on the name-node web UI, and by running
> hadoop dfsadmin -report
Re: finalize upgrade
Posted by Eric Guillemot <eg...@uvic.ca>.
unsubscribe
----- Original Message -----
From: "Konstantin Shvachko" <sh...@yahoo-inc.com>
To: <ha...@lucene.apache.org>
Sent: Wednesday, December 12, 2007 12:18 PM
Subject: Re: finalize upgrade
>> 2) Is there a way of finding out whether finalize still needs to be run?
>
> Yes, you can see it on the name-node web UI, and by running
> hadoop dfsadmin -report
>
Re: finalize upgrade
Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
> 2) Is there a way of finding out whether finalize still needs to be run?
Yes, you can see it on the name-node web UI, and by running
hadoop dfsadmin -report
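Elsewhere in this thread the report turns out not to show this before 0.16, and the posters fall back on looking for a leftover "previous" directory. A minimal Python sketch of that check follows. It assumes "previous" sits directly under each storage directory; the function name and any paths passed to it are hypothetical. It only looks, it never deletes anything.

```python
import os


def dirs_with_previous(storage_dirs):
    """Return the storage directories that still contain a 'previous' subdirectory.

    storage_dirs is whatever list of name-node or data-node storage
    directories you configured; the caller supplies it.
    """
    return [
        d for d in storage_dirs
        if os.path.isdir(os.path.join(d, "previous"))
    ]
```

An empty result on a node suggests nothing is left to finalize there; a non-empty one only says the directory exists, not why.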
Re: finalize upgrade
Posted by Colin Evans <co...@metaweb.com>.
I just found this problem today after upgrading to 0.15. I just deleted
the "previous" directory on all of our machines with no bad consequences
so far.
Torsten Curdt wrote:
> Hey guys,
>
> triggered by a post on the mailing list I also checked our 0.14
> cluster and although we really thought we did the finalize after the
> upgrade we also have a big "previous" dir there. A couple of things I
> am wondering here...
>
> 1) I thought that the data is actually not duplicated ...so why is it
> so big?
> 2) Is there a way of finding out whether finalize still needs to be run?
>
> cheers
> --
> Torsten
Re: finalize upgrade
Posted by Doug Cutting <cu...@apache.org>.
Joydeep Sen Sarma wrote:
> It consumes real space, though. Our drive hosting control/tmp data was full, and we got the space back once finalizeUpgrade finished.
Is that perhaps because it still holds data that's since been deleted?
Doug
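The effect Doug is guessing at can be reproduced with plain hard links. The Python sketch below is an illustration only: the "current" and "previous" directory names and the file name blk_1 are stand-ins, and it says nothing about how Hadoop itself lays out or links its files. It needs a filesystem that supports hard links.

```python
import os
import tempfile


def demo_deleted_block():
    """Hard-link a file into 'previous', delete the original, and report
    (link count before delete, link count after delete, bytes still held)."""
    with tempfile.TemporaryDirectory() as root:
        current = os.path.join(root, "current")
        previous = os.path.join(root, "previous")
        os.mkdir(current)
        os.mkdir(previous)

        live = os.path.join(current, "blk_1")
        snapshot = os.path.join(previous, "blk_1")
        with open(live, "wb") as f:
            f.write(b"x" * 4096)

        # Second name for the same data; no bytes are copied.
        os.link(live, snapshot)
        links_before = os.stat(snapshot).st_nlink

        # Deleting the live file removes one name only. The data stays on
        # disk for as long as the link under 'previous' exists.
        os.remove(live)
        links_after = os.stat(snapshot).st_nlink
        bytes_held = os.path.getsize(snapshot)
        return links_before, links_after, bytes_held
```

Once the original is deleted, the copy under "previous" is the only thing keeping those bytes allocated, so removing "previous" would free real space.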
RE: finalize upgrade
Posted by Joydeep Sen Sarma <js...@facebook.com>.
It consumes real space, though. Our drive hosting control/tmp data was full, and we got the space back once finalizeUpgrade finished.
-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Wed 12/12/2007 11:14 AM
To: hadoop-user@lucene.apache.org
Subject: Re: finalize upgrade
Torsten Curdt wrote:
> 1) I thought that the data is actually not duplicated ...so why is it so
> big?
I think it is a directory of hard links.
Doug
Re: finalize upgrade
Posted by Doug Cutting <cu...@apache.org>.
Torsten Curdt wrote:
> 1) I thought that the data is actually not duplicated ...so why is it so
> big?
I think it is a directory of hard links.
Doug
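If it is a directory of hard links, a tool that adds up file sizes per directory entry will count the same data twice. A small Python sketch of the difference follows, assuming a filesystem with hard-link support; the directory and file names are illustrative only and the helper names are invented for this example.

```python
import os
import tempfile


def apparent_and_real_usage(root):
    """Return (apparent_bytes, real_bytes) for regular files under root.

    apparent_bytes adds up every directory entry; real_bytes counts each
    inode once, so hard-linked files are not double counted.
    """
    apparent = 0
    real = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.stat(os.path.join(dirpath, name))
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                real += st.st_size
    return apparent, real


def demo():
    """Build a 'current' dir with one file and a 'previous' dir hard-linking it."""
    with tempfile.TemporaryDirectory() as root:
        current = os.path.join(root, "current")
        previous = os.path.join(root, "previous")
        os.mkdir(current)
        os.mkdir(previous)
        block = os.path.join(current, "blk_1")
        with open(block, "wb") as f:
            f.write(b"x" * 4096)
        os.link(block, os.path.join(previous, "blk_1"))
        return apparent_and_real_usage(root)
```

In the demo the apparent total is twice the real one, which is one way a "previous" directory can look big without the data being duplicated.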