Posted to common-user@hadoop.apache.org by Keith Wiley <kw...@keithwiley.com> on 2012/08/27 20:45:07 UTC

Adding additional storage

I'm running a pseudo-distributed cluster on a single machine and I would like to use a larger disk (mounted and ready to go, of course).  I don't mind transferring everything to the new disk (as opposed to using both disks for HDFS, which seems much hairier), but I'm not sure how to move a Hadoop cluster to a new disk...or if it's even possible.  Even if I simply copy the entire directory where HDFS lives, wouldn't I still need to somehow tell the namenode about the new disk?

Is there any way to do this that beats manually "getting" the data out, throwing away the old cluster, making a new cluster from scratch, and re-uploading the data to HDFS?  Or is that really the only feasible way to migrate a pseudo-distributed cluster to a second, larger disk?

Thanks.

________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley
________________________________________________________________________________


Re: Adding additional storage

Posted by Keith Wiley <kw...@keithwiley.com>.
That appears to have worked.  Thanks.

On Aug 27, 2012, at 11:52 , Harsh J wrote:

> Hey Keith,
> 
> Pseudo-distributed isn't any different from fully-distributed,
> operationally, except for nodes = 1 - so don't let it limit your
> thoughts :)
> 
> Stop the HDFS cluster, mv your existing dfs.name.dir and dfs.data.dir
> dir contents onto the new storage mount. Reconfigure dfs.data.dir and
> dfs.name.dir to point to these new locations and start it back up. All
> should be well.
> 
> On Tue, Aug 28, 2012 at 12:15 AM, Keith Wiley <kw...@keithwiley.com> wrote:
>> I'm running a pseudo-distributed cluster on a single machine and I would like to use a larger disk (mounted and ready to go of course).  I don't mind transferring to the new disk (as opposed to using both disks for the hdfs which seems much hairier), but I'm not sure how to transfer a hadoop cluster to a new disk...or if its even possible.  Even if I simply copy the entire directory where the hdfs is emulated, I still need to somehow switch the namenode to know about the new disk?
>> 
>> Is there any way to do this that beats manually "getting" the data, throwing away the old cluster, making a new cluster from scratch, and reuploading the data to hdfs?...or is that really the only feasible way to migrate a pseudo-distributed cluster to a second larger storage?
>> 
>> Thanks.
>> 
>> ________________________________________________________________________________
>> Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com
>> 
>> "You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
>> itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
>> scratch. All together this implies: He scratched the itch from the scratch that
>> itched but would never itch the scratch from the itch that scratched."
>>                                           --  Keith Wiley
>> ________________________________________________________________________________
>> 
> 
> 
> 
> -- 
> Harsh J


________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
________________________________________________________________________________


Re: Adding additional storage

Posted by Harsh J <ha...@cloudera.com>.
Hey Keith,

Pseudo-distributed isn't any different from fully-distributed,
operationally, except for nodes = 1 - so don't let it limit your
thoughts :)

Stop the HDFS cluster, then mv your existing dfs.name.dir and dfs.data.dir
contents onto the new storage mount. Reconfigure dfs.name.dir and
dfs.data.dir to point to these new locations and start it back up. All
should be well.
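
Roughly, something like the sketch below. The /data/hadoop and /mnt/bigdisk
paths are just placeholders for wherever your current dirs live and wherever
the new disk is mounted (adjust for your layout), and the two properties go
in conf/hdfs-site.xml on a 1.x-style setup:

    # shut HDFS down first
    $ bin/stop-dfs.sh

    # move the existing namenode and datanode dirs onto the new mount
    $ mv /data/hadoop/dfs/name /mnt/bigdisk/dfs/name
    $ mv /data/hadoop/dfs/data /mnt/bigdisk/dfs/data

    # then point conf/hdfs-site.xml at the new locations:
    <property>
      <name>dfs.name.dir</name>
      <value>/mnt/bigdisk/dfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/mnt/bigdisk/dfs/data</value>
    </property>

    # and bring it back up
    $ bin/start-dfs.sh

Keep the ownership and permissions on the moved dirs the same as before so
the namenode and datanode daemons can still write to them.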

On Tue, Aug 28, 2012 at 12:15 AM, Keith Wiley <kw...@keithwiley.com> wrote:
> I'm running a pseudo-distributed cluster on a single machine and I would like to use a larger disk (mounted and ready to go of course).  I don't mind transferring to the new disk (as opposed to using both disks for the hdfs which seems much hairier), but I'm not sure how to transfer a hadoop cluster to a new disk...or if its even possible.  Even if I simply copy the entire directory where the hdfs is emulated, I still need to somehow switch the namenode to know about the new disk?
>
> Is there any way to do this that beats manually "getting" the data, throwing away the old cluster, making a new cluster from scratch, and reuploading the data to hdfs?...or is that really the only feasible way to migrate a pseudo-distributed cluster to a second larger storage?
>
> Thanks.
>
> ________________________________________________________________________________
> Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com
>
> "You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
> itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
> scratch. All together this implies: He scratched the itch from the scratch that
> itched but would never itch the scratch from the itch that scratched."
>                                            --  Keith Wiley
> ________________________________________________________________________________
>



-- 
Harsh J
