Posted to common-user@hadoop.apache.org by "shanmuganathan.r" <sh...@zohocorp.com> on 2011/10/06 07:57:58 UTC

Secondary namenode fsimage concept

Hi All,

I have a question about the Hadoop secondary namenode concept. Please correct me if the following statements are wrong.


The namenode hosts the fsimage and edit log files. The secondary namenode hosts the fsimage file only. At checkpoint time, the edit log file is transferred to the secondary namenode and the two files are merged. Then the updated fsimage file is transferred back to the namenode. Is that correct?


If we run the secondary namenode on a separate machine, then both machines contain the fsimage file, and only the namenode contains the edit log file. Is that true?



Thanks R.Shanmuganathan  
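
For readers following along, the checkpoint cycle asked about here can be sketched as a toy model in Python. This is only an illustration of the description Kai Voigt gives in his reply quoted further down the thread (the secondary namenode pulls both files, the namenode rolls over to a new edits file, and the merged image is copied back). `ToyNameNode`, `apply_edits` and `checkpoint` are made-up names, and real HDFS works with on-disk files and network transfers, not in-memory dicts.

```python
def apply_edits(fsimage, edits):
    """Return a new snapshot: the old fsimage with each logged edit replayed in order."""
    merged = dict(fsimage)
    for op, path, blocks in edits:
        if op == "add":
            merged[path] = blocks
        elif op == "delete":
            merged.pop(path, None)
    return merged


class ToyNameNode:
    """Hypothetical stand-in for the namenode's on-disk metadata."""

    def __init__(self):
        self.fsimage = {}  # last checkpointed snapshot: filename -> block list
        self.edits = []    # changes appended since that snapshot

    def log(self, op, path, blocks=None):
        # Changes are appended to the log, never applied to fsimage directly.
        self.edits.append((op, path, blocks))

    def roll_edits(self):
        """Hand over the current edits and start writing to a fresh log."""
        rolled, self.edits = self.edits, []
        return rolled


def checkpoint(namenode):
    """What the secondary does: pull both files, merge, copy the result back."""
    image_copy = dict(namenode.fsimage)   # pull fsimage
    rolled = namenode.roll_edits()        # pull edits; namenode starts a new log
    merged = apply_edits(image_copy, rolled)
    namenode.fsimage = merged             # updated fsimage goes back to the namenode
    return merged
```

In this model the namenode always holds both an fsimage and an edits log, whether or not the secondary runs on another machine; the secondary only works on copies.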







Re: Secondary namenode fsimage concept

Posted by Shouguo Li <th...@gmail.com>.
very helpful tips, thx guys!

On Tue, Oct 11, 2011 at 2:52 PM, patrick sang <si...@gmail.com> wrote:


Re: Secondary namenode fsimage concept

Posted by patrick sang <si...@gmail.com>.
Alright, this is from my notes while I was playing with this.

NFS4
====
This is pretty straightforward.

@NFS server
/data     *(rw,fsid=0)

!!! don't forget fsid=0

@Client

mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs/
service rpcidmapd start
mount -t nfs4 nfs_host:/ /hadoop/backup

!!!! NOT
mount -t nfs4 nfs_host:/data /hadoop/backup


NFS3
====
$ ps -ef |grep rpc
rpc       1509     1  0 17:28 ?        00:00:00 portmap
root      1592     1  0 17:28 ?        00:00:00 rpc.idmapd
rpcuser   1704     1  0 17:29 ?        00:00:00 rpc.statd

# service rpcidmapd start
Starting RPC idmapd: Error: RPC MTAB does not exist.

Fixed by
mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs/

hth
P
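
One way to double-check what actually got mounted is to look the mount point up in /proc/mounts. A minimal Python sketch, assuming a Linux client; `find_mount` is a hypothetical helper, not part of any Hadoop or NFS tooling, and it ignores the octal escaping /proc/mounts uses for paths containing spaces.

```python
def find_mount(mounts_text, mount_point):
    """Return (device, fstype, options) for mount_point, or None if it is not mounted.

    mounts_text is the content of /proc/mounts, where each line reads:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep scanning: a later mount on the same path shadows an earlier one.
            found = (fields[0], fields[2], fields[3].split(","))
    return found
```

Usage would look like `find_mount(open("/proc/mounts").read(), "/hadoop/backup")`; with the NFS4 setup above one would expect the fstype field to read `nfs4`.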



On Mon, Oct 10, 2011 at 7:55 PM, Harsh J <ha...@cloudera.com> wrote:


Re: Secondary namenode fsimage concept

Posted by Harsh J <ha...@cloudera.com>.
Generally you just gotta ensure that your rpc.lockd service is up and
running on both ends, to allow for locking over NFS.

On Tue, Oct 11, 2011 at 8:16 AM, Uma Maheswara Rao G 72686
<ma...@huawei.com> wrote:



-- 
Harsh J

Re: Secondary namenode fsimage concept

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Hi,

It looks to me like the problem is with your NFS: it is not supporting locks. Which version of NFS are you using?
Please check your NFS locking support by writing a simple program that does file locking.

I think NFS4 supports locking (I have not tried it).
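
A simple locking test along those lines could look like the Python sketch below. It is a minimal check under stated assumptions: it tries to take an exclusive, non-blocking fcntl-style advisory lock on a scratch file in the directory under test. `try_exclusive_lock` and the scratch file name `locktest.tmp` are made up for illustration, and whether this matches exactly what the namenode does on its storage directory is an assumption, not something confirmed in this thread.

```python
import fcntl
import os


def try_exclusive_lock(directory):
    """Return True if an exclusive advisory lock can be taken on a file in `directory`."""
    path = os.path.join(directory, "locktest.tmp")  # hypothetical scratch file
    try:
        with open(path, "w") as handle:
            try:
                # LOCK_NB makes the call fail immediately instead of blocking.
                fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError as err:
                print(f"locking failed in {directory}: {err}")
                return False
            fcntl.lockf(handle, fcntl.LOCK_UN)
            return True
    finally:
        # Clean up the scratch file whether or not the lock succeeded.
        if os.path.exists(path):
            os.remove(path)
```

Run it once against the local directory and once against the NFS mount, for example `try_exclusive_lock("/mnt/hadoop/var/name")`. If the local directory returns True and the NFS one fails, the mount's lock support is the likely culprit.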

http://nfs.sourceforge.net/

  A6. What are the main new features in version 4 of the NFS protocol?
  * NFS Versions 2 and 3 are stateless protocols, but NFS Version 4 introduces state. An NFS Version 4 client uses state to notify an NFS Version 4 server of its intentions on a file: locking, reading, writing, and so on. An NFS Version 4 server can return information to a client about what other clients have intentions on a file to allow a client to cache file data more aggressively via delegation. To help keep state consistent, more sophisticated client and server reboot recovery mechanisms are built in to the NFS Version 4 protocol.
  * NFS Version 4 introduces support for byte-range locking and share reservation. Locking in NFS Version 4 is lease-based, so an NFS Version 4 client must maintain contact with an NFS Version 4 server to continue extending its open and lock leases.


Regards,
Uma
----- Original Message -----
From: Shouguo Li <th...@gmail.com>
Date: Tuesday, October 11, 2011 2:31 am
Subject: Re: Secondary namenode fsimage concept
To: common-user@hadoop.apache.org


Re: Secondary namenode fsimage concept

Posted by Shouguo Li <th...@gmail.com>.
hey patrick

i wanted to configure my cluster to write namenode metadata to multiple
directories as well:
  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/var/name,/mnt/hadoop/var/name</value>
  </property>
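
As a quick sanity check before starting the cluster, the comma-separated value of `dfs.name.dir` can be read back and each directory tested. This is a hedged Python sketch: `name_dirs` and `unusable_dirs` are hypothetical helpers, and it assumes the property sits in an XML config file shaped like the snippet above. It only checks that each directory exists and is writable by the current user, not that locking works on it.

```python
import os
import xml.etree.ElementTree as ET


def name_dirs(config_xml, key="dfs.name.dir"):
    """Return the list of directories configured under `key`, or [] if absent."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == key:
            value = prop.findtext("value", default="")
            # The value is a comma-separated list of directories.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


def unusable_dirs(dirs):
    """Return the directories that are missing or not writable by this user."""
    return [d for d in dirs if not (os.path.isdir(d) and os.access(d, os.W_OK))]
```

For the configuration above, `name_dirs` would return `["/hadoop/var/name", "/mnt/hadoop/var/name"]`, and anything reported by `unusable_dirs` is worth fixing before looking deeper.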

in my case, /hadoop/var/name is local directory, /mnt/hadoop/var/name is NFS
volume. i took down the cluster first, then copied over files from
/hadoop/var/name to /mnt/hadoop/var/name, and then tried to start up the
cluster. but the cluster won't start up properly...
here's the namenode log: http://pastebin.com/gmu0B7yd

any ideas why it wouldn't start up?
thx


On Thu, Oct 6, 2011 at 6:58 PM, patrick sang <si...@gmail.com> wrote:

> I would say have your namenode write its metadata both to the local fs
> (where your secondary namenode will pull files from) and to an NFS mount.
>
>  <property>
>    <name>dfs.name.dir</name>
>    <value>/hadoop/name,/hadoop/nfs_server_name</value>
>  </property>
>
>
> my 0.02$
> P

Re: Secondary namenode fsimage concept

Posted by patrick sang <si...@gmail.com>.
I would have your namenode write its metadata to the local fs (where your
secondary namenode will pull files from) and to an NFS mount.

  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/name,/hadoop/nfs_server_name</value>
  </property>
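
To double-check what such a property expands to, here is a small sketch that reads the comma-separated value back out of a configuration snippet. The name_dirs helper and the SAMPLE string are made up for illustration; only the property name and paths come from the snippet above.

```python
import xml.etree.ElementTree as ET

# Hypothetical config snippet wrapping the property shown above.
SAMPLE = """<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/name,/hadoop/nfs_server_name</value>
  </property>
</configuration>"""


def name_dirs(config_xml, key="dfs.name.dir"):
    """Return the list of directories configured under `key`."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            value = prop.findtext("value") or ""
            # The value is a comma-separated list; drop stray whitespace.
            return [part.strip() for part in value.split(",") if part.strip()]
    return []


print(name_dirs(SAMPLE))  # ['/hadoop/name', '/hadoop/nfs_server_name']
```

Each entry is a full copy of the metadata directory, so the second path should point at the NFS mount itself.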


my 0.02$
P

On Thu, Oct 6, 2011 at 12:04 AM, shanmuganathan.r <
shanmuganathan.r@zohocorp.com> wrote:

> Hi Kai,
>
> So no data related to the Hadoop cluster is stored on the secondary
> namenode. Am I correct?
> If so, and we run the secondary namenode on a separate machine, then the
> fetching, merging and transferring time increases when the namenode's
> fsimage file is large. If a failover happens in the meantime, how can we
> recover nearly one hour of changes to HDFS? (The default checkpoint
> interval is one hour.)
>
> Thanks R.Shanmuganathan

Re: Secondary namenode fsimage concept

Posted by Kai Voigt <k...@123.org>.
Hi,

yes, the secondary namenode is actually a badly named piece of software, as it's not a namenode at all. And it's going to be renamed to checkpoint node.

To prevent metadata loss when your namenode fails, you should write the namenode files to a local RAID and also to networked storage (NFS, SAN, DRBD). Making the metadata available is not the secondary namenode's task.
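
As a toy illustration of why a second metadata directory helps (this is not Hadoop code): if every change is appended to the edits file in each configured directory, the networked copy is as current as the local one, so nothing depends on the last checkpoint. The append_edit and read_edits helpers and the text record format are invented for the sketch.

```python
import os
import tempfile


def append_edit(name_dirs, record):
    """Append one edit record to the edits file in every directory."""
    for directory in name_dirs:
        path = os.path.join(directory, "edits")
        with open(path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the record to disk before moving on


def read_edits(directory):
    """Return the edit records stored in one directory."""
    with open(os.path.join(directory, "edits"), encoding="utf-8") as log:
        return log.read().splitlines()


# Two temporary directories stand in for the local disk and the NFS mount.
local_dir = tempfile.mkdtemp(prefix="local-")
nfs_dir = tempfile.mkdtemp(prefix="nfs-")
for change in ("create /a.txt", "delete /a.txt"):
    append_edit([local_dir, nfs_dir], change)

# Even if local_dir is lost with the namenode host, the other copy is complete.
print(read_edits(nfs_dir))  # ['create /a.txt', 'delete /a.txt']
```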

Kai

On 06.10.2011 at 09:04, shanmuganathan.r wrote:

> Hi Kai,
> 
> So no data related to the Hadoop cluster is stored on the secondary namenode. Am I correct?
> If so, and we run the secondary namenode on a separate machine, then the fetching, merging and transferring time increases when the namenode's fsimage file is large. If a failover happens in the meantime, how can we recover nearly one hour of changes to HDFS? (The default checkpoint interval is one hour.)
> 
> Thanks R.Shanmuganathan  

-- 
Kai Voigt
k@123.org





Re: Secondary namenode fsimage concept

Posted by "shanmuganathan.r" <sh...@zohocorp.com>.
Hi Kai,

So no data related to the Hadoop cluster is stored on the secondary namenode. Am I correct?
If so, and we run the secondary namenode on a separate machine, then the fetching, merging and transferring time increases when the namenode's fsimage file is large. If a failover happens in the meantime, how can we recover nearly one hour of changes to HDFS? (The default checkpoint interval is one hour.)

Thanks R.Shanmuganathan  






---- On Thu, 06 Oct 2011 12:20:28 +0530 Kai Voigt <k@123.org> wrote ----

Hi,

the secondary namenode only fetches the two files when a checkpoint is needed.

Kai
 



Re: Secondary namenode fsimage concept

Posted by Kai Voigt <k...@123.org>.
Hi,

the secondary namenode only fetches the two files when a checkpoint is needed.
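
A rough sketch of what "needed" can mean, assuming a time-based trigger (the one-hour default mentioned in this thread) combined with a size-based one. The 64 MB threshold and the checkpoint_needed helper are assumptions for illustration; as far as I recall, Hadoop releases of this era expose the two settings as fs.checkpoint.period and fs.checkpoint.size, but check the defaults file for your version.

```python
CHECKPOINT_PERIOD_SECONDS = 3600          # one hour, the default mentioned in this thread
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024  # assumed edits-size threshold, for illustration


def checkpoint_needed(seconds_since_last, edits_size_bytes,
                      period=CHECKPOINT_PERIOD_SECONDS,
                      size=CHECKPOINT_SIZE_BYTES):
    """Return True when enough time has passed or the edits file is large enough."""
    return seconds_since_last >= period or edits_size_bytes >= size


print(checkpoint_needed(600, 1024))               # False: recent checkpoint, small edits file
print(checkpoint_needed(3600, 1024))              # True: the period has elapsed
print(checkpoint_needed(600, 128 * 1024 * 1024))  # True: the edits file is large
```

Between checkpoints the secondary namenode holds nothing newer than its last merged fsimage.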

Kai

On 06.10.2011 at 08:45, shanmuganathan.r wrote:

> Hi Kai,
> 
> In the second part I meant:
>
> Does the secondary namenode also keep its own copy of the FSImage file, or are the two files (FSImage and EditLog) transferred from the namenode at checkpoint time?
> 
> 
> Thanks 
> Shanmuganathan

-- 
Kai Voigt
k@123.org





Re: Secondary namenode fsimage concept

Posted by "shanmuganathan.r" <sh...@zohocorp.com>.
Hi Kai,

In the second part I meant:

Does the secondary namenode also keep its own copy of the FSImage file, or are the two files (FSImage and EditLog) transferred from the namenode at checkpoint time?


Thanks 
Shanmuganathan





---- On Thu, 06 Oct 2011 11:37:50 +0530 Kai Voigt <k@123.org> wrote ----

Hi,

you're correct when saying the namenode hosts the fsimage file and the edits log file.

The fsimage file contains a snapshot of the HDFS metadata (a mapping from filenames to block lists). Whenever there is a change to HDFS, it is appended to the edits file. Think of it as a database transaction log, where changes are not applied to the data file directly but appended to a log.

To prevent the edits file from growing indefinitely, the secondary namenode periodically pulls these two files, and the namenode starts writing changes to a new edits file. Then the secondary namenode merges the changes from the edits file into the old snapshot from the fsimage file and creates an updated fsimage file. This updated fsimage file is then copied back to the namenode.

Then the entire cycle starts again. To answer your question: the namenode has both files, even if the secondary namenode is running on a different machine.

Kai
 




Re: Secondary namenode fsimage concept

Posted by Kai Voigt <k...@123.org>.
Hi,

you're correct when saying the namenode hosts the fsimage file and the edits log file.

The fsimage file contains a snapshot of the HDFS metadata (a mapping from filenames to block lists). Whenever there is a change to HDFS, it is appended to the edits file. Think of it as a database transaction log, where changes are not applied to the data file directly but appended to a log.

To prevent the edits file from growing indefinitely, the secondary namenode periodically pulls these two files, and the namenode starts writing changes to a new edits file. Then the secondary namenode merges the changes from the edits file into the old snapshot from the fsimage file and creates an updated fsimage file. This updated fsimage file is then copied back to the namenode.

Then the entire cycle starts again. To answer your question: the namenode has both files, even if the secondary namenode is running on a different machine.
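
The cycle can be sketched as a toy model (this is not Hadoop's on-disk format or code): the fsimage is a dictionary from filename to block list, the edits log is a list of operations, and a checkpoint replays the log onto a copy of the snapshot. The operation names and the checkpoint and apply_edit helpers are invented for the example.

```python
def apply_edit(namespace, edit):
    """Apply one logged change to a filename -> block list mapping."""
    op, path, *blocks = edit
    if op == "create":
        namespace[path] = list(blocks)
    elif op == "append":
        namespace[path] = namespace.get(path, []) + list(blocks)
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage, edits):
    """Merge an fsimage snapshot with an edits log and return the new fsimage."""
    # Work on a copy so the old snapshot stays intact until the merge is done.
    merged = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(merged, edit)
    return merged


fsimage = {"/a.txt": ["blk_1"]}
edits = [
    ("create", "/b.txt", "blk_2"),
    ("append", "/a.txt", "blk_3"),
    ("delete", "/b.txt"),
]
new_fsimage = checkpoint(fsimage, edits)
print(new_fsimage)  # {'/a.txt': ['blk_1', 'blk_3']}
```

After the merge, the namenode would take new_fsimage as its snapshot and continue with a fresh, empty edits log.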

Kai

On 06.10.2011 at 07:57, shanmuganathan.r wrote:

> 
> Hi All,
> 
> I have a question about the Hadoop secondary namenode concept. Please correct me if the following statements are wrong.
> 
> 
> The namenode hosts the fsimage and edit log files. The secondary namenode hosts the fsimage file only. At checkpoint time the edit log file is transferred to the secondary namenode and both files are merged. Then the updated fsimage file is transferred to the namenode. Is that correct?
> 
> 
> If we run the secondary namenode on a separate machine, then both machines contain the fsimage file, and only the namenode contains the edit log file. Is that true?
> 
> 
> 
> Thanks R.Shanmuganathan  

-- 
Kai Voigt
k@123.org