Posted to mapreduce-user@hadoop.apache.org by Amit Mittal <am...@gmail.com> on 2014/01/27 13:12:06 UTC

Processing steps of NameNode & Secondary NameNode

Hi,

I have a question about the processing steps of the NameNode:

*Reference:* "Hadoop: The Definitive Guide", 3rd Ed., by Tom White,
page 340 (Ch 10: HDFS > The file system image & edit log)

Text from book:
....
When a filesystem client performs a write operation (such as creating or
moving a file), it is first recorded in the edit log. The namenode also has
an in-memory representation of the filesystem metadata, which it updates
after the edit log has been modified. The in-memory metadata is used to
serve read requests.
The edit log is flushed and *synced* after every write before a success
code is returned to the client. For namenodes that write to multiple
directories, the write must be flushed and synced to every copy before
returning successfully. This ensures that no operation is lost due to
machine failure.
...
*Question 1:* Is the in-memory representation updated before or after
returning to the client, or is it updated asynchronously while the status
code is sent to the client? I believe it should be updated before the status
is sent to the client.
*Question 2:* What does "synced after every write" mean here? For one
file, there is only one writer. So when there is a write operation on the
file, it is recorded in the edit log and flushed, and no other writer will be
working on this file. However, there might be other writers working on
other files, and for any operation on those, the edit log will be updated. So
there will be multiple copies of the edit log, which will then be merged. Is
this understanding correct?
*Question 3:* Sorry, I did not understand "For namenodes that *write to
multiple directories*, the write must be flushed and synced to *every copy*
before returning successfully", especially the text in bold.

Thanks
Amit Mittal

Re: Processing steps of NameNode & Secondary NameNode

Posted by Haohui Mai <hm...@hortonworks.com>.

Conceptually, you can think of the namenode as similar to a journaling file
system. For each write, it updates the in-memory data structure, persists
the operation to stable storage (i.e., calls sync to flush the
buffer of the edit logs), and then responds to the client.

Note that all writes are serialized, which means the writes are given a
total order. There are no consistency issues between multiple clients.
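
The sequence above can be sketched as a toy model. This is only an illustration of the ordering described in this reply, not Hadoop code: the class `ToyNameNode`, its methods, and the one-line-per-operation log format are all made up for the example, and holding a single lock through the sync is a simplifying assumption.

```python
import os
import tempfile
import threading


class ToyNameNode:
    """Toy model of the write path described above (hypothetical, not Hadoop)."""

    def __init__(self, edit_log_path):
        self._lock = threading.Lock()  # serializes writes, giving a total order
        self._namespace = {}           # in-memory metadata, used to serve reads
        self._txid = 0                 # position of each write in the total order
        self._log = open(edit_log_path, "a", encoding="utf-8")

    def create_file(self, path):
        with self._lock:
            self._txid += 1
            txid = self._txid
            # 1. Update the in-memory data structure.
            self._namespace[path] = {"txid": txid}
            # 2. Persist the operation: flush the buffer, then sync to disk.
            self._log.write(f"{txid} CREATE {path}\n")
            self._log.flush()             # user-space buffer -> OS
            os.fsync(self._log.fileno())  # OS cache -> stable storage
        # 3. Only now respond to the client.
        return txid

    def exists(self, path):
        # Reads are answered from memory, never from the edit log.
        return path in self._namespace

    def close(self):
        self._log.close()


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "edits.log")
    nn = ToyNameNode(log_path)
    nn.create_file("/user/amit/a.txt")
    nn.create_file("/user/amit/b.txt")
    nn.close()
    with open(log_path, encoding="utf-8") as f:
        print(f.read(), end="")
```

Because every write takes the same lock and gets the next transaction id, two clients writing to different files still end up in one edit log with one order. There are no separate per-client logs to merge.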

For question 3, the NN can write to multiple edit logs with the same
content at the same time. This allows the operator to store a copy of the
edit logs on NFS. In this case, the NN calls sync() for each edit log.
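
As a rough sketch of what "synced to every copy" means, assuming two directories that stand in for a local disk and an NFS mount: the helpers `open_edit_logs` and `log_to_all` and the file name `edits.log` are hypothetical names for this example, not Hadoop APIs.

```python
import os
import tempfile


def open_edit_logs(directories, name="edits.log"):
    """Open one edit log copy per directory (hypothetical helper)."""
    return [open(os.path.join(d, name), "a", encoding="utf-8") for d in directories]


def log_to_all(log_files, record):
    """Append the same record to every copy, then flush and sync each one."""
    for f in log_files:
        f.write(record + "\n")
    for f in log_files:
        f.flush()             # user-space buffer -> OS
        os.fsync(f.fileno())  # OS cache -> stable storage
    # Success is reported only after every copy has been synced.
    return True


if __name__ == "__main__":
    # Two temporary directories stand in for, e.g., a local disk and an NFS mount.
    dirs = [tempfile.mkdtemp(), tempfile.mkdtemp()]
    logs = open_edit_logs(dirs)
    log_to_all(logs, "1 CREATE /user/amit/a.txt")
    for f in logs:
        f.close()
```

Each directory ends up with an identical copy of the log, so losing one directory does not lose any acknowledged operation.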

~Haohui





