Posted to common-dev@hadoop.apache.org by "girish vaitheeswaran (JIRA)" <ji...@apache.org> on 2008/04/14 20:03:04 UTC

[jira] Created: (HADOOP-3248) Improve Namenode startup performance

Improve Namenode startup performance
------------------------------------

                 Key: HADOOP-3248
                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: girish vaitheeswaran


One of the things that needs to be addressed as part of Namenode scalability is HDFS recovery performance, especially when the number of files is large. There are instances where the number of files is in the vicinity of 20 million, and in such cases the time taken for namenode startup is prohibitive. Here are some benchmark numbers for namenode startup time. These times do not include the time to process block reports.

Default scenario for 20 million files with the max Java heap size set to 14GB: 40 minutes

Tuning various Java options such as young generation size, parallel garbage collection, and initial Java heap size: 14 minutes

As can be seen, 14 minutes is still a long time for the namenode to recover, and code changes are required to bring this time down further. To this end, some prototype optimizations were done to reduce this time. Timing analysis showed that saveImage and loadFSImage were the primary methods consuming most of the time, and that most of it was spent on object allocations. The goal of the optimizations is to reduce the number of memory allocations as much as possible.

Optimization 1: saveImage() 
======================
Avoid allocation of the UTF8 object.

Old code
=======
new UTF8(fullName).write(out);

New Code
========
out.writeUTF(fullName);


Optimization 2: saveImage()
======================
Avoid allocating the PermissionStatus object and the FsPermission object. This is to be done for both directories and files.

Old code
=======
fileINode.getPermissionStatus().write(out)

New Code
=========
out.writeBytes(fileINode.getUserName());
out.writeBytes(fileINode.getGroupName());
out.writeShort(fileINode.getFsPermission().toShort());
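
Taken together, Optimizations 1 and 2 make the per-inode write path free of temporary objects. Below is a minimal sketch of the combined write; the accessor names follow the snippets above, but the INodeLike interface and writeEntry method are stand-ins invented for illustration, not Hadoop types.

import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in for the inode accessors used in the snippets above.
interface INodeLike {
    String getUserName();
    String getGroupName();
    short getPermissionShort();   // stands in for getFsPermission().toShort()
}

class ImageWriter {
    // Writes one inode entry using only stream primitives, so no UTF8,
    // PermissionStatus, or FsPermission object is allocated per entry.
    static void writeEntry(DataOutputStream out, String fullName, INodeLike node)
            throws IOException {
        out.writeUTF(fullName);                     // Optimization 1
        out.writeBytes(node.getUserName());         // Optimization 2: writeBytes emits
        out.writeBytes(node.getGroupName());        // only the low byte of each char,
        out.writeShort(node.getPermissionShort());  // so this assumes ASCII names
    }
}

Note that writeBytes, unlike writeUTF, adds no length prefix; the snippets above presumably rely on the surrounding image format to delimit the fields.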

Optimization 3
============
loadImage() could use the same mechanism, avoiding allocation of the PermissionStatus object and the FsPermission object.
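
No snippet is given for the read side. Here is a minimal sketch, under the assumption that the names are length-prefixed on disk (e.g. written with writeUTF); the real fsimage layout may differ:

import java.io.DataInputStream;
import java.io.IOException;

class ImageReader {
    // Reads one entry's permission fields back as plain strings and a raw
    // short, so no PermissionStatus or FsPermission object is allocated
    // per inode; userGroupOut is a hypothetical two-element output buffer.
    static short loadPermissions(DataInputStream in, String[] userGroupOut)
            throws IOException {
        userGroupOut[0] = in.readUTF();   // user name
        userGroupOut[1] = in.readUTF();   // group name
        return in.readShort();            // raw permission bits
    }
}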

Optimization 4
============
A hack was tried out to avoid the cost of object allocation in saveImage(), where fullName was being constructed using string concatenation. This optimization also helped improve performance.
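
The hack itself is not shown in the description. One plausible shape is a single reusable builder that grows and shrinks with the directory traversal, so each entry costs one String allocation instead of one per concatenation; all names here are invented for illustration:

class PathWalker {
    // One builder reused for every inode visited during saveImage().
    private final StringBuilder pathBuf = new StringBuilder(8 * 1024);

    void visit(String localName) {
        int parentLen = pathBuf.length();       // remember the parent prefix
        pathBuf.append('/').append(localName);
        String fullName = pathBuf.toString();   // one allocation per entry
        // ... write fullName and recurse into child inodes here ...
        pathBuf.setLength(parentLen);           // pop back to the parent path
    }
}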

Overall, these optimizations brought the startup time down to slightly over 7 minutes. Most of the remaining time is now spent in loadFSImage(), since we allocate the INode and INodeDirectory objects there. Any further optimization will need to focus on loadFSImage().


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594873#action_12594873 ] 

Hudson commented on HADOOP-3248:
--------------------------------

Integrated in Hadoop-trunk #483 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/483/])


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594308#action_12594308 ] 

Konstantin Shvachko commented on HADOOP-3248:
---------------------------------------------

Is it going to be a mistake if we set the bufStore length to MAX_PATH_LENGTH*4 rather than 512*1024?
It seems to me that we are allocating a buffer 8 times larger than an actual file name could be.


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591533#action_12591533 ] 

girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------

The following configuration options were used for the Namenode to improve the startup time. The primary ones that helped were:

-XX:NewSize=1G
-XX:MaxNewSize=1G
-Xms14000M
-XX:+UseParallelGC


export HADOOP_OPTS="-server -Dcom.sun.management.jmxremote.port=8998 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -XX:+UseParallelGC -XX:ErrorFile=/grid/0/hadoop/var/hs_err_pid%p.log -XX:NewSize=1G -XX:MaxNewSize=1G -Xms14000M -Xloggc:/export/crawlspace/tmp/hadoop-0.16.1/logs/gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594407#action_12594407 ] 

dhruba borthakur commented on HADOOP-3248:
------------------------------------------

Hi Nicholas, I would hate to put in a reference to an INode object from anywhere in the fs subsystem. The INode object is a DFS object and has no place in the fs package. Does that make sense? Is it ok if I leave the code as it is in fastRestarts3.patch? Thanks.


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts2.patch

Incorporated most of Konstantin's comments. The saveImage method does not use the UTF8 class. No temporary string allocations occur.


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts3.patch

This fixes one findbugs warning. There is still one findbugs warning that will be present in the final patch submitted here.


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597115#action_12597115 ] 

Hudson commented on HADOOP-3248:
--------------------------------

Integrated in Hadoop-trunk #492 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/492/])


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592139#action_12592139 ] 

girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------

Dhruba,
Can you provide me with a hadoop-core jar file that I can test with this new patch? I would like to make sure that we get the numbers with this recent patch. I will test it out and update the JIRA with the results.
Thanks
-girish


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts3.patch

I incorporated most of Konstantin's comments. I left the byte array size at 512K because it is better to avoid reallocations, if any; this is a temporary buffer and is alive only during saveImage. I made strbuf and bufStore private static because they are used by multiple methods in the FSImage class.
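
For concreteness, the fields described might look like the following. The names strbuf and bufStore, the 512K size, and the private static visibility come from the comment; the types and initializers are assumptions:

class FSImage {
    // Temporary buffers shared by the image-saving methods. 512K avoids
    // reallocation even for the longest paths, and the buffers are only
    // live while saveImage() runs.
    private static final byte[] bufStore = new byte[512 * 1024];
    private static final StringBuilder strbuf = new StringBuilder();
    // ... saveImage() and its helpers fill strbuf/bufStore ...
}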


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593705#action_12593705 ] 

Konstantin Shvachko commented on HADOOP-3248:
---------------------------------------------

- INode.getLocalNameBytes() vs .getLocalNameInBytes()
- byteStore is of size 512K. Our file names are limited by MAX_PATH_LENGTH = 8000, and UTF-8 is not more than 4 bytes per character, so you don't need a store larger than 32K bytes. I would use an even smaller buffer - it is ok if the buffer is reallocated when we have really long names.
- byteStore should be a local variable as long as strBuf is a static member.
- all static members "used for saving the image to disk" like fileperm and strBuf should be extremely private.
- if you introduce a static PermissionStatus.write(out, u, g, p), then you should call it from PermissionStatus.write(out); otherwise we have two ways to serialize a PermissionStatus object (sketched below).

Everything else looks great.
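
The last point, sketched: have the instance write delegate to the static write so there is only one serialization path. The class shape below is simplified from the real org.apache.hadoop.fs.permission.PermissionStatus, which wraps an FsPermission rather than a raw short:

import java.io.DataOutput;
import java.io.IOException;

class PermissionStatus {
    private final String username;
    private final String groupname;
    private final short permission;

    PermissionStatus(String username, String groupname, short permission) {
        this.username = username;
        this.groupname = groupname;
        this.permission = permission;
    }

    // Static path: callers with the raw fields in hand serialize without
    // ever constructing a PermissionStatus instance.
    static void write(DataOutput out, String u, String g, short p)
            throws IOException {
        out.writeUTF(u);
        out.writeUTF(g);
        out.writeShort(p);
    }

    // Instance path delegates to the static one, so the two serialized
    // forms can never diverge.
    void write(DataOutput out) throws IOException {
        write(out, username, groupname, permission);
    }
}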


[jira] Issue Comment Edited: (HADOOP-3248) Improve Namenode startup performance

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594410#action_12594410 ] 

szetszwo edited comment on HADOOP-3248 at 5/5/08 5:11 PM:
------------------------------------------------------------------------

Dhruba, you are right. PermissionStatus is under the fs package. We definitely should not use INode there. From an OO point of view, the code in trunk is probably the best. For performance, we should not do the conversion (short -> FsPermission -> short). Your patch balances both. I am fine with that. We could do this further improvement later.

      was (Author: szetszwo):
    Dhruba, you are right. PermissionStatus is under the fs package. We definitely should not use INode there. From an OO point of view, the code in trunk is probably the best. For performance, we should do the conversion (short -> FsPermission -> short). Your patch balances both. I am fine with that. We could do this further improvement later.
  

[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594113#action_12594113 ] 

Hadoop QA commented on HADOOP-3248:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12381387/fastRestarts3.patch
  against trunk revision 653239.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2392/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2392/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2392/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2392/console

This message is automatically generated.


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593991#action_12593991 ] 

Hadoop QA commented on HADOOP-3248:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12381350/fastRestarts3.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs -1.  The patch appears to introduce 2 new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2373/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2373/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2373/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2373/console

This message is automatically generated.


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Status: Patch Available  (was: Open)




[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591121#action_12591121 ] 

girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------

As mentioned, loadFSImage() is where most of the time is now spent, since saveImage() is well optimized by the above changes. Within loadFSImage(), most of the time goes to getExistingPathINode.

Here are some data points on that (out of a total of close to 7 minutes):

Time taken in getExistingPathINode: 4 minutes
Total number of calls to getExistingPathINode: 19,941,941

This was for an image file of size 2.5GB with close to 20 million files.
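
(That works out to roughly 240 seconds across 19,941,941 calls, i.e. about 12 microseconds per call; each call is already fairly cheap, so the headroom lies in running many of them concurrently rather than in speeding up any single call.)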

Since getExistingPathINode() does not do any significant object allocation, most of the optimization work going forward would entail parallelizing the operation in this method so as to use all of the processors on the machine running the namenode. At this point only one processor is utilized and the remaining processors are idle. One other point worth mentioning: we are not I/O bound in this whole recovery operation, so optimizations need to be almost entirely CPU-focused.
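
To make that proposal concrete, here is a minimal, hypothetical sketch of such a parallel load; none of these class or method names come from the actual patch, and it assumes the image records can be partitioned into batches that touch disjoint subtrees so tasks do not contend:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only: fan independent batches of image records out
// across all available cores. Assumes each batch inserts into a disjoint
// part of the namespace; a real loadFSImage() would need a partitioning
// scheme or locking to guarantee that.
public class ParallelImageLoad {
  public static void loadBatches(List<Runnable> batches) throws InterruptedException {
    int cores = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(cores);
    for (Runnable batch : batches) {
      pool.submit(batch);          // each batch replays its slice of the image
    }
    pool.shutdown();               // stop accepting new work
    pool.awaitTermination(1, TimeUnit.HOURS);  // wait for all batches to finish
  }
}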






[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3248:
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.18.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thank you, Dhruba.



[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Status: Open  (was: Patch Available)

Findbugs warnings



[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts3.patch

Changed the size of the transient buffer from 512K to 32K, as per Konstantin's review comments.
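
For readers following along, a minimal sketch of what such a reusable transient buffer could look like; the name bufStore and the MAX_PATH_LENGTH constant appear in the review discussion elsewhere in this thread, but the class, the method, and the concrete values are illustrative assumptions, not the committed code:

import java.io.DataOutput;
import java.io.IOException;

// Hypothetical sketch: allocate one scratch buffer per writer and reuse it
// for every path, instead of allocating a fresh UTF8/byte[] per inode.
class ImageWriterSketch {
  private static final int MAX_PATH_LENGTH = 8000;               // assumed limit, in characters
  private final byte[] bufStore = new byte[MAX_PATH_LENGTH * 4]; // 4 bytes/char UTF-8 worst case, ~32K

  void writePathName(String fullName, DataOutput out) throws IOException {
    int len = 0;
    for (int i = 0; i < fullName.length(); i++) {
      bufStore[len++] = (byte) fullName.charAt(i);  // assumes ASCII path names, for brevity
    }
    out.writeShort(len);          // length prefix, mirroring UTF8.write()'s framing
    out.write(bufStore, 0, len);  // no per-path allocation
  }
}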

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>
>
> One of the things that would need to be addressed as part of Namenode scalability is the HDFS recovery performance especially in scenarios where the number of files is large. There are instances where the number of files are in the vicinity of 20 million and in such cases the time taken for namenode startup is prohibitive. Here are some benchmark numbers on the time taken for namenode startup. These times do not include the time to process block reports.
> Default scenario for 20 million files with the max  java heap size set to 14GB : 40 minutes
> Tuning various java options such as young size, parallel garbage collection, initial java heap size : 14 minutes
> As can be seen, 14 minutes is still a long time for the namenode to recover and code changes are required to bring this time down further. To this end some prototype optimizations were done to reduce this time. Based on some timing analysis saveImage and loadFSImage where the primary methods that were consuming most of the time. Most of the time was being spent on doing object allocations. The goal of the optimizations is to reduce the number of memory allocations as much as possible.
> Optimization 1: saveImage() 
> ======================
> Avoid allocation of the UTF8 object.
> Old code
> =======
> new UTF8(fullName).write(out);
> New Code
> ========
> out.writeUTF(fullName)
> Optimization 2: saveImage()
> ======================
> Avoid object allocation of the PermissionStatus Object and the FsPermission object. This is to be done for Directories and for files.
> Old code
> =======
> fileINode.getPermissionStatus().write(out)
> New Code
> =========
> out.writeBytes(fileINode.getUserName())
> out.writeBytes(fileINode.getGroupName())
> out.writeShort(fileINode.getFsPermission().toShort())
> Optimization 3
> ============
> loadImage() could use the same mechanism where we would avoid allocating the PermissionStatus object and the FsPermission object.
> Optimization 4
> ============
> A hack was tried out to avoid the cost of object allocation from saveImage() where the fullName was being constructed using string concatenation. This optimization also helped improve performance
> Overall these optimizations helped bring down the overall startup time down to slightly over 7 minutes. Most of all the remaining time is now spent in loadFSImage() since we allocate the INode and INodeDirectory objects. Any further optimizations will need to focus on loadFSImage()

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts.patch

Fixed a debug print.



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594410#action_12594410 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3248:
------------------------------------------------

Dhruba, you are right. PermissionStatus is under the fs package, so we definitely should not use INode there. From an OO point of view, the code in trunk is probably the best; for performance, we should avoid the short -> FsPermission -> short conversion. Your patch balances both, and I am fine with that. We can do the further improvement later.
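
To spell out the round trip being discussed, an illustrative sketch; the class and method here are hypothetical and not from the patch:

import java.io.DataInput;
import java.io.IOException;

// Hypothetical sketch of the trade-off: the clean OO path would materialize
// a FsPermission object only to extract the short again, while the fast path
// keeps the mode bits as a primitive end to end.
class PermissionLoadSketch {
  // OO-style round trip (an extra allocation per inode), roughly:
  //   short bits = FsPermission.read(in).toShort();

  // Allocation-free alternative: read the primitive directly.
  static short readPermissionBits(DataInput in) throws IOException {
    return in.readShort();  // the mode bits are persisted as a short
  }
}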



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594073#action_12594073 ] 

Hadoop QA commented on HADOOP-3248:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12381387/fastRestarts3.patch
  against trunk revision 653184.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2390/console

This message is automatically generated.



[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

girish vaitheeswaran updated HADOOP-3248:
-----------------------------------------

    Attachment: FSImage.patch

A patch file with the optimizations listed. These changes are mere prototypes and would need to be evaluated for correctness, impact on upgrades, etc.



[jira] Issue Comment Edited: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594308#action_12594308 ] 

shv edited comment on HADOOP-3248 at 5/5/08 1:22 PM:
---------------------------------------------------------------------

Is it going to be a mistake if we set bufStore length to MAX_PATH_LENGTH*4 rather than 512*1024?
It seems to me that we are allocating a buffer 16 times larger than an actual file name could be.

      was (Author: shv):
    Is it going to be a mistake if we set bufStore length to MAX_PATH_LENGTH*4 rather than 512*1024?
It seems to me that we are allocating a buffer 8 times larger than an actual file name could be.
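
(For scale, under the assumption that MAX_PATH_LENGTH is 8000 characters and UTF-8 needs at most 4 bytes per character: 8000 * 4 = 32,000 bytes, about 32K, versus 512 * 1024 = 524,288 bytes; a factor of roughly 16, which matches the corrected comment above.)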
  


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591501#action_12591501 ] 

Konstantin Shvachko commented on HADOOP-3248:
---------------------------------------------

Could you please post the exact JVM parameters you used to improve the startup time?



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594440#action_12594440 ] 

girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------

>> You mentioned there were 20 million files. Could you mention how many blocks and how many replicas there are per file? Also, the max heap size was 14GB. Did the namenode use all of 14GB when it started?
>> Thanks for your response

I was using the default replication value of 3 for these experiments. I don't recall the memory usage on the Namenode side. The -Xms parameter did help quite a bit in improving startup time, and after trying several values, 14G turned out to be the most performant. As for the number of blocks, I am not sure, but I believe it was on average slightly fewer than 2 per file.



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "girish vaitheeswaran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592623#action_12592623 ] 

girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------

Konstantin,
Could you provide me with the hadoop-core.jar file so I can test the effect of your patch in the benchmarking environment.

Thanks
-girish



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594355#action_12594355 ] 

dhruba borthakur commented on HADOOP-3248:
------------------------------------------

Konstantin: OK, I will change the size of bufStore as you suggested.

Nicholas: Isn't it better to keep the serialization of PermissionStatus inside PermissionStatus.java rather than putting it into INode.java? Thanks, Dhruba
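
A rough sketch of what keeping that fast path inside PermissionStatus might look like; this static helper is hypothetical and mirrors the writeBytes/writeShort sequence from the optimization described in this issue, not the patch as committed:

import java.io.DataOutput;
import java.io.IOException;

// Hypothetical sketch: a static writer living in PermissionStatus keeps the
// serialization format in one place, while callers such as saveImage() can
// pass raw fields and skip allocating PermissionStatus/FsPermission objects.
class PermissionStatusSketch {
  static void write(DataOutput out, String username, String groupname,
                    short permission) throws IOException {
    out.writeBytes(username);   // same calls as in the optimization above
    out.writeBytes(groupname);
    out.writeShort(permission);
  }
}

A call site would then look like PermissionStatusSketch.write(out, inode.getUserName(), inode.getGroupName(), inode.getFsPermission().toShort()).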




[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594527#action_12594527 ] 

Hadoop QA commented on HADOOP-3248:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12381478/fastRestarts3.patch
  against trunk revision 653638.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2407/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2407/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2407/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2407/console

This message is automatically generated.

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594362#action_12594362 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3248:
------------------------------------------------

How about moving the method to PermissionStatus instead? i.e.
{code}
static void writePermissionStatus(DataOutput out, INode inode) throws IOException {
  Text.writeString(out, inode.getUserName());
  Text.writeString(out, inode.getGroupName());
  out.writeShort((short)PermissionStatusFormat.MODE.retrieve(inode.permission));
}
{code}
Then we don't have to convert the short to an FsPermission and then convert it back to a short.

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>


[jira] Assigned: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-3248:
----------------------------------------

    Assignee: dhruba borthakur

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: FSImage.patch
>


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594334#action_12594334 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3248:
------------------------------------------------

For fast reading/writing of permission data, we don't really need to convert the short mode to an FsPermission. I suggest adding a writePermissionStatus(out) method to INode.

In trunk,
{code}
        fileINode.getPermissionStatus().write(out);
{code}

In  fastRestarts3.patch,
{code}
        fileperm.fromShort(fileINode.getFsPermissionShort());
        PermissionStatus.write(out, fileINode.getUserName(),
                               fileINode.getGroupName(),
                               fileperm);
{code}

My suggestion,
{code}
//In FsImage
fileINode.writePermissionStatus(out);

//In INode
void writePermissionStatus(DataOutput out) throws IOException {
  Text.writeString(out, getUserName());
  Text.writeString(out, getGroupName());
  out.writeShort((short)PermissionStatusFormat.MODE.retrieve(permission));
}
{code}

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592506#action_12592506 ] 

Konstantin Shvachko commented on HADOOP-3248:
---------------------------------------------

This is much better than the current code. In my tests it doubles the speed of saving the fsimage.
But I think we can and should do more here.

The file name is stored in the name-node memory as byte[]. This is the sequence of actions proposed by this patch for storing it:
# INode.getLocalName() creates a new String() from that byte array and returns it.
# This string is then appended to a StringBuffer.
# The code then scans the StringBuffer once in order to calculate the resulting length.
# It then scans the StringBuffer again in order to convert the characters into UTF8 and write them to the output stream.

The conversion in the last step is not necessary, since the original byte array is already a utf8 representation of the file name.
The separate length calculation should also be avoided, and the whole step of creating the String and StringBuffer should be skipped.
We should merely write the byte[] directly to the output stream.
I propose to use a ByteBuffer instead of a StringBuffer for the full path names; a rough sketch follows.
This should also let us eliminate UTF8 without changing the image format.
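
A minimal sketch of that direction (buffer management simplified; getLocalNameBytes() and the two-byte length prefix are assumptions here, not code from any attached patch):
{code}
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

class ByteBufferPathWriter {
  private final ByteBuffer path = ByteBuffer.allocate(8 * 1024);  // reused

  // Writes the stored utf8 byte[] name directly: no String, no
  // StringBuffer, no separate length scan, no re-encoding pass.
  void writeChild(int parentLen, byte[] localNameUtf8, DataOutputStream out)
      throws IOException {
    path.position(parentLen);         // rewind to the parent path prefix
    path.put(localNameUtf8);
    out.writeShort(path.position());  // same two-byte length prefix as before
    out.write(path.array(), 0, path.position());
  }
}
{code}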


> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, FSImage.patch
>


[jira] Commented: (HADOOP-3248) Improve Namenode startup performance

Posted by "Cagdas Gerede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594376#action_12594376 ] 

Cagdas Gerede commented on HADOOP-3248:
---------------------------------------

You mentioned there were 20 million files. Could you mention how many blocks and how many replicas there were per file? Also, the max heap size was 14GB. Did the namenode use all of 14GB when it started?
Thanks for your response

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Attachment: fastRestarts.patch

This patch builds on the previous patch submitted by Girish. It eliminates object allocations in the saveImage path without changing the disk format. 

This patch also eliminates an additional object creation when saveImage invokes INode.getChildren(). I would appreciate it if Girish could review this patch.

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, FSImage.patch
>


[jira] Updated: (HADOOP-3248) Improve Namenode startup performance

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3248:
-------------------------------------

    Status: Open  (was: Patch Available)

> Improve Namenode startup performance
> ------------------------------------
>
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch, fastRestarts3.patch, fastRestarts3.patch, FSImage.patch
>