Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2007/07/05 20:45:04 UTC

[jira] Created: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

DFSScalability: reduce memory usage of namenode
-----------------------------------------------

                 Key: HADOOP-1565
                 URL: https://issues.apache.org/jira/browse/HADOOP-1565
             Project: Hadoop
          Issue Type: Bug
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur


Experiments have demonstrated that a single file/block needs about 300 to 500 bytes of main memory on a 64-bit Namenode. This puts some limitations on the size of the file system that a single namenode can support. Most of this overhead occurs because a block and/or filename is inserted into multiple TreeMaps and/or HashSets.

Here are a few ideas that can be evaluated to see whether an appreciable reduction in memory usage occurs:

1. Change FSDirectory.children from a TreeMap to an array. Do binary search in this array while looking up children. This saves a TreeMap object for every intermediate node in the directory tree.
2. Change INode from an inner class to a standalone class. This saves one "parent object" reference for each INode instance: 4 bytes per inode.
3. Keep all DatanodeDescriptors in an array. BlocksMap.nodes[] is currently a 64-bit reference to the DatanodeDescriptor object. Instead, it can be a 'short'. This will probably save about 16 bytes per block. (A sketch of this idea follows the list.)
4. Change DatanodeDescriptor.blocks from a sorted map (TreeMap) to a HashMap? Block report processing CPU cost could increase.
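
For illustration, here is a minimal, hypothetical sketch of idea 3 (the class and method names are invented, not the actual Hadoop code): a global table hands out 2-byte short indexes so that per-block structures can store a short instead of an 8-byte reference.

	import java.util.Arrays;

	// Registry that maps each registered object to a compact short index.
	// A short caps the table at 32767 entries, which is ample for the
	// number of datanodes in a cluster of this era.
	class ShortIndexedTable<T> {
	    private Object[] elements = new Object[8];
	    private short count = 0;

	    // Store the element and return its index; callers keep the short.
	    short register(T element) {
	        if (count == elements.length) {
	            elements = Arrays.copyOf(elements, elements.length * 2);
	        }
	        elements[count] = element;
	        return count++;
	    }

	    // Resolve a short index back to the object.
	    @SuppressWarnings("unchecked")
	    T get(short index) {
	        return (T) elements[index];
	    }
	}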

For the record, each TreeMap.Entry has the following fields:
	Object key;
	Object value;
	Entry left = null;
	Entry right = null;
	Entry parent;
	boolean color = BLACK;

and each HashMap.Entry:
	final Object key;
	Object value;
	final int hash;
	Entry next;
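
As a rough back-of-the-envelope check (assuming a 64-bit JVM with 16-byte object headers and 8-byte references; actual numbers vary by JVM): a TreeMap entry costs about 16 + 5*8 + 1 = 57 bytes, padded to 64, while a HashMap entry costs about 16 + 3*8 + 4 = 44 bytes, padded to 48, plus its share of the hash table's bucket array. Either way, the per-entry overhead dwarfs the data being stored, which is why every extra map a file or block sits in is so expensive.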




[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517334 ] 

Konstantin Shvachko commented on HADOOP-1565:
---------------------------------------------

+1



[jira] Issue Comment Edited: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516838 ] 

Konstantin Shvachko edited comment on HADOOP-1565 at 7/31/07 5:41 PM:
----------------------------------------------------------------------

- I agree an ArrayList should serve the purpose better than the TreeMap.
It saves us about 50 bytes per directory entry, according to my calculations.
- hashcode. I don't think wasting 4 bytes per INode, plus the complexity of supporting the hash-code-oriented
 ordering, is worth the performance gain we get from it. I would compare names as they are, the same as we did before.
 We are talking about 10-20 entries per directory and file names of length 10 on average.
- Is there a reason for reimplementing binary search rather than using Arrays.binarySearch()?
- children = new ArrayList<INode>(5);
   5 should be a named constant
- System.out.println() should be removed.
- Since you are cleaning up DatanodeDescriptor, could you please also remove the redundant imports of
NetworkTopology and net.Node?



 was:
- I agree an ArrayList should serve the purpose better than the TreeMap.
It saves us about 50 bytes per directory entry, according to my calculations.
- hashcode. I don't think wasting 4 bytes per INode, plus the complexity of supporting the hash-code-oriented
 ordering, is worth the performance gain we get from it. I would compare names as they are, the same as we did before.
 We are talking about 10-20 entries per directory and file names of length 10 on average.
- Is there a reason for reimplementing binary search rather than using Arrays.binarySearch()?
- children = new ArrayList<INode>(5);
   5 should be a named constant
- System.out.println() should be removed.




[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Attachment:     (was: memoryReduction2.patch)



[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514704 ] 

Raghu Angadi commented on HADOOP-1565:
--------------------------------------

Are there only directories? I think the 50% figure applies to directory inodes. When we consider all INodes, it would reduce memory by around 50-60 bytes per INode on a 64-bit machine, about 12-15% of INode memory. Of course, 15% is still pretty large.



[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516838 ] 

Konstantin Shvachko commented on HADOOP-1565:
---------------------------------------------

- I agree an ArrayList should serve the purpose better than the TreeMap.
It saves us about 50 bytes per directory entry, according to my calculations.
- hashcode. I don't think wasting 4 bytes per INode, plus the complexity of supporting the hash-code-oriented
 ordering, is worth the performance gain we get from it. I would compare names as they are, the same as we did before.
 We are talking about 10-20 entries per directory and file names of length 10 on average.
- Is there a reason for reimplementing binary search rather than using Arrays.binarySearch()? (See the sketch after these comments.)
- children = new ArrayList<INode>(5);
   5 should be a named constant
- System.out.println() should be removed.
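
A minimal sketch of what the Arrays.binarySearch() variant could look like (hypothetical names; it assumes the children array is kept sorted by name, and ChildNode stands in for the real INode):

	import java.util.Arrays;

	// Stand-in for the real INode: ordered by name so a sorted children
	// array can be searched with the library routine.
	class ChildNode implements Comparable<ChildNode> {
	    final String name;
	    ChildNode(String name) { this.name = name; }
	    public int compareTo(ChildNode other) { return name.compareTo(other.name); }
	}

	class ChildLookup {
	    // Returns the child with the given name, or null if absent.
	    static ChildNode find(ChildNode[] children, String name) {
	        int i = Arrays.binarySearch(children, new ChildNode(name));
	        return i >= 0 ? children[i] : null;
	    }
	}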




[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Attachment: memoryReduction.patch

This patch removes the TreeMap for every HDFS directory and replaces it with an ArrayList. The FSDirectory code does a binary search on the ArrayList.

I measured that with 10M directories (with a fanout of 5 sub-directories per parent directory), the TreeMap implementation occupies 1120 MB of total heap space, whereas the ArrayList implementation requires only 612 MB. A whopping ~50% improvement!
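
For a sense of how such a structure behaves, here is a hypothetical sketch (not the patch itself) of keeping a children ArrayList sorted: Collections.binarySearch() returns (-(insertion point) - 1) when the key is absent, which gives the slot at which to insert a new child.

	import java.util.ArrayList;
	import java.util.Collections;
	import java.util.List;

	// Children kept sorted, so lookups and inserts both take O(log n)
	// comparisons; inserts still shift elements, which is cheap for the
	// 10-20 entries typical of a directory.
	class SortedChildren {
	    private final List<String> names = new ArrayList<String>(5);

	    boolean add(String name) {
	        int i = Collections.binarySearch(names, name);
	        if (i >= 0) {
	            return false;               // already present
	        }
	        names.add(-i - 1, name);        // insert at the sorted position
	        return true;
	    }

	    boolean contains(String name) {
	        return Collections.binarySearch(names, name) >= 0;
	    }
	}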



[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514721 ] 

Raghu Angadi commented on HADOOP-1565:
--------------------------------------

To add to this: irrespective of how many directories or files there are, the memory reduced per INode by this patch is 'sizeof(TreeMap.Entry) - sizeof(reference)'. This is true even if there are only directories in the namespace.

Also, how did you measure the memory for the ArrayList implementation alone? 600 MB for 10M directories (across many ArrayLists) seems pretty large.



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Status: Patch Available  (was: Open)

This patch replaces the TreeMap in each directory with an ArrayList.



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Attachment:     (was: memoryReduction.patch)



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Attachment: memoryReduction3.patch

Incorporated Konstantin's review comments.



[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510751 ] 

Raghu Angadi commented on HADOOP-1565:
--------------------------------------

The last time I checked, a couple of months back, the file name String somehow ended up using a 128-byte array. Could you double check? Milind noticed that this might be because substring() is used to extract the file name from the full path. If that is the case, this could save around 100 bytes per file.
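
The behavior described here matches how String.substring() worked in JDKs of that era: the substring shared the parent string's backing char[], so a short file name cut from a long path silently retained the whole path's character array. A small illustration (hypothetical path, for demonstration only):

	public class SubstringRetention {
	    public static void main(String[] args) {
	        String fullPath = "/user/dhruba/a/very/long/path/part-00000";
	        String name = fullPath.substring(fullPath.lastIndexOf('/') + 1);
	        // In these JVMs, "name" still references fullPath's entire char[].
	        String trimmed = new String(name); // String(String) copies only the visible chars
	        System.out.println(trimmed);       // prints "part-00000"
	    }
	}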




[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517383 ] 

Hadoop QA commented on HADOOP-1565:
-----------------------------------

-1, build or testing failed

2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12363003/memoryReduction3.patch against trunk revision r562041.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/510/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/510/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.15.0
           Status: Resolved  (was: Patch Available)

I just committed this.



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Component/s: dfs



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Attachment: memoryReduction2.patch

Merged patch with latest trunk.



[jira] Commented: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514705 ] 

dhruba borthakur commented on HADOOP-1565:
------------------------------------------

I measured only directories, using a synthetic test program. You are absolutely right that there are usually far more files than directories. The portion of memory occupied by directories will be reduced by almost half, but the overall memory usage of the namenode will *not* drop by 50%.



[jira] Updated: (HADOOP-1565) DFSScalability: reduce memory usage of namenode

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1565:
-------------------------------------

    Description: 
Experiments have demonstrated that a single file/block needs about 300 to 500 bytes of main memory on a 64-bit Namenode. This puts some limitations on the size of the file system that a single namenode can support. Most of this overhead occurs because a block and/or filename is inserted into multiple TreeMaps and/or HashSets.

Here are a few ideas that can be measured to see if an appreciable reduction of memory usage occurs:

1. Change FSDirectory.children from a TreeMap to an array. Do binary search in this array while looking up children. This saves a TreeMap object for every intermediate node in the directory tree.
2. Change INode from an inner class. This saves on one "parent object" reference for each INODE instance. 4 bytes per inode.
3. Keep all DatanodeDescriptors in an array. BlocksMap.nodes[] is currently a 64-bit reference to the DatanodeDescriptor object. Instead, it can be a 'short'. This will probably save about 16 bytes per block.
4. Change DatanodeDescriptor.blocks from a SortedTreeMap to a HashMap? Block report processing CPU cost can increase.

For the records: TreeMap has the following fields:
	Object key;
	Object value;
	Entry left = null;
	Entry right = null;
	Entry parent;
	boolean color = BLACK;

and HashMap object:
	final Object key;
	Object value;
	final int hash;
	Entry next;


  was:
Experiments have demonstrated that a single file/block neds about 300 to 500 bytes of main memory on a 64-bit Namenode. This puts some limitations on the size of the file system namenode that a single namenode can support. Most of this overhead occurs because a block and/or filename is inserted into multiple TreeMaps and/or HashSets.

Here are a few ideas that can be measured to see if an appreciable reduction of memory usage occurs:

1. Change FSDirectory.children from a TreeMap to an array. Do binary search in this array while looking up children. Thsi saves a TreeMap object for every intermediate node in the directory tree.
2. Change INode from an inner class. This saves on one "parent object" reference for each INODE instance. 4 bytes per inode.
3. Keep all DatanodeDescriptors in an array. BlocksMap.nodes[] is currently a 64-bit reference to the DatanodeDescriptor object. Instead, it can be a 'short'. This will probably save about 16 bytes per block.
4. Change DatanodeDescriptor.blocks from a SortedTreeMap to a HashMap? Block report processing CPU cost can increase.

For the records: TreeMap has the following fields:
	Object key;
	Object value;
	Entry left = null;
	Entry right = null;
	Entry parent;
	boolean color = BLACK;

and HashMap object:
	final Object key;
	Object value;
	final int hash;
	Entry next;


