Posted to common-dev@hadoop.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2009/03/16 22:59:50 UTC
[jira] Updated: (HADOOP-5467) Create an offline fsimage image viewer
[ https://issues.apache.org/jira/browse/HADOOP-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated HADOOP-5467:
--------------------------------
Attachment: fsimage.xml
HADOOP-5467.patch
Done with first pass at offline image viewer. Still need to do unit tests and documentation, but looking for feedback.
The offline image viewer will process fsimage files of layout versions -18 or -19, creating several types of human-readable output. For instance, with the following (contrived) namespace:
{noformat}
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:17 /anotherDir
-rw-r--r-- 3 jhoman supergroup 286631664 2009-03-16 21:15 /anotherDir/biggerfile
-rw-r--r-- 3 jhoman supergroup 8754 2009-03-16 21:17 /anotherDir/smallFile
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:11 /mapredsystem
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:11 /mapredsystem/jhoman
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:11 /mapredsystem/jhoman/mapredsystem
drwx-wx-wx - jhoman supergroup 0 2009-03-16 21:11 /mapredsystem/jhoman/mapredsystem/ip.redacted.com
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:12 /one
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:12 /one/two
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:16 /user
drwxr-xr-x - jhoman supergroup 0 2009-03-16 21:19 /user/jhoman
{noformat}
using the default image processor, which mimics the output of ls, generates this:
{noformat}
[1233]mymac:hadoop-0.21.0-dev jhoman$ bin/hadoop offlineimageviewer -i fsimagedemo
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:16 /
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:17 /anotherDir
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:11 /mapredsystem
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:12 /one
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:16 /user
-rw-r--r-- 3 jhoman supergroup 286631664 2009-03-16 14:15 /anotherDir/biggerfile
-rw-r--r-- 3 jhoman supergroup 8754 2009-03-16 14:17 /anotherDir/smallFile
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:11 /mapredsystem/jhoman
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:11 /mapredsystem/jhoman/mapredsystem
drwx-wx-wx - jhoman supergroup 0 2009-03-16 14:11 /mapredsystem/jhoman/mapredsystem/ip.redacted.com
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:12 /one/two
drwxr-xr-x - jhoman supergroup 0 2009-03-16 14:19 /user/jhoman
{noformat}
The line ordering is different, but this output is very amenable to further processing with standard Unix tools and should look familiar to everyone.
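As a rough illustration of that kind of further processing, here is a minimal Python sketch that parses the ls-style lines and totals file bytes per top-level directory. The function names are hypothetical and not part of the patch, and it assumes each line has exactly the eight whitespace-separated fields shown above, with the path last.

```python
from collections import defaultdict


def parse_ls_line(line):
    """Split one ls-style line into its eight fields; the path comes last."""
    # maxsplit=7 keeps any spaces inside the path intact.
    perms, repl, user, group, size, _date, _time, path = line.split(None, 7)
    return {
        "is_dir": perms.startswith("d"),
        "perms": perms,
        # Directories show "-" in the replication column.
        "replication": None if repl == "-" else int(repl),
        "user": user,
        "group": group,
        "size": int(size),
        "path": path,
    }


def bytes_per_top_level_dir(lines):
    """Sum file sizes under each top-level directory ("/" for files in the root)."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue
        entry = parse_ls_line(line)
        if entry["is_dir"]:
            continue
        parts = entry["path"].strip("/").split("/")
        top = "/" + parts[0] if len(parts) > 1 else "/"
        totals[top] += entry["size"]
    return dict(totals)
```

Fed the listing above, this attributes both files to /anotherDir; directories contribute nothing because their size column is 0.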
Another image processor, Console, displays the namespace in a more verbose format that includes individual block entries and any inodes that are under construction in the fsimage:
{noformat}
[1233]mymac:hadoop-0.21.0-dev jhoman$ bin/hadoop offlineimageviewer -i fsimagedemo -p Console
FSImage
ImageVersion = -19
NamespaceID = 2109123098
GenerationStamp = 1003
INodes [NumInodes = 12]
Inode
INodePath =
Replication = 0
ModificationTime = 2009-03-16 14:16
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = 2147483647
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /anotherDir
Replication = 0
ModificationTime = 2009-03-16 14:17
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /mapredsystem
Replication = 0
ModificationTime = 2009-03-16 14:11
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /one
Replication = 0
ModificationTime = 2009-03-16 14:12
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /user
Replication = 0
ModificationTime = 2009-03-16 14:16
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /anotherDir/biggerfile
Replication = 3
ModificationTime = 2009-03-16 14:15
AccessTime = 2009-03-16 14:15
BlockSize = 134217728
Blocks [NumBlocks = 3]
Block
BlockID = -3825289017228345116
NumBytes = 134217728
GenerationStamp = 1002
Block
BlockID = -561951562131659349
NumBytes = 134217728
GenerationStamp = 1002
Block
BlockID = 524543674153268996
NumBytes = 18196208
GenerationStamp = 1002
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rw-r--r--
Inode
INodePath = /anotherDir/smallFile
Replication = 3
ModificationTime = 2009-03-16 14:17
AccessTime = 2009-03-16 14:17
BlockSize = 134217728
Blocks [NumBlocks = 1]
Block
BlockID = 4922053134320058874
NumBytes = 8754
GenerationStamp = 1003
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rw-r--r--
Inode
INodePath = /mapredsystem/jhoman
Replication = 0
ModificationTime = 2009-03-16 14:11
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /mapredsystem/jhoman/mapredsystem
Replication = 0
ModificationTime = 2009-03-16 14:11
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /mapredsystem/jhoman/mapredsystem/ip-redacted.com
Replication = 0
ModificationTime = 2009-03-16 14:11
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwx-wx-wx
Inode
INodePath = /one/two
Replication = 0
ModificationTime = 2009-03-16 14:12
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
Inode
INodePath = /user/jhoman
Replication = 0
ModificationTime = 2009-03-16 14:19
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = -1
DSQuota = -1
Permissions
Username = jhoman
GroupName = supergroup
PermString = rwxr-xr-x
INodesUnderConstruction [NumINodesUnderConstruction = 0]
{noformat}
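The block entries in the Console output can be sanity-checked against the file size from the ls-style listing. The sketch below is only an illustration (the function name is made up) and assumes every block except possibly the last is filled to BlockSize, which is what the sample output shows.

```python
def expected_block_sizes(file_size, block_size):
    """Return the per-block byte counts for a file, assuming all blocks
    except possibly the last are filled to block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


# /anotherDir/biggerfile: 286631664 bytes with BlockSize = 134217728
biggerfile = expected_block_sizes(286631664, 134217728)
# /anotherDir/smallFile: 8754 bytes with the same BlockSize
smallfile = expected_block_sizes(8754, 134217728)
```

For biggerfile this gives two full blocks of 134217728 bytes plus a final block of 18196208 bytes, matching the three NumBytes values above; smallFile fits in a single 8754-byte block.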
The last processor currently implemented is XML, which generates an XML file of the entire structure. I've attached sample output. I think this may be the most interesting processor because it allows easy automated processing. It is also quite verbose, though: on a cluster here with about 93k files, the resulting XML was 2.7 million lines. Still, TextMate was able to handle the output with little grumbling!
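For output of that size, a streaming parser is probably a better fit than loading the whole document. The following is a minimal sketch using Python's standard xml.etree.ElementTree.iterparse; it makes no assumption about the element names the XML processor emits and simply counts how often each tag occurs. The tiny sample document and its tag names are invented for the demonstration and are not the real schema.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Stream through an XML file (path or file object) and count each tag."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Drop the element's children and text once it has been counted,
        # so finished subtrees do not pile up in memory.
        elem.clear()
    return counts


# Hypothetical stand-in for the viewer's XML output, for demonstration only.
SAMPLE = b"<image><inode><block/><block/></inode><inode/></image>"
sample_counts = count_tags(io.BytesIO(SAMPLE))
```

Pointing count_tags at the attached fsimage.xml would give a quick inventory of the element names before writing anything more specific.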
One option worth noting is -skipBlocks. In namespaces with a large number of files that span several blocks, this option omits the individual block entries and includes only the block count. For namespaces with that profile, it significantly decreases the size of the output.
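To get a feel for the savings, here is a back-of-the-envelope sketch based only on the Console sample above, where each inode takes 13 lines and each block entry adds 4 more (Block, BlockID, NumBytes, GenerationStamp). It assumes -skipBlocks removes exactly those 4 lines per block and nothing else; the constants and function name are illustrative, not taken from the patch.

```python
LINES_PER_INODE = 13  # Inode ... PermString, counted from the Console sample
LINES_PER_BLOCK = 4   # Block, BlockID, NumBytes, GenerationStamp


def console_inode_lines(block_counts, skip_blocks=False):
    """Estimate how many lines the inode section of Console output takes.

    block_counts holds one NumBlocks value per inode; directories appear
    as -1 in the sample output and are treated as having no blocks.
    """
    total = 0
    for num_blocks in block_counts:
        total += LINES_PER_INODE
        if not skip_blocks:
            total += LINES_PER_BLOCK * max(num_blocks, 0)
    return total


# NumBlocks values in the order they appear in the Console sample.
sample_counts = [-1] * 5 + [3, 1] + [-1] * 5
with_blocks = console_inode_lines(sample_counts)
without_blocks = console_inode_lines(sample_counts, skip_blocks=True)
```

On the 12-inode sample the difference is small (172 lines versus 156), but the per-block term dominates once most files span many blocks.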
It should be pretty easy to write new image processors and output formats as needed. I'll work on testing and documentation and upload a patch soon.
> Create an offline fsimage image viewer
> --------------------------------------
>
> Key: HADOOP-5467
> URL: https://issues.apache.org/jira/browse/HADOOP-5467
> Project: Hadoop Core
> Issue Type: New Feature
> Components: util
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: fsimage.xml, HADOOP-5467.patch
>
>
> It would be useful to have a tool to examine/dump the contents of the fsimage file to human-readable form. This would allow analysis of the namespace (file usage, block sizes, etc) without impacting the operation of the namenode. XML would be reasonable output format, as it can be easily viewed, compressed and manipulated via either XSLT or XQuery.
> I've started work on this and will have an initial version soon.